Almost Unsupervised Text to Speech and Automatic Speech Recognition

Date: 2021-01-04

Tags: light-TTS, TTS

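Abstract: An almost unsupervised method that needs only a few hundred paired text-speech samples, plus additional unpaired data, to train both TTS and ASR. Components:
1. A denoising auto-encoder.
2. Dual transformation training: TTS converts text y into speech x, ASR is then trained on the resulting x and y, and vice versa.
3. Bidirectional sequence modeling, which mainly addresses the error propagation that arises when training on long speech and text sequences.
4. A unified model containing TT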
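
The first three components listed in the abstract can be illustrated with a small toy sketch. This is not the paper's implementation: the function names (`corrupt`, `dual_transformation`, `both_directions`), the `<mask>` symbol, and the toy `toy_tts`/`toy_asr` stand-ins (which just map characters to code points and back) are all assumptions made for illustration. Real models would be neural networks operating on mel-spectrogram frames.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol, not taken from the paper


def corrupt(seq, mask_prob, rng):
    """Denoising auto-encoder input: randomly mask elements of a sequence.

    The model would then be trained to reconstruct the original `seq`.
    """
    return [MASK if rng.random() < mask_prob else item for item in seq]


def dual_transformation(texts, speeches, tts, asr):
    """Build pseudo pairs from unpaired data.

    TTS turns text y into speech x_hat, giving (x_hat, y) to train ASR.
    ASR turns speech x into text y_hat, giving (y_hat, x) to train TTS.
    """
    asr_pairs = [(tts(y), y) for y in texts]
    tts_pairs = [(asr(x), x) for x in speeches]
    return asr_pairs, tts_pairs


def both_directions(target):
    """Bidirectional sequence modeling: return the target in left-to-right
    and right-to-left order, so a model can be trained to generate both."""
    forward = list(target)
    return forward, forward[::-1]


# Toy stand-ins for real models (assumptions for illustration only).
def toy_tts(text):
    return [ord(ch) for ch in text]


def toy_asr(frames):
    return "".join(chr(f) for f in frames)


if __name__ == "__main__":
    rng = random.Random(0)
    print(corrupt(list("hello"), 0.3, rng))
    print(dual_transformation(["hi"], [[111, 107]], toy_tts, toy_asr))
    print(both_directions("abc"))
```

In an actual training loop, the pseudo pairs would be regenerated as the TTS and ASR models improve, so each model gradually supplies better training data for the other.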
