CRCTTS: Convolution-Recurrent-Convolution Text-to-Speech System

Kuo Chen, Xuebin Sun
DOI: 10.1145/3548608.3559304 (https://doi.org/10.1145/3548608.3559304)
Published in: Proceedings of the 2022 2nd International Conference on Control and Intelligent Robotics
Publication date: 2022-06-24
Citations: 0

Abstract

End-to-end speech synthesis has largely replaced Statistical Parametric Speech Synthesis (SPSS) in the text-to-speech (TTS) field. End-to-end models based on neural networks do not require extensive domain knowledge, yet they synthesize more natural speech. Tacotron is the first model able to synthesize speech that even humans find hard to distinguish from natural speech. We propose a new end-to-end speech synthesis system called Convolution-Recurrent-Convolution Text-to-Speech (CRCTTS). We choose Tacotron as our baseline model and modify its architecture with a fully convolutional neural network (CNN) module and Dynamic Convolution Attention (DCA). We also introduce an attention-guided mechanism to accelerate attention alignment in the decoder module. With these techniques, the proposed model is shown to synthesize speech of better quality than the baseline model while taking less time in both the training stage and the synthesis stage.
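
The abstract does not say how the attention-guided mechanism is computed. As a rough illustration only, the sketch below uses one formulation that is common in the TTS literature: attention mass far from the text/audio diagonal is penalized, which pushes the alignment toward a monotonic shape early in training. The function names, the width parameter `g = 0.2`, and the use of plain Python lists are assumptions made for this example; they are not taken from the paper and may differ from what CRCTTS does.

```python
import math


def guided_attention_weights(text_len, mel_len, g=0.2):
    """Build the penalty matrix W[n][t] = 1 - exp(-((n/N - t/T)^2) / (2 g^2)).

    Entries are 0 on the diagonal and approach 1 far from it.
    `g` (assumed value) controls how wide the unpenalized band is.
    """
    weights = []
    for n in range(text_len):
        row = []
        for t in range(mel_len):
            offset = n / text_len - t / mel_len
            row.append(1.0 - math.exp(-(offset ** 2) / (2.0 * g ** 2)))
        weights.append(row)
    return weights


def guided_attention_loss(attention, g=0.2):
    """Average of attention[n][t] * W[n][t] over all positions.

    `attention` is a text_len x mel_len matrix of attention weights.
    """
    text_len = len(attention)
    mel_len = len(attention[0])
    weights = guided_attention_weights(text_len, mel_len, g)
    total = sum(
        attention[n][t] * weights[n][t]
        for n in range(text_len)
        for t in range(mel_len)
    )
    return total / (text_len * mel_len)


if __name__ == "__main__":
    size = 4
    # Perfectly diagonal alignment: no penalty is expected.
    diagonal = [[1.0 if n == t else 0.0 for t in range(size)] for n in range(size)]
    # Anti-diagonal alignment: most of the mass sits far from the diagonal.
    reversed_diag = [
        [1.0 if n == size - 1 - t else 0.0 for t in range(size)]
        for n in range(size)
    ]
    print("diagonal loss:", guided_attention_loss(diagonal))
    print("anti-diagonal loss:", guided_attention_loss(reversed_diag))
```

In a training setup this term would typically be added to the spectrogram reconstruction loss, so that a non-monotonic alignment costs more than a diagonal one.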