TTS - VLSP 2021: The Thunder Text-To-Speech System

N. Ngoc Anh, Nguyen Tien Thanh, Le Dang Linh
Published in VNU Journal of Science: Computer Science and Communication Engineering, 2022-06-30 (journal article). DOI: 10.25073/2588-1086/vnucsce.342 (https://doi.org/10.25073/2588-1086/vnucsce.342)
Citations: 1

Abstract

This paper describes the speech synthesis system we submitted to the Vietnamese Text-To-Speech track of the 2021 VLSP evaluation campaign. The goal of the challenge is to build a synthetic voice from a provided corpus of spontaneous Vietnamese speech. We present our implementation of the FastSpeech2 model for spontaneous speech, together with a dedicated strategy for handling spontaneous datasets within the TTS system. The system generates mel-spectrograms from the given text and then synthesizes speech from those mel-spectrograms using a separately trained vocoder. In the evaluation, our team achieved a mean MOS of 3.943 on the in-domain test, 3.3 on the out-of-domain test, and 85.00% on the SUS test, which indicates the effectiveness of the proposed system.
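The two-stage design described in the abstract (text to mel-spectrogram, then mel-spectrogram to waveform) can be illustrated with a minimal sketch. This is not the authors' code: the class `TwoStageTTS`, the placeholder functions, and the constants `N_MELS` and `HOP_LENGTH` are hypothetical names and values chosen for illustration, and the placeholder models only produce zeros of the right shape. A real system would plug in a trained FastSpeech2 model and a trained vocoder.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumed illustrative values, not taken from the paper.
N_MELS = 80        # mel bins per spectrogram frame
HOP_LENGTH = 256   # waveform samples produced per frame

Mel = List[List[float]]  # frames x mel bins
Wave = List[float]       # mono waveform samples


@dataclass
class TwoStageTTS:
    """Chains an acoustic model (text -> mel) with a vocoder (mel -> wave)."""
    acoustic_model: Callable[[str], Mel]
    vocoder: Callable[[Mel], Wave]

    def synthesize(self, text: str) -> Wave:
        mel = self.acoustic_model(text)   # stage 1: predict mel-spectrogram
        return self.vocoder(mel)          # stage 2: render waveform


def placeholder_acoustic_model(text: str) -> Mel:
    # Stand-in for FastSpeech2: one all-zero frame per input character.
    return [[0.0] * N_MELS for _ in text]


def placeholder_vocoder(mel: Mel) -> Wave:
    # Stand-in for a trained vocoder: HOP_LENGTH silent samples per frame.
    return [0.0] * (len(mel) * HOP_LENGTH)


tts = TwoStageTTS(placeholder_acoustic_model, placeholder_vocoder)
wave = tts.synthesize("xin chao")
```

Because the two stages only share the mel-spectrogram interface, the vocoder can be trained and replaced independently of the acoustic model, which is the separation the abstract refers to.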