Prosodic vs. segmental contributions to naturalness in a diphone synthesizer

H. Bunnell, S. Hoskins, Debra Yarrington
{"title":"Prosodic vs. segmental contributions to naturalness in a diphone synthesizer","authors":"H. Bunnell, S. Hoskins, Debra Yarrington","doi":"10.21437/ICSLP.1998-15","DOIUrl":null,"url":null,"abstract":"The relative contributions of segmental versus prosodic factors to the perceived naturalness of synthetic speech was measured by transplanting prosody between natural speech and the output of a diphone synthesizer. A small corpus was created containing matched sentence pairs wherein one member of the pair was a natural utterance and the other was a synthetic utterance generated with diphone data from the same talker. Two additional sentences were formed from each sentence pair by transplanting the prosodic structure between the natural and synthetic members of each pair. In two listening experiments subjects were asked to (a) classify each sentence as “natural” or “synthetic, or (b) rate the naturalness of each sentence. Results showed that the prosodic information was more important than segmental information in both classification and ratings of naturalness.","PeriodicalId":117113,"journal":{"name":"5th International Conference on Spoken Language Processing (ICSLP 1998)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1998-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"5th International Conference on Spoken Language Processing (ICSLP 1998)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21437/ICSLP.1998-15","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

The relative contributions of segmental versus prosodic factors to the perceived naturalness of synthetic speech was measured by transplanting prosody between natural speech and the output of a diphone synthesizer. A small corpus was created containing matched sentence pairs wherein one member of the pair was a natural utterance and the other was a synthetic utterance generated with diphone data from the same talker. Two additional sentences were formed from each sentence pair by transplanting the prosodic structure between the natural and synthetic members of each pair. In two listening experiments subjects were asked to (a) classify each sentence as “natural” or “synthetic, or (b) rate the naturalness of each sentence. Results showed that the prosodic information was more important than segmental information in both classification and ratings of naturalness.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
音韵与音段对双管合成器自然性的贡献
通过在自然语音和双管合成器的输出之间移植韵律,测量了片段和韵律因素对合成语音感知自然度的相对贡献。创建了一个小语料库,其中包含匹配的句子对,其中一个成员是自然话语,另一个是由同一说话者的diphone数据生成的合成话语。通过将每对句子的自然句和合成句之间的韵律结构进行移植,形成两个附加句。在两个听力实验中,受试者被要求(a)将每个句子分类为“自然”或“合成”,或(b)对每个句子的自然程度进行评分。结果表明,韵律信息在自然度的分类和评分中都比片段信息更重要。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Assimilation of place in Japanese and dutch Articulatory analysis using a codebook for articulatory based low bit-rate speech coding Phonetic and phonological characteristics of paralinguistic information in spoken Japanese HMM-based visual speech recognition using intensity and location normalization Speech recognition via phonetically featured syllables
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1