Prosody Modeling from Tone to Intonation in Chinese using a Functional F0 Model

Jinfu Ni, S. Sakai, Tohru Shimizu, Satoshi Nakamura
Published in: 2008 Second International Symposium on Universal Communication
Publication date: 2008-12-15
DOI: 10.1109/ISUC.2008.37 (https://doi.org/10.1109/ISUC.2008.37)
Citations: 5

Abstract

Chinese is a tonal language with both lexical tones and intonation, so its fundamental frequency (F0) contours consist of a tone component and an intonation component. This paper presents an approach that models the two components separately and combines them into the final F0 contours using a functional F0 model. We analyze tonal patterns as sparse target points (tonal F0 peaks and valleys) and model them using classification and regression trees (CART) with contextual linguistic features. As a first step, we stylize expressive intonation using a few piecewise linear patterns specified by a few markup tags. Both tonal and intonational patterns are represented in parametric form within the framework of this F0 model. Experimental results indicate that CART-based modeling of the tonal patterns uttered by two female and male speakers achieved very low F0 prediction errors. In a listening test, native speakers identified 90% of the synthesized stimuli in which emphasis on a word was enhanced. The linguistic features related to lexical tone context and to the distinction between voiced and unvoiced initials played the most important role in characterizing the tonal patterns.
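The idea of combining sparse tonal targets with a piecewise linear intonation pattern can be illustrated with a small sketch. The abstract does not give the equations of the functional F0 model, so the code below is only a simplified stand-in: it assumes the intonation pattern acts as an offset in semitones applied to each tonal peak or valley. The target values, the pattern breakpoints, and the function names `piecewise_linear` and `combine` are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def piecewise_linear(points, t):
    """Interpolate (time, value) breakpoints at time t, clamping outside the range."""
    times = [p[0] for p in points]
    if t <= times[0]:
        return points[0][1]
    if t >= times[-1]:
        return points[-1][1]
    i = bisect_right(times, t)
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Hypothetical sparse tonal targets: (time in seconds, F0 in Hz) peaks and valleys.
tone_targets = [(0.0, 220.0), (0.2, 260.0), (0.4, 200.0), (0.6, 240.0)]

# Hypothetical intonation pattern, such as a markup tag might request:
# an offset in semitones that rises toward 0.4 s and then falls back to zero.
emphasis_pattern = [(0.0, 0.0), (0.2, 0.0), (0.4, 3.0), (0.6, 0.0)]


def combine(targets, pattern):
    """Shift each tonal target by the intonation offset (in semitones) at its time."""
    return [
        (t, f0 * 2.0 ** (piecewise_linear(pattern, t) / 12.0))
        for t, f0 in targets
    ]


combined = combine(tone_targets, emphasis_pattern)
for t, f0 in combined:
    print(f"{t:.1f} s: {f0:.1f} Hz")
```

Working in semitones keeps the offset proportional to the underlying F0 value, which is one common convention for pitch manipulation; the paper's own parametric form may differ.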