基于功能F0模型的汉语声调到语调韵律建模

2008 Second International Symposium on Universal Communication Pub Date : 2008-12-15 DOI:10.1109/ISUC.2008.37

Jinfu Ni, S. Sakai, Tohru Shimizu, Satoshi Nakamura

{"title":"基于功能F0模型的汉语声调到语调韵律建模","authors":"Jinfu Ni, S. Sakai, Tohru Shimizu, Satoshi Nakamura","doi":"10.1109/ISUC.2008.37","DOIUrl":null,"url":null,"abstract":"Chinese is a tonal language. It has both lexical tones and intonation. The fundamental frequency (F0) contours thereby consist of tone and intonation components. This paper presents an approach to modeling the two components in separate ways and combining them to form the final F0 contours based on a functional F0 model. We analyze tonal patterns as sparse target points (tonal F0 peaks and valleys) and model them using classification and regression trees (CART) with contextual linguistic features. As a first step, we stylize expressive intonation using a few piecewise linear patterns specified by a few markup tags. Both tonal and intonational patterns are represented in a parametric form within the framework of this F0 model. Our experimental results indicated that very low F0 prediction errors were achieved by the CART-based modeling of the tonal patterns uttered by two female and male speakers. In a listening test, the native speakers could identify 90% of synthesized stimuli with enhancing emphasis in word. Also, the linguistic features related to the lexical tone context and distinction between voiced and unvoiced initials played the most important role in characterizing the tonal patterns.","PeriodicalId":339811,"journal":{"name":"2008 Second International Symposium on Universal Communication","volume":"91 4-5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Prosody Modeling from Tone to Intonation in Chinese using a Functional F0 Model\",\"authors\":\"Jinfu Ni, S. Sakai, Tohru Shimizu, Satoshi Nakamura\",\"doi\":\"10.1109/ISUC.2008.37\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Chinese is a tonal language. It has both lexical tones and intonation. The fundamental frequency (F0) contours thereby consist of tone and intonation components. This paper presents an approach to modeling the two components in separate ways and combining them to form the final F0 contours based on a functional F0 model. We analyze tonal patterns as sparse target points (tonal F0 peaks and valleys) and model them using classification and regression trees (CART) with contextual linguistic features. As a first step, we stylize expressive intonation using a few piecewise linear patterns specified by a few markup tags. Both tonal and intonational patterns are represented in a parametric form within the framework of this F0 model. Our experimental results indicated that very low F0 prediction errors were achieved by the CART-based modeling of the tonal patterns uttered by two female and male speakers. In a listening test, the native speakers could identify 90% of synthesized stimuli with enhancing emphasis in word. Also, the linguistic features related to the lexical tone context and distinction between voiced and unvoiced initials played the most important role in characterizing the tonal patterns.\",\"PeriodicalId\":339811,\"journal\":{\"name\":\"2008 Second International Symposium on Universal Communication\",\"volume\":\"91 4-5 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-12-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 Second International Symposium on Universal Communication\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISUC.2008.37\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 Second International Symposium on Universal Communication","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISUC.2008.37","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 5

摘要

汉语是一种有声调的语言。它既有词汇语调，也有语调。基频(F0)轮廓由音调和语调组成。本文提出了一种方法，以不同的方式对这两个组件进行建模，并将它们结合起来，形成基于功能F0模型的最终F0轮廓。我们将音调模式作为稀疏目标点(音调F0峰值和谷)进行分析，并使用具有上下文语言特征的分类和回归树(CART)对它们进行建模。作为第一步，我们使用由几个标记标记指定的几个分段线性模式对表达性语调进行风格化。在这个F0模型的框架内，音调和语调模式都以参数形式表示。我们的实验结果表明，基于cart的对两名女性和男性说话者发出的音调模式进行建模可以获得非常低的F0预测误差。在听力测试中，以英语为母语的人可以识别90%的合成刺激，并加强单词的强调。此外，与词汇语调语境相关的语言特征以及浊音和浊音的区分对声调模式的表征起着最重要的作用。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Prosody Modeling from Tone to Intonation in Chinese using a Functional F0 Model

Chinese is a tonal language. It has both lexical tones and intonation. The fundamental frequency (F0) contours thereby consist of tone and intonation components. This paper presents an approach to modeling the two components in separate ways and combining them to form the final F0 contours based on a functional F0 model. We analyze tonal patterns as sparse target points (tonal F0 peaks and valleys) and model them using classification and regression trees (CART) with contextual linguistic features. As a first step, we stylize expressive intonation using a few piecewise linear patterns specified by a few markup tags. Both tonal and intonational patterns are represented in a parametric form within the framework of this F0 model. Our experimental results indicated that very low F0 prediction errors were achieved by the CART-based modeling of the tonal patterns uttered by two female and male speakers. In a listening test, the native speakers could identify 90% of synthesized stimuli with enhancing emphasis in word. Also, the linguistic features related to the lexical tone context and distinction between voiced and unvoiced initials played the most important role in characterizing the tonal patterns.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2008 Second International Symposium on Universal Communication

自引率

0.00%

发文量

期刊最新文献

AnHitz, Development and Integration of Language, Speech and Visual Technologies for Basque Chinese NP Chunking: A Semi-Supervised Approach The UCSD/Calit2 GreenLight Project (Invited Paper) Inferring User Interests from Relevance Feedback with High Similarity Sequence Data-Driven Clustering Computer Simulation of HRTFs for Personalization of 3D Audio