Generating emotional speech from neutral speech

2010 7th International Symposium on Chinese Spoken Language Processing. Pub Date: 2010-11-01. DOI: 10.1109/ISCSLP.2010.5684862

Ling Cen, P. Chan, M. Dong, Haizhou Li

Citations: 10

Abstract

Emotional speech synthesis is a key technique for natural and realistic conversation between humans and machines. Generating emotional speech by converting neutral speech is desirable because it allows emotional speech to be produced from the many existing text-to-speech systems. The Gaussian mixture model (GMM) based method is capable of synthesizing the desired spectrum, while the rule-based (RB) algorithm is effective in implementing the targeted prosodic features. Because spectral and prosodic features are both key factors in conveying the emotional effect of speech, in this paper we propose synthesizing emotional speech with a two-stage transformation that combines the GMM and RB methods. We synthesize happy, angry and sad speech and compare the proposed method with GMM linear transformation alone and with RB transformation alone. Listening tests show that speech synthesized by the proposed method is perceived to best portray the targeted emotion.
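The abstract does not give the conversion functions, so the following is only a minimal sketch of what a two-stage GMM plus rule-based conversion could look like, not the authors' implementation. It assumes the widely used joint-GMM regression form for the spectral stage and a simple scaling of the F0 contour for the prosodic stage. The function names, the parameter layout, the order of the two stages, and every value in `PROSODY_RULES` are hypothetical placeholders and are not taken from the paper.

```python
import numpy as np

# Hypothetical rule table: the numbers are illustrative placeholders only.
PROSODY_RULES = {
    "happy": {"f0_scale": 1.2, "f0_range": 1.3, "duration_scale": 0.9},
    "angry": {"f0_scale": 1.1, "f0_range": 1.5, "duration_scale": 0.85},
    "sad":   {"f0_scale": 0.9, "f0_range": 0.7, "duration_scale": 1.2},
}

def gmm_convert(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Map one neutral spectral vector x to the emotional space.

    Assumed form: y = sum_i p(i|x) * (mu_y[i] + cov_yx[i] cov_xx[i]^-1 (x - mu_x[i])).
    """
    n_mix = len(weights)
    diff = x - mu_x  # shape (n_mix, dim)
    log_post = np.empty(n_mix)
    for i in range(n_mix):
        _, logdet = np.linalg.slogdet(cov_xx[i])
        maha = diff[i] @ np.linalg.solve(cov_xx[i], diff[i])
        # The -dim/2 * log(2*pi) term is shared by all mixtures and cancels.
        log_post[i] = np.log(weights[i]) - 0.5 * (logdet + maha)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()  # posterior p(i|x)
    y = np.zeros(mu_y.shape[1])
    for i in range(n_mix):
        y += post[i] * (mu_y[i] + cov_yx[i] @ np.linalg.solve(cov_xx[i], diff[i]))
    return y

def apply_prosody_rules(f0, emotion):
    """Scale the level and range of voiced F0 frames; unvoiced frames (0) stay 0."""
    rule = PROSODY_RULES[emotion]
    f0 = np.asarray(f0, dtype=float)
    out = np.zeros_like(f0)
    voiced = f0 > 0
    if voiced.any():
        mean = f0[voiced].mean()
        out[voiced] = rule["f0_scale"] * (mean + rule["f0_range"] * (f0[voiced] - mean))
    return out, rule["duration_scale"]

def two_stage_convert(frames, f0, emotion, gmm):
    """Stage 1: GMM spectral conversion. Stage 2: rule-based prosody (assumed order)."""
    spectrum = np.array([gmm_convert(x, **gmm) for x in frames])
    new_f0, duration_scale = apply_prosody_rules(f0, emotion)
    return spectrum, new_f0, duration_scale
```

In practice the GMM parameters would be estimated from aligned neutral and emotional recordings, and the returned duration factor would be passed to a separate time-scaling step; both are outside this sketch.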



