Prosody modeling for syllable-based concatenative speech synthesis of Hindi and Tamil

2011 National Conference on Communications (NCC). Pub Date: 2011-03-17. DOI: 10.1109/NCC.2011.5734737

Ashwin Bellur, K. Narayan, K. Raghava Krishnan, H. Murthy

Citations: 39

Abstract

This paper describes ways to improve prosody modeling in syllable-based concatenative speech synthesis systems for two Indian languages, Hindi and Tamil, within the unit selection paradigm. The syllable is a larger unit than the diphone and contains most of the coarticulation information. Although syllable-based synthesis is quite intelligible compared to diphone-based systems, naturalness, especially in terms of prosody, requires additional processing. Since the synthesizer is built using a cluster unit framework, a hybrid approach that combines rule-based and statistical models is proposed to better model the prosody of syllable-like units. It is further observed that prediction of phrase boundaries is crucial, particularly because Indian languages are replete with polysyllabic words. CART-based phrase modeling for Hindi and Tamil is discussed. Perceptual experiments show a significant improvement in the mean opinion score (MOS) for both Hindi and Tamil synthesizers.

Index Terms: speech synthesis, unit selection, cluster unit synthesis, phrase boundaries
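For readers unfamiliar with CART, the following is a minimal sketch of a classification tree applied to phrase-boundary prediction. It is not the authors' implementation: the abstract does not state which features or training data were used, so the two features here (syllables since the last boundary, and a flag for whether the word is followed by a function word) and the toy training rows are hypothetical and chosen only to show the mechanics of Gini-based splitting.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a binary tree; internal nodes are tuples, leaves are class labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )

def predict(tree, row):
    """Walk the tree until a leaf (a non-tuple label) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical toy data, invented for illustration only.
# Features: (syllables since last boundary, followed-by-function-word flag)
# Label: 1 = phrase boundary after this word, 0 = no boundary
rows = [(2, 0), (3, 0), (4, 0), (9, 1), (10, 0), (12, 1), (5, 1), (11, 1)]
labels = [0, 0, 0, 1, 1, 1, 0, 1]

tree = build_tree(rows, labels)
```

On this toy data the tree splits once, on the syllable count, so `predict(tree, (3, 1))` yields no boundary and `predict(tree, (10, 0))` yields a boundary. A real system would train on annotated Hindi or Tamil text with a much richer feature set.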


