Modeling glottal effect on the spectral envelop of STRAIGHT using mixture of Gaussians

2004 International Symposium on Chinese Spoken Language Processing
Pub Date: 2004-12-15
DOI: 10.1109/CHINSL.2004.1409589

Zhenhua Ling, Yu-Ping Wang, Yu Hu, Ren-Hua Wang

Citations: 4

Abstract

This paper presents a method for modeling the influence of glottal excitation on the STRAIGHT (speech transformation and representation using adaptive interpolation of weighted spectrum) spectrum by fitting the spectral envelope with a mixture of Gaussians (MOG). The first Gaussian component is used as an estimate of the glottal formant in the STRAIGHT spectrum, because analysis shows that it correlates markedly more strongly with the fundamental frequency (F0) than the other spectral components do and that its characteristics resemble those of the glottal formant. Linear regression is then carried out to measure the relationship between F0 and the parameters of the first Gaussian component. The model is applied to the STRAIGHT synthesis process and is shown to be effective in compensating for the voice quality variation caused by pitch modification.
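The two modeling steps named in the abstract (an MOG description of the envelope, then a linear regression from F0 to a parameter of the first component) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `mog_envelope`, `fit_line`, and `predict_first_center`, the choice of the component center as the regressed parameter, and the data values, which are synthetic and not taken from the paper. The abstract does not specify how the MOG is fitted, so that step is not shown.

```python
import math

def mog_envelope(freq_hz, components):
    """Evaluate a mixture-of-Gaussians envelope at one frequency.

    components: list of (amplitude, center_hz, width_hz) tuples (hypothetical layout).
    """
    return sum(a * math.exp(-0.5 * ((freq_hz - c) / w) ** 2)
               for a, c, w in components)

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic per-frame values for illustration only (not results from the paper).
f0_hz = [100.0, 120.0, 150.0, 180.0, 220.0]
# Center of the first Gaussian component, made exactly linear in F0 by construction.
first_center_hz = [2.0 * f + 10.0 for f in f0_hz]

slope, intercept = fit_line(f0_hz, first_center_hz)

def predict_first_center(new_f0_hz):
    """Predict the first-component center for a modified pitch value."""
    return slope * new_f0_hz + intercept
```

Under these assumptions, a pitch-modified frame would take its first-component center from `predict_first_center(new_f0_hz)` and the envelope would be re-evaluated with `mog_envelope`. The other component parameters (amplitude, width) could be regressed in the same way.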
