Modeling glottal effect on the spectral envelop of STRAIGHT using mixture of Gaussians
Zhenhua Ling, Yu-Ping Wang, Yu Hu, Ren-Hua Wang
2004 International Symposium on Chinese Spoken Language Processing, 2004-12-15
DOI: https://doi.org/10.1109/CHINSL.2004.1409589
Citations: 4
Abstract
This paper presents a method for modeling the influence of glottal excitation on the STRAIGHT (speech transformation and representation using adaptive interpolation of weighted spectrum) spectrum by fitting the spectral envelope with a mixture of Gaussians (MOG). The first Gaussian component is taken as an estimate of the glottal formant in the STRAIGHT spectrum, because analysis shows that it correlates more strongly with fundamental frequency (F0) than the other spectral components do and that its characteristics resemble those of the glottal formant. Linear regression is then used to quantify the relationship between F0 and the parameters of the first Gaussian component. The resulting model is applied in the STRAIGHT synthesis process and proves effective in compensating for the voice quality variation caused by pitch modification.
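The two modeling steps named in the abstract, a mixture-of-Gaussians envelope and a linear regression against F0, can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the parameterization, so the `(amplitude, center_hz, width_hz)` layout, the function names `mog_envelope` and `fit_line`, and the synthetic data are all hypothetical and not taken from the paper.

```python
import math


def mog_envelope(freq_hz, components):
    """Evaluate a mixture-of-Gaussians envelope at one frequency.

    components is a list of (amplitude, center_hz, width_hz) tuples.
    This parameter layout is an assumption made for illustration.
    """
    return sum(
        amp * math.exp(-0.5 * ((freq_hz - center) / width) ** 2)
        for amp, center, width in components
    )


def fit_line(x, y):
    """Ordinary least-squares fit of y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic per-frame data (invented, NOT results from the paper):
# F0 in Hz and the center frequency of the first Gaussian component.
f0 = [100.0, 150.0, 200.0, 250.0]
first_center = [2.0 * f + 50.0 for f in f0]

slope, intercept = fit_line(f0, first_center)


def predict_first_center(target_f0):
    """Predict the first component's center for a modified pitch."""
    return slope * target_f0 + intercept
```

In a pitch-modification setting, one would presumably call `predict_first_center` with the target F0 and rebuild the envelope with the adjusted first component before synthesis; the same regression could be repeated for the component's amplitude and width.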