Improving the consistency of vocal tract shape estimation
K. Nataraj, J. Jagbandhu, P. C. Pandey, M. Shah
2011 National Conference on Communications (NCC), 2011-03-17
DOI: 10.1109/NCC.2011.5734729
Citations: 8
Abstract
Estimation of the vocal tract shape has applications in articulatory synthesis, speech recognition, and speech-training aids. LPC-based analysis can be used to obtain the vocal tract shape during speech segments produced with glottal excitation, for fixed as well as transitional tract configurations. During the stop closures of vowel-consonant-vowel (VCV) utterances, the shape can be estimated by bivariate surface modeling of the shapes during the transition segments. The shape obtained by LPC analysis of steady vowels varies with the position of the analysis frame. Low-pass filtering of the shapes across frames improves consistency but cannot be used during transition segments. A windowed energy index is calculated as the ratio of the energy of the windowed signal to the frame energy, and it is shown that the shapes in the frames corresponding to the valleys of this index have reduced variability. Frame selection based on this index can therefore be used to improve the consistency of vocal tract shape estimation for various applications.
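The windowed energy index described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the window type, frame length, hop size, or how valleys are detected, so the Hamming window, the `frame_len` and `hop` parameters, and the strict local-minimum rule are all assumptions, as are the function names `windowed_energy_index` and `valley_positions`.

```python
import numpy as np


def windowed_energy_index(signal, frame_len, hop=1):
    """Ratio of windowed-frame energy to unwindowed-frame energy per frame position.

    Assumptions (not stated in the abstract): Hamming window, fixed frame
    length in samples, frames advanced by `hop` samples.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(frame_len)  # assumed window type
    starts = np.arange(0, len(signal) - frame_len + 1, hop)
    index = np.empty(len(starts))
    for i, s in enumerate(starts):
        frame = signal[s:s + frame_len]
        frame_energy = np.sum(frame ** 2)
        windowed_energy = np.sum((window * frame) ** 2)
        # Silent frames have no defined ratio; mark them as NaN.
        index[i] = windowed_energy / frame_energy if frame_energy > 0 else np.nan
    return starts, index


def valley_positions(index):
    """Positions of strict local minima of the index (assumed valley rule)."""
    idx = np.asarray(index, dtype=float)
    interior = np.arange(1, len(idx) - 1)
    is_min = (idx[1:-1] < idx[:-2]) & (idx[1:-1] < idx[2:])
    return interior[is_min]
```

Under these assumptions, the frames starting at `starts[valley_positions(index)]` would be the candidates passed on to LPC analysis, following the abstract's observation that shapes estimated at the valleys of the index show reduced variability.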