{"title":"An improved chirp group delay based algorithm for estimating the vocal tract response","authors":"M. Jayesh, C. S. Ramalingam","doi":"10.5281/ZENODO.54522","DOIUrl":null,"url":null,"abstract":"We propose a method for vocal tract estimation that is better than Bozkurt's chirp group delay method [1] and its zero-phase variant [2]. The chirp group delay method works only for voiced speech, is critically dependent on finding the glottal closure instants (GCI), deteriorates in performance when more than two pitch cycles are included for analysis, and does not work for unvoiced speech. The zero-phase variant eliminates these drawbacks but works poorly for nasal sounds. In our proposed method all outside-unit-circle zeros are reflected inside before computing the chirp group delay. The advantages are: (a) GCI knowledge not required, (b) the vocal tract estimate is far less sensitive to the location and duration of the analysis window, (c) works for unvoiced sounds, and (d) captures the spectral valleys well for nasals, which in turn leads to better recognition accuracy.","PeriodicalId":198408,"journal":{"name":"2014 22nd European Signal Processing Conference (EUSIPCO)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 22nd European Signal Processing Conference (EUSIPCO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5281/ZENODO.54522","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
We propose a method for vocal tract estimation that is better than Bozkurt's chirp group delay method [1] and its zero-phase variant [2]. The chirp group delay method works only for voiced speech, is critically dependent on finding the glottal closure instants (GCI), deteriorates in performance when more than two pitch cycles are included for analysis, and does not work for unvoiced speech. The zero-phase variant eliminates these drawbacks but works poorly for nasal sounds. In our proposed method all outside-unit-circle zeros are reflected inside before computing the chirp group delay. The advantages are: (a) GCI knowledge not required, (b) the vocal tract estimate is far less sensitive to the location and duration of the analysis window, (c) works for unvoiced sounds, and (d) captures the spectral valleys well for nasals, which in turn leads to better recognition accuracy.