{"title":"Multiple regression using support vector machines for recognition of speech in a moving car environment","authors":"W. Lee, C. Sekhar, K. Takeda, F. Itakura","doi":"10.1109/ICONIP.2002.1198192","DOIUrl":null,"url":null,"abstract":"In a moving car environment, speech data is collected using a close-talking microphone placed in the headset of driver and multiple distant microphones placed around the driver. We address the issues in estimating spectral features of speech data collected using the close-talking microphone from the spectral features of data recorded on the distant microphones. We study methods such as concatenation, averaging, linear regression and nonlinear regression for estimation. We consider support vector machines (SVMs) for nonlinear regression of multiple spectral coefficients. We compare the performance of SVMs and hidden Markov models (HMMs) in recognition of subword units of speech using the original spectral features and the estimated spectral features. A Japanese speech corpus consisting of recordings in a moving car environment is used for our studies on estimation of spectral features and recognition of subword units of speech. Results of our studies show that SVM based regression performs better compared to linear regression, and SVMs give a higher recognition accuracy compared to HMMs.","PeriodicalId":146553,"journal":{"name":"Proceedings of the 9th International Conference on Neural Information Processing, 2002. ICONIP '02.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 9th International Conference on Neural Information Processing, 2002. ICONIP '02.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICONIP.2002.1198192","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
In a moving car environment, speech data is collected using a close-talking microphone placed in the headset of driver and multiple distant microphones placed around the driver. We address the issues in estimating spectral features of speech data collected using the close-talking microphone from the spectral features of data recorded on the distant microphones. We study methods such as concatenation, averaging, linear regression and nonlinear regression for estimation. We consider support vector machines (SVMs) for nonlinear regression of multiple spectral coefficients. We compare the performance of SVMs and hidden Markov models (HMMs) in recognition of subword units of speech using the original spectral features and the estimated spectral features. A Japanese speech corpus consisting of recordings in a moving car environment is used for our studies on estimation of spectral features and recognition of subword units of speech. Results of our studies show that SVM based regression performs better compared to linear regression, and SVMs give a higher recognition accuracy compared to HMMs.