Multiple regression using support vector machines for recognition of speech in a moving car environment

Proceedings of the 9th International Conference on Neural Information Processing, 2002. ICONIP '02. Pub Date: 2002-11-18. DOI: 10.1109/ICONIP.2002.1198192

W. Lee, C. Sekhar, K. Takeda, F. Itakura


Abstract

In a moving car environment, speech data is collected using a close-talking microphone placed in the driver's headset and multiple distant microphones placed around the driver. We address the problem of estimating the spectral features of speech recorded by the close-talking microphone from the spectral features of the data recorded by the distant microphones. We study methods such as concatenation, averaging, linear regression, and nonlinear regression for this estimation. We consider support vector machines (SVMs) for nonlinear regression of multiple spectral coefficients. We compare the performance of SVMs and hidden Markov models (HMMs) in recognizing subword units of speech using the original spectral features and the estimated spectral features. A Japanese speech corpus consisting of recordings made in a moving car environment is used for our studies on estimation of spectral features and recognition of subword units. Our results show that SVM-based regression performs better than linear regression, and that SVMs give higher recognition accuracy than HMMs.
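
To make the estimation methods named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows averaging and concatenation of distant-microphone feature vectors, and a least-squares linear regression that predicts one close-talking coefficient from the corresponding distant-microphone coefficients. All function names, the toy data, and the use of normal equations are assumptions made for illustration; the abstract does not specify these details. The SVM-based nonlinear regression is not shown here; it would take the place of the linear fit.

```python
from typing import List

Vector = List[float]


def average_features(mics: List[Vector]) -> Vector:
    """Average the distant-microphone feature vectors coefficient by coefficient."""
    n = len(mics)
    return [sum(values) / n for values in zip(*mics)]


def concatenate_features(mics: List[Vector]) -> Vector:
    """Stack the distant-microphone feature vectors into one long vector."""
    return [coeff for vec in mics for coeff in vec]


def solve_linear_system(a: List[Vector], b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_linear_regression(inputs: List[Vector], targets: Vector) -> Vector:
    """Least-squares weights (bias stored last) from the normal equations."""
    rows = [x + [1.0] for x in inputs]  # append a constant 1 for the bias term
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(d)]
    return solve_linear_system(xtx, xty)


def predict(weights: Vector, x: Vector) -> float:
    """Estimate one close-talking coefficient from distant-microphone values."""
    return sum(w * v for w, v in zip(weights, x + [1.0]))


# Toy data (hypothetical): one coefficient seen by two distant microphones,
# with the close-talking value generated as 0.5*m1 + 0.25*m2 + 1.0.
distant = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
close_talk = [2.0, 2.25, 3.75, 3.75]
weights = fit_linear_regression(distant, close_talk)
```

In this sketch one regressor is fitted per spectral coefficient. Averaging and concatenation need no training, which is why they serve as simple baselines against the regression-based estimates.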
