Jerin Baby Mathew, Jonie Jacob, Karun Sajeev, Jithin Joy, R. Rajan
Published in: 2018 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), pp. 1-4, 2018-03-01. DOI: 10.1109/WISPNET.2018.8538531 (https://doi.org/10.1109/WISPNET.2018.8538531)
Significance of Feature Selection for Acoustic Modeling in Dysarthric Speech Recognition
In this paper, a comparative study of various feature extraction methods is carried out on dysarthric speech. Dysarthric speech becomes unintelligible due to improper coordination of the articulators; it is therefore difficult to recognize and poses challenges that normal speech does not. Since various features can be used to model phonemes in a hidden Markov model (HMM) based recognition system, which feature set suits this task is a question that needs to be addressed. Recognition results are compared using mel-frequency cepstral coefficient (MFCC), perceptual linear prediction (PLP), filter bank, and reflection coefficient feature sets. Performance is analyzed on the TORGO database, with phonemes grouped for the analysis. Our study shows that MFCC and PLP gave better results than filter bank and reflection coefficients for dysarthric speech analysis.
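To make two of the compared feature sets concrete, the following is a minimal NumPy sketch of how log mel filter-bank energies and MFCCs are commonly computed from a waveform. It is not the authors' pipeline: the abstract does not state the front-end settings, so the 16 kHz sampling rate, 25 ms frames, 10 ms hop, 26 filters, and 13 cepstral coefficients are all assumptions, and the function names are hypothetical. Pre-emphasis, liftering, and delta features are left out for brevity.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def frame_signal(signal, frame_len, hop):
    """Slice the signal into overlapping frames and apply a Hamming window."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

def log_filterbank_features(signal, sample_rate=16000, frame_len=400,
                            hop=160, n_fft=512, n_filters=26):
    """Log mel filter-bank energies, one row per frame (assumed settings)."""
    frames = frame_signal(signal, frame_len, hop)
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return np.log(energies + 1e-10)        # small floor avoids log(0)

def mfcc_features(signal, n_ceps=13, **kwargs):
    """MFCCs as an unnormalized DCT-II of the log filter-bank energies."""
    log_fb = log_filterbank_features(signal, **kwargs)
    n_filters = log_fb.shape[1]
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps),
                                    np.arange(n_filters) + 0.5) / n_filters)
    return log_fb @ basis.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    one_second = rng.standard_normal(16000)   # stand-in for a speech waveform
    print(log_filterbank_features(one_second).shape)
    print(mfcc_features(one_second).shape)
```

The sketch shows that the filter-bank and MFCC feature sets share the same front end and differ only in the final cosine transform, which decorrelates the log energies and truncates them to a few coefficients. In an HMM-based recognizer, each row of the returned matrix would serve as the observation vector for one frame.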