Combined speech decoders output for phoneme recognition enhancement
K. Abida, F. Karray, W. Abida
2009 3rd International Conference on Signals, Circuits and Systems (SCS), published 2009-11-01
DOI: 10.1109/ICSCS.2009.5412292 (https://doi.org/10.1109/ICSCS.2009.5412292)
Citations: 2
Abstract
Phoneme recognition is an essential component of any robust speech decoder and has been studied extensively. Feature extraction forms the front-end module of a speech decoder and strongly affects recognition performance. The research community is actively seeking more powerful solutions that combine existing feature extraction methods to capture information from the analog speech signal more reliably. In this work, we propose new approaches to combining the outputs of phoneme recognizers in order to improve recognition performance and robustness to noise and channel distortions. Machine learning tools such as the Naive Bayes classifier, decision trees, and support vector machines are used to combine the hypotheses. Experiments at different SNR levels show that the proposed approach outperforms the two most common feature extraction techniques, Mel Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP), which use Cepstral Mean Subtraction (CMS) and RASTA, respectively, for channel normalization.
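The abstract does not specify how the hypotheses are combined, so the following is only a minimal sketch of one plausible reading: a categorical Naive Bayes classifier that treats the label emitted by each recognizer (for example, one MFCC-based and one PLP-based) as a feature and predicts the reference phoneme. The class name `NaiveBayesCombiner`, the Laplace smoothing constant `alpha`, and the toy phoneme labels are all assumptions made for illustration; none of them come from the paper.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesCombiner:
    """Hypothetical combiner: categorical Naive Bayes over recognizer labels."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, hypotheses, truths):
        # hypotheses: list of tuples, one emitted label per recognizer
        # truths: list of reference phoneme labels, aligned with hypotheses
        self.classes = sorted(set(truths))
        self.vocab = sorted(set(truths) | {h for hyp in hypotheses for h in hyp})
        self.class_counts = Counter(truths)
        self.total = len(truths)
        n_streams = len(hypotheses[0])
        # counts[stream][true_label][emitted_label]
        self.counts = [defaultdict(Counter) for _ in range(n_streams)]
        for hyp, truth in zip(hypotheses, truths):
            for stream, label in enumerate(hyp):
                self.counts[stream][truth][label] += 1
        return self

    def log_posterior(self, hyp):
        # Unnormalized log P(class) + sum over streams of log P(label | class)
        scores = {}
        vocab_size = len(self.vocab)
        for c in self.classes:
            score = math.log(self.class_counts[c] / self.total)
            for stream, label in enumerate(hyp):
                num = self.counts[stream][c][label] + self.alpha
                den = self.class_counts[c] + self.alpha * vocab_size
                score += math.log(num / den)
            scores[c] = score
        return scores

    def predict(self, hyp):
        scores = self.log_posterior(hyp)
        return max(scores, key=scores.get)


# Toy aligned data (invented): the first recognizer often mistakes "aa" for "ae",
# while the second recognizer is more reliable on "aa".
train_hyps = [
    ("ae", "aa"), ("ae", "aa"), ("aa", "aa"),   # reference: aa
    ("ae", "ae"), ("ae", "ae"), ("ae", "aa"),   # reference: ae
    ("iy", "iy"), ("iy", "iy"), ("iy", "ae"),   # reference: iy
]
train_truths = ["aa"] * 3 + ["ae"] * 3 + ["iy"] * 3

combiner = NaiveBayesCombiner().fit(train_hyps, train_truths)
print(combiner.predict(("ae", "aa")))  # the recognizers disagree here
```

On this toy data the combiner learns that an "ae" from the first recognizer paired with an "aa" from the second is more likely to be a true "aa", which is the kind of complementary error pattern that output combination is meant to exploit. A decision tree or support vector machine could be trained on the same label pairs in place of the Naive Bayes model.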