Weighting of Mel Sub-bands Based on SNR/Entropy for Robust ASR
H. Yeganeh, S. Ahadi, S. M. Mirrezaie, A. Ziaei
2008 IEEE International Symposium on Signal Processing and Information Technology, December 2008. DOI: 10.1109/ISSPIT.2008.4775710
Mel-frequency cepstral coefficients (MFCC) are the most widely used features for speech recognition. However, MFCC-based speech recognition performance degrades in the presence of additive noise. In this paper, we propose a set of noise-robust features based on the conventional MFCC feature extraction method. The proposed method consists of two steps. In the first step, mel sub-band Wiener filtering is carried out. In the second step, the SNR and the entropy of each sub-band are estimated, and a weight parameter is defined from the sub-band SNR-to-entropy ratio. The weighting gives sub-bands that are less affected by noise a larger role in forming the cepstral parameters. Experimental results indicate that this method leads to improved ASR performance in noisy environments. Furthermore, because the method is simple to implement, its computational overhead relative to MFCC is quite small.
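The two steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the textbook Wiener gain SNR/(1+SNR), Shannon entropy of the normalized power bins inside a sub-band, weights proportional to SNR/entropy and scaled to average 1, and a plain DCT-II over the weighted log energies. All function names are hypothetical and are not taken from the paper.

```python
import math


def wiener_gain(snr):
    """Textbook Wiener gain for a linear (not dB) SNR estimate (assumed form)."""
    return snr / (1.0 + snr)


def subband_entropy(power_bins, eps=1e-12):
    """Shannon entropy (in nats) of one sub-band's power bins,
    normalized so they can be treated as a probability distribution."""
    total = sum(power_bins) + eps
    probs = [p / total for p in power_bins]
    return -sum(p * math.log(p + eps) for p in probs)


def subband_weights(snrs, entropies, eps=1e-12):
    """Weights proportional to SNR/entropy, scaled to average about 1.
    The scaling is an assumption; the paper may normalize differently."""
    ratios = [s / (h + eps) for s, h in zip(snrs, entropies)]
    mean = sum(ratios) / len(ratios) + eps
    return [r / mean for r in ratios]


def weighted_cepstrum(log_energies, weights, n_ceps):
    """Unnormalized DCT-II of the weighted log mel sub-band energies."""
    n = len(log_energies)
    weighted = [w * e for w, e in zip(weights, log_energies)]
    return [
        sum(weighted[j] * math.cos(math.pi * k * (j + 0.5) / n) for j in range(n))
        for k in range(n_ceps)
    ]


if __name__ == "__main__":
    # Toy values for three sub-bands; not data from the paper.
    snrs = [10.0, 2.0, 0.5]                      # linear SNR estimates
    bins = [[4.0, 1.0], [2.0, 2.0], [1.0, 3.0]]  # power bins per sub-band
    entropies = [subband_entropy(b) for b in bins]
    weights = subband_weights(snrs, entropies)
    log_energies = [math.log(sum(b) * wiener_gain(s)) for b, s in zip(bins, snrs)]
    print(weighted_cepstrum(log_energies, weights, n_ceps=2))
```

In this sketch a sub-band with a high SNR and a low entropy receives a weight above 1, so it contributes more to every cepstral coefficient, which matches the intent stated in the abstract even if the paper's actual weighting function differs.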