Robust speaker recognition based on improved GFCC
Xiao-dong Shi, Haiyan Yang, Ping Zhou
2016 2nd IEEE International Conference on Computer and Communications (ICCC), 2016-10-01
DOI: https://doi.org/10.1109/COMPCOMM.2016.7925037
Citations: 14
Abstract
To address the drastic loss of robustness of traditional Mel Frequency Cepstral Coefficient (MFCC) features in speaker recognition systems, an algorithm based on improved Gammatone Frequency Cepstral Coefficients (GFCC) is proposed. GFCC differs from traditional MFCC in that it replaces the Mel filter bank with a Gammatone filter bank to improve robustness. On this basis, the paper applies multitaper estimation, MVA (Mean Subtraction, Variance Normalization, and Autoregressive Moving Average filtering), and other techniques to further enhance robustness, and evaluates the method on the TIMIT speech database. The experimental results show that, across different noise types and SNRs, the proposed improved GFCC achieves the lowest equal error rate and the best robustness; its advantage over the other algorithms is greatest when the SNR is below 10 dB.
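The MVA step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the filter order, the edge handling, or how MVA is combined with the multitaper GFCC front end, so the function name `mva`, the `order` parameter, and the ARMA form used here (the one commonly associated with MVA post-processing) are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def mva(features, order=2, eps=1e-8):
    """Apply mean subtraction, variance normalization, and ARMA smoothing.

    features: array of shape (T, D), one cepstral vector per frame.
    order:    ARMA order M (hypothetical default; not stated in the abstract).
    """
    x = np.asarray(features, dtype=float)

    # Mean subtraction and variance normalization, per feature dimension.
    x = (x - x.mean(axis=0)) / (x.std(axis=0) + eps)

    # ARMA smoothing along time:
    #   y[t] = (y[t-M] + ... + y[t-1] + x[t] + ... + x[t+M]) / (2M + 1)
    # Frames within M of either end are left as the normalized input.
    n_frames = x.shape[0]
    y = x.copy()
    for t in range(order, n_frames - order):
        past = y[t - order:t].sum(axis=0)        # already-filtered outputs
        future = x[t:t + order + 1].sum(axis=0)  # current and upcoming inputs
        y[t] = (past + future) / (2 * order + 1)
    return y
```

In use, `mva` would be applied to the frame-by-coefficient matrix of one utterance (for example, GFCC vectors) before speaker modelling. Normalizing per utterance is itself an assumption of this sketch.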