Comparison of features extracted using time-frequency and frequency-time analysis approach for text-independent speaker identification
Authors: Nirmalya Sen, T. Basu, S. Chakroborty
Venue: 2011 National Conference on Communications (NCC)
Publication date: 2011-03-17
DOI: 10.1109/NCC.2011.5734720 (https://doi.org/10.1109/NCC.2011.5734720)
Citations: 3
Abstract
This paper compares feature sets extracted using the time-frequency analysis approach and the frequency-time analysis approach for text-independent speaker identification. The Mel-frequency cepstral coefficient (MFCC) and Inverted Mel-frequency cepstral coefficient (IMFCC) feature sets are extracted using the time-frequency analysis approach. The Temporal energy subband cepstral coefficient (TESBCC) feature set is extracted using the frequency-time analysis approach. The time-bandwidth products of the MFCC and TESBCC filter banks are compared. The RV coefficient is used to measure the correlation between the feature sets. Experimental evaluation was conducted on the POLYCOST database with 130 speakers using a Gaussian mixture speaker model. The TESBCC feature set has 9.5% higher average accuracy than the MFCC feature set. The feature sets extracted using the time-frequency analysis approach are found to be practically uncorrelated with the feature set extracted using the frequency-time analysis approach. It is also demonstrated that the IMFCC feature set plays an important role in fusion.
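The abstract does not say how the RV coefficient was computed, so the following is only a minimal sketch of its standard definition, not the authors' implementation. For two column-centered feature matrices X and Y with one row per frame, RV = tr(XX'YY') / sqrt(tr((XX')^2) tr((YY')^2)). The function name `rv_coefficient` and the random stand-in data are assumptions made for illustration.

```python
import numpy as np


def rv_coefficient(X, Y):
    """RV coefficient between two feature matrices that share the same rows.

    X has shape (n_frames, p) and Y has shape (n_frames, q); p and q may differ.
    The result lies in [0, 1]; values near 0 mean the two sets are nearly
    uncorrelated, and 1 means one is a scaled, shifted copy of the other.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must have the same number of rows (frames)")

    # Center each feature dimension (column) across frames.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Cross- and auto-scatter matrices; working with these p x q, p x p and
    # q x q matrices avoids forming the n x n products XX' and YY'.
    Sxy = Xc.T @ Yc
    Sxx = Xc.T @ Xc
    Syy = Yc.T @ Yc

    numerator = np.trace(Sxy @ Sxy.T)                              # tr(XX'YY')
    denominator = np.sqrt(np.trace(Sxx @ Sxx) * np.trace(Syy @ Syy))
    return numerator / denominator


# Hypothetical usage: random matrices stand in for two feature sets computed
# on the same 500 frames (for example, an MFCC-like and a TESBCC-like set).
rng = np.random.default_rng(0)
set_a = rng.standard_normal((500, 4))
set_b = rng.standard_normal((500, 3))
print(rv_coefficient(set_a, set_b))
```

Because the coefficient compares the two sets frame by frame, both matrices must be computed on the same frames; how the paper aligned features obtained from the two analysis approaches is not stated in the abstract.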