Chi Zhang, Xiaoqiang Li, Wei Li, Peizhong Lu, Wenqiang Zhang
{"title":"A novel i-vector framework using multiple features and PCA for speaker recognition in short speech condition","authors":"Chi Zhang, Xiaoqiang Li, Wei Li, Peizhong Lu, Wenqiang Zhang","doi":"10.1109/ICALIP.2016.7846558","DOIUrl":null,"url":null,"abstract":"Speaker recognition in short speech condition is a difficult topic because the length of training and test speech is very short. One of the main disadvantage of the existing methods for speaker recognition is that they need very sufficient data and it's usually impossible in reality applications. In our experiments, the conventional methods with single feature don't make good performance in short speech. We propose a novel i-vector framework using multiple features and Principal Component Analysis (PCA) in short speech condition to overcome this difficulty, as multiple features combination can represent more aspects of a speaker. PCA is used to map the multiple features to an uncorrelated and orthogonal basis set to meet the requirements of Gaussian Mixture Model (GMM) with diagonal covariance matrices and i-vector. Improvement from the proposed approach compared to a state-of-the-art system are of roughly 50% relative at equal error rate when evaluated on the telephone conditions from the 2010 NIST speaker recognition evaluation (SRE).","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"165 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICALIP.2016.7846558","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Speaker recognition in short speech condition is a difficult topic because the length of training and test speech is very short. One of the main disadvantage of the existing methods for speaker recognition is that they need very sufficient data and it's usually impossible in reality applications. In our experiments, the conventional methods with single feature don't make good performance in short speech. We propose a novel i-vector framework using multiple features and Principal Component Analysis (PCA) in short speech condition to overcome this difficulty, as multiple features combination can represent more aspects of a speaker. PCA is used to map the multiple features to an uncorrelated and orthogonal basis set to meet the requirements of Gaussian Mixture Model (GMM) with diagonal covariance matrices and i-vector. Improvement from the proposed approach compared to a state-of-the-art system are of roughly 50% relative at equal error rate when evaluated on the telephone conditions from the 2010 NIST speaker recognition evaluation (SRE).