Authors: R. Hanifa, I. K., M. S.
Published in: 2020 IEEE Student Conference on Research and Development (SCOReD), 2020-09-27
DOI: https://doi.org/10.1109/SCOReD50371.2020.9250938
Comparative Analysis on Different Cepstral Features for Speaker Identification Recognition
Speaker recognition is an artificial intelligence (AI) technology that lets a machine process, interpret and respond to human language. In this work, a database of recorded speech, compiled from a collection of audio recordings, is used. Two cepstral features are extracted: Mel-frequency cepstral coefficients (MFCC) and gammatone frequency cepstral coefficients (GFCC). The extracted features are then used to train, validate and test the classifier. A Support Vector Machine (SVM) is the classifier used to develop the speaker identification system. It is trained to classify the input speech into one of four ethnicity classes: Malay, Chinese, Indian or Bumiputera. The results compare the two types of cepstral features extracted from the same speech utterances. Finally, a comparative analysis of the speaker identification system is made with respect to the features and the classifier. The results show that combining GFCC and pitch as the feature vector (Model 4) produced the highest accuracy, 86.1%.
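To make the notion of a cepstral feature concrete, the following is a minimal sketch of how MFCCs could be computed for a single speech frame: power spectrum, triangular mel filterbank, log energies, then a DCT-II. It is not the authors' implementation. The abstract gives no extraction parameters, so the function names, the 20 filters, the 13 coefficients, the Hamming window and the 8 kHz sample rate are all assumptions chosen for illustration. A GFCC front end would follow the same outline but replace the mel-spaced triangular filters with a gammatone filterbank.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (num_filters + 1))
                for i in range(num_filters + 2)]
    # Fractional bin positions avoid zero-width filters at low frequencies.
    bins = [hz * n_fft / sample_rate for hz in edges_hz]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, num_filters=20, num_coeffs=13):
    """Return num_coeffs cepstral coefficients for one frame (assumed sizes)."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Small constant guards against log(0) for silent bands.
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
             for row in bank]
    # DCT-II decorrelates the log energies; keep the first num_coeffs terms.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_e))
            for c in range(num_coeffs)]


if __name__ == "__main__":
    sr = 8000  # assumed sample rate
    tone = [math.sin(2 * math.pi * 440 * i / sr) for i in range(256)]
    print(mfcc_frame(tone, sr))
```

In a pipeline like the one described above, per-frame vectors of this kind (optionally concatenated with a pitch estimate) would be passed to the SVM classifier. The naive DFT keeps the sketch dependency-free; a real system would use an FFT.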