Performance Evaluation of Feature Extraction and Modeling Methods for Speaker Recognition
Mustafa Yankayis
Annals of reviews and research, published 2018-11-19
DOI: 10.19080/arr.2018.04.555639
This study evaluates the performance of prominent feature extraction and modeling methods for speaker recognition on a purpose-built database whose distinguishing feature is that the subjects are siblings or relatives. After basic information about speaker recognition systems is given, the notable properties of the methods are briefly described. Linear Predictive Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC) are used for feature extraction, and Gaussian Mixture Model (GMM) and I-Vector methods are used for modeling. The parameters of these methods are varied in search of the best results: the number of features for LPCC and MFCC, and the number of mixture components for GMM. The aim is to find out which parameters of these commonly used methods contribute to success and, at the same time, to determine the best combination of feature extraction and modeling methods for speakers with similar voices. The study is also intended as a resource and guide for researchers in the area of speaker recognition.
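The kind of experiment the abstract describes, varying the number of GMM mixture components and comparing identification results, can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: it assumes NumPy, diagonal-covariance GMMs trained with plain EM, and synthetic feature vectors standing in for LPCC or MFCC frames. The speaker names, dimensions, frame counts, and component counts are all hypothetical.

```python
import numpy as np

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def log_gauss_diag(X, means, variances):
    """Log density of every frame under every diagonal Gaussian, shape (N, K)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2 * np.pi * variances), axis=1))

def fit_gmm(X, n_components, n_iter=30, seed=0):
    """Fit a diagonal-covariance GMM with a fixed number of EM iterations."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, n_components, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + 1e-3, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame
        log_p = np.log(weights) + log_gauss_diag(X, means, variances)
        resp = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])
        # M-step: re-estimate weights, means, and (floored) variances
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = (resp.T @ X ** 2) / nk[:, None] - means ** 2 + 1e-3
    return weights, means, variances

def avg_log_likelihood(X, gmm):
    """Average per-frame log-likelihood of an utterance under one speaker model."""
    weights, means, variances = gmm
    return float(np.mean(logsumexp(
        np.log(weights) + log_gauss_diag(X, means, variances), axis=1)))

def identification_accuracy(train, test_utts, n_components):
    """Train one GMM per speaker, then pick the best-scoring model per utterance."""
    models = {spk: fit_gmm(X, n_components) for spk, X in train.items()}
    correct = 0
    for true_spk, X in test_utts:
        guess = max(models, key=lambda s: avg_log_likelihood(X, models[s]))
        correct += guess == true_spk
    return correct / len(test_utts)

def synthetic_speaker(rng, centers, n_frames):
    """Hypothetical stand-in for cepstral frames: noisy draws around a few centers."""
    idx = rng.integers(0, len(centers), size=n_frames)
    return centers[idx] + rng.normal(scale=0.5, size=(n_frames, centers.shape[1]))

rng = np.random.default_rng(0)
dim = 12  # assumed number of cepstral coefficients per frame
base = rng.normal(scale=3.0, size=(4, dim))
# Two "related" speakers: the second is a perturbed copy of the first.
centers = {"speaker_a": base,
           "speaker_b": base + rng.normal(scale=1.0, size=base.shape)}
train = {s: synthetic_speaker(rng, c, 600) for s, c in centers.items()}
test_utts = [(s, synthetic_speaker(rng, c, 200))
             for s, c in centers.items() for _ in range(10)]

# Sweep the number of mixture components, as the study does for GMM.
results = {k: identification_accuracy(train, test_utts, k) for k in (1, 2, 4, 8)}
for k, acc in results.items():
    print(f"{k} components: accuracy {acc:.2f}")
```

With real recordings, the synthetic frames would be replaced by LPCC or MFCC vectors extracted from each utterance, and the sweep would be repeated over the number of features as well as the number of mixture components.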