Text-Independent Speaker Identification Using PCA-SVM Model
Muhammad Farin Akhsanta, S. Suyanto
2020 3rd International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), 2020-12-10
DOI: 10.1109/ISRITI51436.2020.9315412
Speaker identification systems are widely applied in various fields to determine a person's identity by analyzing the sound signal that person produces, independently of any particular text. The challenge is to differentiate the voice characteristics of speakers, such as intonation style, rhythm, pronunciation pattern, accent, and vocabulary. In this paper, a speaker identification system using Principal Component Analysis (PCA) and Support Vector Machine (SVM) is developed, with Mel Frequency Cepstral Coefficients (MFCC) used for feature extraction. The system is then evaluated on unseen noisy utterances at various signal-to-noise ratios (SNR). The evaluation uses a confusion matrix to calculate accuracy, precision, and recall, which determine the relevance of the system's outputs. Experimental results show that the developed system is quite robust. At a low noise level with an SNR of 15 dB, it identifies speakers with high performance: an accuracy of 88.97%, a precision of 91.87%, and a recall of 94.39%. The performance slowly decreases as the noise level increases. At high noise levels with SNR down to 0 dB, it is still able to recognize the unseen speakers with an average accuracy of 70.93%, precision of 74.68%, and recall of 83.51%.
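The evaluation protocol described in the abstract (noisy utterances at a chosen SNR, scored from a confusion matrix) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' code: the function names `scale_noise_to_snr`, `measure_snr_db`, and `macro_metrics` are hypothetical, the noise is synthetic Gaussian noise, and precision and recall are macro-averaged over speakers, which is one common convention and may differ from the averaging used in the paper.

```python
import math
import random


def scale_noise_to_snr(signal, noise, target_snr_db):
    """Scale noise so that signal power / noise power matches target_snr_db."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for the noise power.
    target_p_noise = p_signal / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_p_noise / p_noise)
    return [gain * n for n in noise]


def measure_snr_db(signal, noise):
    """Return the SNR in dB between a clean signal and a noise sequence."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10 * math.log10(p_signal / p_noise)


def macro_metrics(confusion):
    """Compute accuracy, macro precision, and macro recall.

    confusion[i][j] counts utterances of true speaker i predicted as speaker j.
    Macro averaging is an assumption; the paper does not state its convention.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precisions, recalls = [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precisions.append(tp / predicted_k if predicted_k else 0.0)
        recalls.append(tp / actual_k if actual_k else 0.0)
    return correct / total, sum(precisions) / n, sum(recalls) / n


if __name__ == "__main__":
    random.seed(0)
    # Synthetic stand-ins for a clean utterance and a noise recording.
    signal = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
    noise = [random.gauss(0.0, 1.0) for _ in range(1000)]

    scaled = scale_noise_to_snr(signal, noise, 15.0)
    noisy = [s + n for s, n in zip(signal, scaled)]  # utterance fed to the model
    print("measured SNR (dB):", round(measure_snr_db(signal, scaled), 2))

    # Hypothetical two-speaker confusion matrix, not data from the paper.
    accuracy, precision, recall = macro_metrics([[8, 2], [1, 9]])
    print("accuracy, precision, recall:", accuracy, precision, recall)
```

In a full pipeline, the noisy utterance would pass through MFCC extraction, PCA projection, and the SVM classifier before its prediction is counted in the confusion matrix; those stages are left out here because the abstract gives no parameters for them.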