Text-Independent Speaker Identification Using PCA-SVM Model

Muhammad Farin Akhsanta, S. Suyanto
Published in: 2020 3rd International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), 2020-12-10
DOI: 10.1109/ISRITI51436.2020.9315412 (https://doi.org/10.1109/ISRITI51436.2020.9315412)

Abstract

Speaker identification systems are widely applied in various fields to determine a person's identity from the speech signal that person produces, independently of any particular text. The challenge is to differentiate the voice characteristics of each speaker, such as intonation style, rhythm, pronunciation pattern, accent, and vocabulary. In this paper, a speaker identification system using Principal Component Analysis (PCA) and a Support Vector Machine (SVM) is developed, with Mel Frequency Cepstral Coefficients (MFCC) used for feature extraction. The system is evaluated on unseen noisy utterances at various signal-to-noise ratios (SNR). The evaluation uses a confusion matrix to calculate accuracy, precision, and recall, which indicate the relevance of the system's output. Experimental results show that the developed system is quite robust. At a low noise level with an SNR of 15 dB, it identifies speakers with an accuracy of 88.97%, a precision of 91.87%, and a recall of 94.39%. The performance decreases slowly as the noise level increases. At a high noise level with an SNR of up to 0 dB, it is still able to recognize the unseen speakers with an average accuracy of 70.93%, precision of 74.68%, and recall of 83.51%.
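The evaluation procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`mix_at_snr`, `measured_snr_db`, `macro_metrics`), the choice of additive noise scaled by signal power, and the macro-averaging of precision and recall over speakers are all assumptions made for illustration, since the abstract does not state how noise was added or how the per-speaker scores were averaged.

```python
import math
import random


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean` after scaling it to a target SNR in dB.

    Assumption: SNR is defined on average signal power,
    SNR_dB = 10 * log10(P_clean / P_noise).
    """
    p_clean = sum(x * x for x in clean) / len(clean)
    p_noise = sum(n * n for n in noise) / len(noise)
    # Noise power required to reach the requested SNR.
    target_noise_power = p_clean / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    return [x + gain * n for x, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """Recover the SNR of `noisy` given the known clean signal."""
    p_clean = sum(x * x for x in clean)
    p_resid = sum((y - x) ** 2 for x, y in zip(clean, noisy))
    return 10 * math.log10(p_clean / p_resid)


def macro_metrics(cm):
    """Accuracy, macro precision, macro recall from a confusion matrix.

    cm[i][j] counts utterances of true speaker i predicted as speaker j.
    Macro-averaging over speakers is an assumption, not the paper's
    stated scheme.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(k)) / total
    precisions, recalls = [], []
    for i in range(k):
        predicted_i = sum(cm[r][i] for r in range(k))  # column sum
        actual_i = sum(cm[i])                          # row sum
        precisions.append(cm[i][i] / predicted_i if predicted_i else 0.0)
        recalls.append(cm[i][i] / actual_i if actual_i else 0.0)
    return accuracy, sum(precisions) / k, sum(recalls) / k


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical one-second 220 Hz tone at 16 kHz standing in for speech.
    clean = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    for snr in (15, 10, 5, 0):
        noisy = mix_at_snr(clean, noise, snr)
        print(snr, round(measured_snr_db(clean, noisy), 3))

    # Hypothetical 3-speaker confusion matrix, not data from the paper.
    print(macro_metrics([[9, 1, 0], [2, 7, 1], [0, 0, 10]]))
```

In a full pipeline, each noisy utterance would be converted to MFCC features, projected with PCA, and classified by the SVM; the predicted and true speaker labels would then fill the confusion matrix passed to `macro_metrics`.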