Enhancing Speaker Recognition Models with Noise-Resilient Feature Optimization Strategies

Acoustics · Published: 2024-05-14 · DOI: 10.3390/acoustics6020024
Neha Chauhan, T. Isshiki, Dongju Li
{"title":"Enhancing Speaker Recognition Models with Noise-Resilient Feature Optimization Strategies","authors":"Neha Chauhan, T. Isshiki, Dongju Li","doi":"10.3390/acoustics6020024","DOIUrl":null,"url":null,"abstract":"This paper delves into an in-depth exploration of speaker recognition methodologies, with a primary focus on three pivotal approaches: feature-level fusion, dimension reduction employing principal component analysis (PCA) and independent component analysis (ICA), and feature optimization through a genetic algorithm (GA) and the marine predator algorithm (MPA). This study conducts comprehensive experiments across diverse speech datasets characterized by varying noise levels and speaker counts. Impressively, the research yields exceptional results across different datasets and classifiers. For instance, on the TIMIT babble noise dataset (120 speakers), feature fusion achieves a remarkable speaker identification accuracy of 92.7%, while various feature optimization techniques combined with K nearest neighbor (KNN) and linear discriminant (LD) classifiers result in a speaker verification equal error rate (SV EER) of 0.7%. Notably, this study achieves a speaker identification accuracy of 93.5% and SV EER of 0.13% on the TIMIT babble noise dataset (630 speakers) using a KNN classifier with feature optimization. On the TIMIT white noise dataset (120 and 630 speakers), speaker identification accuracies of 93.3% and 83.5%, along with SV EER values of 0.58% and 0.13%, respectively, were attained utilizing PCA dimension reduction and feature optimization techniques (PCA-MPA) with KNN classifiers. Furthermore, on the voxceleb1 dataset, PCA-MPA feature optimization with KNN classifiers achieves a speaker identification accuracy of 95.2% and an SV EER of 1.8%. These findings underscore the significant enhancement in computational speed and speaker recognition performance facilitated by feature optimization strategies.","PeriodicalId":502373,"journal":{"name":"Acoustics","volume":"35 3","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Acoustics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/acoustics6020024","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

This paper explores speaker recognition methodologies, focusing on three approaches: feature-level fusion; dimension reduction using principal component analysis (PCA) and independent component analysis (ICA); and feature optimization with a genetic algorithm (GA) and the marine predator algorithm (MPA). The study conducts experiments across speech datasets with varying noise levels and speaker counts, and reports strong results across datasets and classifiers. On the TIMIT babble noise dataset (120 speakers), feature fusion achieves a speaker identification accuracy of 92.7%, while feature optimization techniques combined with K-nearest neighbor (KNN) and linear discriminant (LD) classifiers yield a speaker verification equal error rate (SV EER) of 0.7%. On the TIMIT babble noise dataset (630 speakers), a KNN classifier with feature optimization achieves a speaker identification accuracy of 93.5% and an SV EER of 0.13%. On the TIMIT white noise dataset (120 and 630 speakers), PCA dimension reduction combined with MPA feature optimization (PCA-MPA) and KNN classifiers attains speaker identification accuracies of 93.3% and 83.5%, with SV EER values of 0.58% and 0.13%, respectively. Furthermore, on the VoxCeleb1 dataset, PCA-MPA feature optimization with KNN classifiers achieves a speaker identification accuracy of 95.2% and an SV EER of 1.8%. These findings underscore the gains in computational speed and speaker recognition performance afforded by feature optimization strategies.
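To make the evaluated pipeline concrete, the following is a minimal, hypothetical sketch of the PCA-plus-KNN stage and an EER computation, written in Python with NumPy and scikit-learn on synthetic features. It does not reproduce the paper's feature fusion, ICA, or GA/MPA optimization, and every size, name, and scoring choice below is an illustrative assumption rather than the authors' implementation.

```python
# Illustrative sketch (not the paper's code): PCA dimension reduction followed by a
# KNN classifier for closed-set speaker identification, plus an approximate equal
# error rate (EER) for verification-style trials. Real acoustic features (e.g., a
# fused MFCC-based representation) are replaced here by random data so the example
# stays self-contained and runnable.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
n_speakers, utt_per_speaker, feat_dim = 20, 30, 60  # stand-in sizes, not from the paper
X = rng.normal(size=(n_speakers * utt_per_speaker, feat_dim))
y = np.repeat(np.arange(n_speakers), utt_per_speaker)
# Give each speaker a distinct offset so the toy data is separable.
X += np.repeat(rng.normal(scale=2.0, size=(n_speakers, feat_dim)), utt_per_speaker, axis=0)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

# Dimension reduction: keep enough principal components to explain 95% of the variance.
pca = PCA(n_components=0.95).fit(X_tr)
X_tr_p, X_te_p = pca.transform(X_tr), pca.transform(X_te)

# Speaker identification with a KNN classifier on the reduced features.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr_p, y_tr)
print(f"Speaker ID accuracy (toy data): {knn.score(X_te_p, y_te):.3f}")

# Verification-style trials: score every test utterance against every enrolled speaker,
# marking a trial genuine when the claimed speaker matches the true one.
proba = knn.predict_proba(X_te_p)                      # shape (n_test, n_speakers)
claimed = np.tile(knn.classes_, len(y_te))
scores = proba.ravel()
truth = (np.repeat(y_te, len(knn.classes_)) == claimed).astype(int)

fpr, tpr, _ = roc_curve(truth, scores)
fnr = 1 - tpr
eer = fpr[np.nanargmin(np.abs(fnr - fpr))]             # EER: point where FAR ≈ FRR
print(f"Approximate EER (toy data): {eer:.3f}")
```

In this sketch the MPA/GA feature-selection step is simply omitted; in the paper's pipeline it would filter or weight the feature set before the classifier, which is what drives the reported reduction in dimensionality and compute cost.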