Optimization the Accuracy of FFNN Based Speaker Recognition System Using PSO Algorithm

International Journal on Communications Antenna and Propagation (JCR: Q2, Engineering) | Pub Date: 2021-08-31 | DOI: 10.15866/irecap.v11i4.19883

A. M. Aaref, Zuhair Shakor Mahmood


Citations: 4

Abstract



Speaker recognition systems learn a model of a speaker's voice from processed audio recordings. Speech is a time-varying signal whose frequency content changes continuously, and many of its attributes are uncertain, so traditional techniques such as zero-crossing analysis and the Fourier transform are not up to the task. This work has two aims. The first is speaker identification that is resistant to noise. Whereas most prior solutions relied on modified mel-frequency cepstral coefficients (MFCCs) together with a fundamental-frequency feature coefficient, this proposal integrates both modifications with a new cepstrum component. To construct the feature matrix, feature extraction techniques are applied to 250 speech samples fed into the system. The matrix is used to train each algorithm, which is then evaluated on a portion of the data (30% of the total data in the feature matrix). Speaker recognition models with improved accuracy are developed by studying the algorithms in depth. Two metrics are computed for each algorithm: recognition accuracy and the time required to reach that accuracy. Compared with previous research, the feed-forward neural network (FFNN) optimized with particle swarm optimization (PSO) performs better. This model correctly identifies 96% of the inputs with less processing time. The findings suggest that the PSO-based optimization is most likely responsible for the higher accuracy seen in speaker identification.
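The abstract does not state the network size, the PSO parameters, or the details of the feature pipeline, so the following is only a minimal sketch of the general idea: a one-hidden-layer FFNN whose weights are searched by a standard global-best PSO, trained on 70% of a 250-row feature matrix and scored on the remaining 30%. Everything specific here is an assumption for illustration: the synthetic Gaussian features stand in for the MFCC and fundamental-frequency features, the 30% portion is treated as a held-out test split, and the names (`make_data`, `pso_train`, `error_rate`), layer sizes, and coefficients `w`, `c1`, `c2` are hypothetical rather than the authors' choices.

```python
import math
import random

def make_data(n=250, n_speakers=3, n_features=4, seed=0):
    """Synthetic stand-in for the feature matrix: one Gaussian cluster per speaker."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % n_speakers
        feats = [rng.gauss(2.0 * label, 1.0) for _ in range(n_features)]
        rows.append((feats, label))
    rng.shuffle(rows)
    return rows

def split(rows, test_fraction=0.3):
    """Return (train, test); the first 30% of the shuffled rows are held out."""
    n_test = int(round(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]

def forward(weights, x, n_in, n_hidden, n_out):
    """One-hidden-layer FFNN; `weights` is a flat list (one PSO particle position)."""
    idx, hidden, scores = 0, [], []
    for _ in range(n_hidden):
        s = weights[idx + n_in]                      # bias term
        for j in range(n_in):
            s += weights[idx + j] * x[j]
        hidden.append(math.tanh(s))
        idx += n_in + 1
    for _ in range(n_out):
        s = weights[idx + n_hidden]                  # bias term
        for j in range(n_hidden):
            s += weights[idx + j] * hidden[j]
        scores.append(s)
        idx += n_hidden + 1
    return scores

def error_rate(weights, rows, dims):
    """Fraction of rows whose highest-scoring class differs from the label."""
    wrong = 0
    for x, label in rows:
        scores = forward(weights, x, *dims)
        if scores.index(max(scores)) != label:
            wrong += 1
    return wrong / len(rows)

def pso_train(train, dims, n_particles=15, n_iter=20, w=0.7, c1=1.5, c2=1.5, seed=1):
    """Global-best PSO over the flat weight vector, minimising training error."""
    rng = random.Random(seed)
    n_in, n_hidden, n_out = dims
    dim = n_hidden * (n_in + 1) + n_out * (n_hidden + 1)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_err = [error_rate(p, train, dims) for p in pos]
    g = min(range(n_particles), key=pbest_err.__getitem__)
    gbest, gbest_err = pbest[g][:], pbest_err[g]
    history = [gbest_err]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull towards personal best + pull towards global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            err = error_rate(pos[i], train, dims)
            if err < pbest_err[i]:
                pbest[i], pbest_err[i] = pos[i][:], err
                if err < gbest_err:
                    gbest, gbest_err = pos[i][:], err
        history.append(gbest_err)
    return gbest, history

rows = make_data()
train, test = split(rows)            # 175 training rows, 75 test rows
dims = (4, 5, 3)                     # inputs, hidden units, speakers (all assumed)
best, history = pso_train(train, dims)
test_acc = 1.0 - error_rate(best, test, dims)
print(len(train), len(test), round(test_acc, 3))
```

Using the misclassification rate directly as the fitness is possible because PSO needs no gradients; a smoother loss such as cross-entropy would be a reasonable alternative. The accuracy printed for this toy data says nothing about the 96% reported in the abstract.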


Source journal

International Journal on Communications Antenna and Propagation (Engineering-Media Technology)

CiteScore: 2.90

Self-citation rate: 0.00%

Journal description: The International Journal on Communications Antenna and Propagation (IRECAP) is a peer-reviewed journal that publishes original theoretical and applied papers on all aspects of Communications, Antenna, Propagation and networking technologies.