Optimization the Accuracy of FFNN Based Speaker Recognition System Using PSO Algorithm

A. M. Aaref, Zuhair Shakor Mahmood
DOI: 10.15866/irecap.v11i4.19883 (https://doi.org/10.15866/irecap.v11i4.19883)
Journal: International Journal on Communications Antenna and Propagation
Volume (as listed): 163 1
Published: 2021-08-31
Publication type: Journal Article
Open access: No
JCR: Q2 (Engineering)
Indexed via: Semanticscholar
Citations: 4

Abstract

Speaker recognition systems build a model of a speaker's voice by processing audio recordings. A speech signal is a time-varying signal whose frequency content changes continuously. Because speech has many uncertain attributes, traditional techniques such as zero-crossing analysis and the Fourier transform are not up to the task. This work has two aims. The first is speaker identification that is resistant to noise. Whereas most prior solutions rely either on modified mel-frequency cepstral coefficients (MFCCs) or on a fundamental-frequency feature coefficient, this proposal integrates both modifications with a new cepstrum component. To construct the feature matrix, feature extraction techniques are applied to 250 speech imprints fed to the system. The matrix is used to train each algorithm, and each one is then evaluated on a subset of the data (30% of the data in the feature matrix). Speaker recognition models with improved accuracy are developed by studying the algorithms in depth. Two metrics are computed for each algorithm: recognition accuracy and the time required to achieve that accuracy. Compared with previous research, the feed-forward neural network (FFNN) trained with particle swarm optimization (PSO) performs better. This model correctly identifies 96% of the inputs with less processing time. According to the findings, particle swarm optimization is most likely responsible for the higher accuracy in speaker identification.
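The general idea of training a feed-forward network with particle swarm optimization can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the network size, the PSO coefficients, the number of speakers, or the feature dimension, so every such value below is an assumption, and the 250 feature rows are synthetic stand-ins for the 250 speech imprints. Each particle is one flat weight vector of a one-hidden-layer network, and the swarm minimizes cross-entropy on the training 70% of the feature matrix before being scored on the remaining 30%.

```python
import numpy as np

def n_params(dims):
    """Number of weights and biases in a one-hidden-layer network."""
    n_in, n_hidden, n_out = dims
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out

def forward(w, X, dims):
    """Unpack flat vector w (one PSO particle) and return class logits."""
    n_in, n_hidden, n_out = dims
    i = 0
    W1 = w[i:i + n_in * n_hidden].reshape(n_in, n_hidden); i += n_in * n_hidden
    b1 = w[i:i + n_hidden]; i += n_hidden
    W2 = w[i:i + n_hidden * n_out].reshape(n_hidden, n_out); i += n_hidden * n_out
    b2 = w[i:i + n_out]
    return np.tanh(X @ W1 + b1) @ W2 + b2

def loss(w, X, y, dims):
    """Mean cross-entropy, computed with a numerically stable log-softmax."""
    logits = forward(w, X, dims)
    logits = logits - logits.max(axis=1, keepdims=True)
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-logp[np.arange(len(y)), y].mean())

def accuracy(w, X, y, dims):
    return float((forward(w, X, dims).argmax(axis=1) == y).mean())

def pso_train(X, y, dims, n_particles=30, n_iter=200,
              inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over the network weights (coefficients are assumed)."""
    rng = np.random.default_rng(seed)
    pos = rng.normal(0.0, 1.0, (n_particles, n_params(dims)))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([loss(p, X, y, dims) for p in pos])
    g = pbest[pbest_cost.argmin()].copy()
    g_cost = float(pbest_cost.min())
    history = [g_cost]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Velocity: inertia + pull toward personal best + pull toward global best.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = pos + vel
        cost = np.array([loss(p, X, y, dims) for p in pos])
        improved = cost < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = cost[improved]
        if pbest_cost.min() < g_cost:
            g_cost = float(pbest_cost.min())
            g = pbest[pbest_cost.argmin()].copy()
        history.append(g_cost)
    return g, history

# Synthetic stand-in for the feature matrix: 250 rows, 13 features,
# 5 hypothetical speakers (all three numbers except 250 are assumptions).
data_rng = np.random.default_rng(0)
centers = data_rng.normal(0.0, 3.0, (5, 13))
y = data_rng.integers(0, 5, 250)
X = centers[y] + data_rng.normal(0.0, 1.0, (250, 13))

# Hold out 30% of the rows for evaluation, as described in the abstract.
idx = np.random.default_rng(1).permutation(len(y))
n_test = len(y) * 30 // 100
test_idx, train_idx = idx[:n_test], idx[n_test:]

dims = (X.shape[1], 8, 5)  # 8 hidden units is an arbitrary choice
best_w, history = pso_train(X[train_idx], y[train_idx], dims)
train_acc = accuracy(best_w, X[train_idx], y[train_idx], dims)
test_acc = accuracy(best_w, X[test_idx], y[test_idx], dims)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Because PSO needs only loss values and no gradients, the same loop would work with a non-differentiable objective such as classification error; the numbers printed here describe the synthetic data only and say nothing about the 96% figure reported in the abstract.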
Source journal
CiteScore: 2.90
Self-citation rate: 0.00%
Articles published: 17
About the journal: The International Journal on Communications Antenna and Propagation (IRECAP) is a peer-reviewed journal that publishes original theoretical and applied papers on all aspects of communications, antenna, propagation, and networking technologies.