Optimization the Accuracy of FFNN Based Speaker Recognition System Using PSO Algorithm
Authors: A. M. Aaref, Zuhair Shakor Mahmood
Journal: International Journal on Communications Antenna and Propagation
Published: 2021-08-31
DOI: https://doi.org/10.15866/irecap.v11i4.19883
Speaker recognition systems build a model of a speaker's voice by taking an audio recording as input and processing it. A speech signal is a time-varying signal whose frequencies change continuously. Speech has many uncertain attributes, so traditional speech recognition techniques such as zero crossings and the Fourier Transform are not up to the task. This work has two aims. The first is speaker identification technology that is resistant to noise. While most prior solutions have relied on changing the mel frequency cepstrum coefficients with a fundamental frequency feature coefficient, this proposal integrates both of these modifications with a new cepstrum component. To construct the feature matrix, feature extraction techniques are applied to 250 speech imprints fed to the system. The matrix is used to train each algorithm, and each one is then evaluated on a portion of the data (thirty percent of the total data in the feature matrix). Speaker recognition models with improved accuracy are developed by studying the algorithms in depth. Two metrics are computed for each algorithm: recognition accuracy and the time required to achieve that accuracy. Compared with previous research, the findings show that the Feed Forward Neural Network (FFNN) optimized with Particle Swarm Optimization (PSO) performs better. This model correctly identifies 96% of the inputs with less processing time. According to the findings, optimization with PSO is most likely responsible for the higher accuracy seen in speaker identification.
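The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: a small feed-forward network whose weights are searched by a standard global-best PSO, trained on 70% of a 250-row feature matrix and evaluated on the remaining 30%. Everything beyond the 250 samples and the 30% evaluation share is an assumption made for illustration: the synthetic features, the 13 features per sample, the 5 speakers, the 8 hidden units, the PSO settings (`w`, `c1`, `c2`, swarm size, iterations), and all function names. It is not the authors' method and does not reproduce their reported accuracy.

```python
import numpy as np

def make_synthetic_features(n_samples=250, n_features=13, n_speakers=5, seed=0):
    """Hypothetical stand-in for the feature matrix: one Gaussian cluster per speaker."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 3.0, (n_speakers, n_features))
    y = rng.integers(0, n_speakers, n_samples)
    X = centers[y] + rng.normal(0.0, 1.0, (n_samples, n_features))
    return X, y

def forward(theta, X, shape):
    """One-hidden-layer FFNN; theta is a flat vector holding all weights and biases."""
    n_in, n_hid, n_out = shape
    i = n_in * n_hid
    W1 = theta[:i].reshape(n_in, n_hid)
    b1 = theta[i:i + n_hid]
    i += n_hid
    W2 = theta[i:i + n_hid * n_out].reshape(n_hid, n_out)
    b2 = theta[i + n_hid * n_out:]
    logits = np.tanh(X @ W1 + b1) @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def loss(theta, X, y, shape):
    """Mean cross-entropy of the network on (X, y); this is the PSO fitness."""
    p = forward(theta, X, shape)
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

def pso_train(X, y, shape, n_particles=30, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over the flat weight vector (assumed hyperparameters)."""
    rng = np.random.default_rng(seed)
    n_in, n_hid, n_out = shape
    dim = n_in * n_hid + n_hid + n_hid * n_out + n_out
    pos = rng.normal(0.0, 1.0, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_loss = np.array([loss(p, X, y, shape) for p in pos])
    g = pbest_loss.argmin()
    gbest, gbest_loss = pbest[g].copy(), pbest_loss[g]
    history = [gbest_loss]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Inertia term plus pulls toward each particle's best and the swarm's best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        cur = np.array([loss(p, X, y, shape) for p in pos])
        improved = cur < pbest_loss
        pbest[improved] = pos[improved]
        pbest_loss[improved] = cur[improved]
        g = pbest_loss.argmin()
        if pbest_loss[g] < gbest_loss:
            gbest, gbest_loss = pbest[g].copy(), pbest_loss[g]
        history.append(gbest_loss)
    return gbest, history

X, y = make_synthetic_features()
idx = np.random.default_rng(1).permutation(len(y))
n_test = round(0.3 * len(y))                     # thirty percent held out for evaluation
test_idx, train_idx = idx[:n_test], idx[n_test:]
shape = (X.shape[1], 8, 5)                       # inputs, hidden units, speakers (assumed)
theta, history = pso_train(X[train_idx], y[train_idx], shape)
pred = forward(theta, X[test_idx], shape).argmax(axis=1)
test_acc = float((pred == y[test_idx]).mean())
print(f"final training loss: {history[-1]:.4f}, held-out accuracy: {test_acc:.2%}")
```

Because PSO only needs fitness evaluations, the network is trained here without gradients; each particle is simply one candidate weight vector. The printed accuracy reflects the synthetic data only and says nothing about real speech features.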
About the journal:
The International Journal on Communications Antenna and Propagation (IRECAP) is a peer-reviewed journal that publishes original theoretical and applied papers on all aspects of Communications, Antenna, Propagation and networking technologies.