A New Approach For Age Estimation System Based on Speech Signals

Armagan Fidan, Rabia Ozge Bircan, S. Karamzadeh
Published in: 2021 5th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT)
Publication date: 2021-10-21
DOI: 10.1109/ISMSIT52890.2021.9604611 (https://doi.org/10.1109/ISMSIT52890.2021.9604611)
Citations: 2

Abstract

Advances in technology have driven progress in many fields, and age estimation from the human voice is a research area that has recently grown in popularity. Voice-based age recognition has been applied to security problems and in the advertising sector. Sound is structurally complex, but its characteristic features can be extracted. The system presented here estimates age from human speech without being given any gender information. Two of the most popular audio feature extraction methods, Mel-Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP), were used in this study, along with Chroma features. The study aims to get the most out of the voice features by combining different feature extractors and rearranging the dataset according to feature importance. To this end, eight age groups were formed from a dataset containing different speakers, and a Multi-Layer Perceptron (MLP) classifier was used. The system was built on the Mozilla open-source dataset, and the highest age classification accuracy observed was 94.34%, which the authors report as the highest score in the literature.
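The abstract does not say how feature importance was computed, so the following is only a rough sketch of the "rearrange the dataset by feature importance" step under a stated assumption: each feature column is scored with a Fisher-style ratio (between-class variance over within-class variance), and the columns are then reordered from most to least important. The function names, the toy feature values, and the two group labels are hypothetical and stand in for real MFCC, PLP, and Chroma values and the eight age groups.

```python
from statistics import mean, pvariance


def fisher_scores(rows, labels):
    """Score each feature column: between-class spread / within-class spread.

    ASSUMPTION: this scoring rule is illustrative; the paper's abstract
    does not specify which importance measure was used.
    """
    classes = sorted(set(labels))
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        overall = mean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, y in zip(column, labels) if y == c]
            between += len(values) * (mean(values) - overall) ** 2
            within += len(values) * pvariance(values)
        # A column with zero within-class spread separates classes perfectly.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def reorder_by_importance(rows, labels):
    """Return the column order (most important first) and the reordered rows."""
    scores = fisher_scores(rows, labels)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    reordered = [[row[j] for j in order] for row in rows]
    return order, reordered


# Toy data: three hypothetical features per utterance, two hypothetical groups.
rows = [
    [0.1, 5.0, 1.0],
    [0.2, 5.1, 3.0],
    [0.1, 9.0, 2.0],
    [0.2, 9.1, 4.0],
]
labels = ["group_a", "group_a", "group_b", "group_b"]

order, reordered = reorder_by_importance(rows, labels)
print(order)  # [1, 2, 0]
```

In this toy data the second column separates the two groups almost perfectly, so it moves to the front. The reordered matrix could then be passed to a classifier such as an MLP, optionally keeping only the leading columns.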