Using a Model of the Cochlea Based in the Micro and Macro Mechanical to Find Parameters for Automatic Speech Recognition

J. Rodríguez, Jose Francisco Reyes Saldana
Published in: 2013 12th Mexican International Conference on Artificial Intelligence
Publication date: 2013-11-24
DOI: 10.1109/MICAI.2013.39 (https://doi.org/10.1109/MICAI.2013.39)
Citations: 1

Abstract

Parametric representations based on cochlear behavior have recently been used in several studies on Automatic Speech Recognition (ASR), because the cochlea is the principal organ of mammalian hearing responsible for transducing the sound pressure received by the ear. In this paper we show how a macro- and micro-mechanical model of the cochlea can be used in ASR tasks. We use the values that Neely found in his work on the macro- and micro-mechanical model to set the center frequencies of a filter bank, and we extract parameters from the speech signal in a manner similar to the construction of MFCCs. MFCC parameters use the Mel scale to build a filter bank in which the center frequency of each filter is a function of that scale. Our approach constructs the filter bank differently: we use the response of Neely's model to set the center frequencies, replacing the Mel scale function with another representation based on place theory, and we use the resulting filter distribution to obtain a new frequency-domain representation of speech. We reach a performance of 98.5% on a task that uses isolated digits pronounced by 5 different speakers. Neely's model was chosen because, when cochlear parameters such as mass, damping, and stiffness, among others, are substituted into the model, the response obtained is closer to what von Békésy proposed in his preliminary work on the principal function of the cochlea.
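The filter-bank construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the triangular filter shape, the 8 kHz sample rate, the 256-point FFT, and the 20 filters are all assumptions chosen for illustration. The key point is that `triangular_filterbank` accepts any sorted list of frequencies, so the Mel-spaced points used here as a baseline could be swapped for center frequencies derived from a cochlear model.

```python
import math

def hz_to_mel(f_hz):
    """Common Mel-scale formula (one of several variants in use)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_points(n_filters, f_min, f_max):
    """Return n_filters + 2 frequencies (Hz) equally spaced on the Mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_filters + 2)]

def triangular_filterbank(points_hz, n_fft, sample_rate):
    """Build triangular filters from a sorted list of frequencies.

    points_hz[i-1], points_hz[i], points_hz[i+1] are the lower edge,
    center, and upper edge of filter i. Any spacing can be supplied,
    e.g. centers taken from a cochlear model instead of the Mel scale.
    """
    n_bins = n_fft // 2 + 1
    bin_hz = sample_rate / n_fft
    bank = []
    for i in range(1, len(points_hz) - 1):
        left, center, right = points_hz[i - 1], points_hz[i], points_hz[i + 1]
        row = []
        for k in range(n_bins):
            f = k * bin_hz
            if left < f <= center:
                w = (f - left) / (center - left)    # rising slope
            elif center < f < right:
                w = (right - f) / (right - center)  # falling slope
            else:
                w = 0.0
            row.append(w)
        bank.append(row)
    return bank

def log_filterbank_energies(power_spectrum, bank, floor=1e-10):
    """Log of the spectrum energy captured by each filter."""
    return [
        math.log(max(sum(w * p for w, p in zip(row, power_spectrum)), floor))
        for row in bank
    ]

def dct_ii(values, n_coeffs):
    """Unnormalized DCT-II, the transform used to obtain cepstral coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
            for j, v in enumerate(values))
        for k in range(n_coeffs)
    ]

# Baseline: Mel-spaced filters (all numeric settings are illustrative).
points = mel_points(n_filters=20, f_min=0.0, f_max=4000.0)
bank = triangular_filterbank(points, n_fft=256, sample_rate=8000)

# A flat power spectrum stands in for one frame of real speech.
frame_power = [1.0] * (256 // 2 + 1)
coeffs = dct_ii(log_filterbank_energies(frame_power, bank), n_coeffs=12)
```

To follow the substitution the abstract describes, only the `points` list would change: the rest of the pipeline (filter weighting, log energies, DCT) stays the same as in the MFCC-style baseline.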