Authors: J. Rodríguez, Jose Francisco Reyes Saldana
Published in: 2013 12th Mexican International Conference on Artificial Intelligence, 2013-11-24
DOI: 10.1109/MICAI.2013.39 (https://doi.org/10.1109/MICAI.2013.39)
Using a Model of the Cochlea Based in the Micro and Macro Mechanical to Find Parameters for Automatic Speech Recognition
Parametric representations based on the behavior of the cochlea have recently been used in several studies on Automatic Speech Recognition (ASR), because this organ of the mammalian hearing system is the principal element that transduces the sound pressure received by the ear. In this paper we show how the macro- and micro-mechanical model of the cochlea can be used in ASR tasks. We use the values that Neely obtained in his work on the macro- and micro-mechanical model, as it is named, to set the central frequencies of a filter bank, and we extract parameters from speech in a manner similar to the way MFCCs are constructed. Our approach proposes a new way of constructing the filter bank in the parametric representation, and we use the resulting distribution of filters to obtain a new representation of speech in the frequency domain. MFCC parameters use the Mel scale to build a filter bank in which the central frequency of each filter is a function of that scale. We instead use the response of Neely's model to set the central frequencies, replacing the Mel scale function with another representation. Using place theory, we reach a performance of 98.5% on a task with isolated digits pronounced by 5 different speakers. Neely's model was chosen because, when a set of cochlear parameters such as mass, damping, and stiffness, among others, is substituted into the model, the response obtained is closer to what von Békésy proposed in his preliminary work on the operating principle of the cochlea.
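The abstract does not list the central frequencies derived from Neely's model, so the following Python sketch only illustrates the general idea of swapping the frequency map behind an MFCC-style filter bank. It assumes triangular filters and, as a stand-in for the model response, Greenwood's place-to-frequency function for the human cochlea; this is not the authors' method, and all function names here are hypothetical.

```python
import math

def hz_to_mel(f):
    # Common Mel-scale formula used for MFCC filter banks.
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def greenwood_hz(x):
    # ASSUMPTION: Greenwood place-to-frequency map (human constants),
    # x = relative position along the basilar membrane measured from the apex.
    # Stand-in only; the paper uses the response of Neely's model instead.
    return 165.4 * (10.0 ** (2.1 * x) - 0.88)

def greenwood_place(f):
    # Inverse of greenwood_hz.
    return math.log10(f / 165.4 + 0.88) / 2.1

def spaced_points(n_filters, f_min, f_max, to_scale, from_scale):
    # n_filters + 2 frequencies, equally spaced on the chosen scale.
    lo, hi = to_scale(f_min), to_scale(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [from_scale(lo + i * step) for i in range(n_filters + 2)]

def mel_centers(n_filters, f_min, f_max):
    return spaced_points(n_filters, f_min, f_max, hz_to_mel, mel_to_hz)

def place_centers(n_filters, f_min, f_max):
    return spaced_points(n_filters, f_min, f_max, greenwood_place, greenwood_hz)

def triangular_filterbank(points_hz, n_fft, sample_rate):
    # One triangular filter per (left, center, right) triple of frequencies,
    # sampled at the FFT bin frequencies 0 .. sample_rate / 2.
    n_bins = n_fft // 2 + 1
    bin_hz = sample_rate / n_fft
    bank = []
    for left, center, right in zip(points_hz, points_hz[1:], points_hz[2:]):
        row = []
        for k in range(n_bins):
            f = k * bin_hz
            if left < f <= center:
                row.append((f - left) / (center - left))
            elif center < f < right:
                row.append((right - f) / (right - center))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def log_filterbank_energies(power_spectrum, bank, floor=1e-10):
    # Log of the filter-weighted spectral energy, floored to avoid log(0).
    return [math.log(max(sum(w * p for w, p in zip(row, power_spectrum)), floor))
            for row in bank]

def dct_ii(values, n_coeffs):
    # Unnormalized DCT-II, the transform typically applied in MFCC extraction.
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]
```

Passing the output of `place_centers` instead of `mel_centers` to `triangular_filterbank` changes only where the filters sit on the frequency axis; the log-energy and DCT steps stay the same, which is the sense in which the parameters are obtained "in a manner similar to" MFCCs. Frequencies taken from Neely's model could be supplied directly as `points_hz`.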