Artificial basilar membrane/hair cell integrated acoustic system for keyword spotting in noisy environments inspired by human cochlea
Journal: Measurement (IF 5.2, JCR Q1, Engineering, Multidisciplinary)
Published: 2024-09-12
DOI: 10.1016/j.measurement.2024.115722
URL: https://www.sciencedirect.com/science/article/pii/S0263224124016075
Citations: 0
Abstract
We report a novel speech recognition method using a noise-robust acoustic sensor system that integrates a spatially frequency-separating sensor with a nonlinear amplification algorithm, mimicking the cochlea's basilar membrane and hair cells. The multichannel piezoelectric artificial basilar membrane (ABM) sensor detects specific sound frequencies with high sensitivity over 0.2–6 kHz. The Artificial Hair Cell signal processing model, inspired by the signal transduction mechanism of inner hair cells, simultaneously enhances the frequency selectivity of the ABM sensor and improves noise robustness. In a noisy environment with a 0 dB SNR, it effectively detected the voice signal with a maximum SNR of 57 dB. Furthermore, we converted the frequency-separated signals for speech sounds in various noisy environments into heatmap images and used them as input for a CNN-based speech recognition algorithm. As a result, our system achieved 94% recognition accuracy, even in noisy environments.
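Two quantities in the abstract can be made concrete with a small sketch: what a "0 dB SNR" mixture means, and how frequency-separated channel signals might be turned into a heatmap for a CNN. The code below is only an illustration under stated assumptions, not the authors' pipeline: the function names, the 16 kHz sample rate, the 400-sample frame length, and the four synthetic tones standing in for ABM channel outputs are all hypothetical, and the Artificial Hair Cell amplification step is not modelled.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(power(signal) / power(noise))


def mix_at_snr(signal, noise, target_db):
    """Scale the noise so the mixture has the requested SNR, then add it.

    Returns the mixture and the scaled noise, so the SNR can be checked.
    """
    scale = math.sqrt(power(signal) / (power(noise) * 10.0 ** (target_db / 10.0)))
    scaled_noise = [scale * n for n in noise]
    mixed = [s + n for s, n in zip(signal, scaled_noise)]
    return mixed, scaled_noise


def channels_to_heatmap(channels, frame_len):
    """Convert per-channel waveforms into a (channel x frame) grid.

    Each cell is the RMS of one frame, normalised by the global peak
    so that all values lie in [0, 1].
    """
    rows = []
    for ch in channels:
        n_frames = len(ch) // frame_len
        rows.append([
            math.sqrt(power(ch[i * frame_len:(i + 1) * frame_len]))
            for i in range(n_frames)
        ])
    peak = max(max(r) for r in rows)
    if peak == 0:
        return rows
    return [[v / peak for v in r] for r in rows]


# Assumed sample rate and a one-second time axis (hypothetical values).
fs = 16000
t = [i / fs for i in range(fs)]
rng = random.Random(0)

# A 1 kHz tone stands in for speech; Gaussian noise stands in for the environment.
tone = [math.sin(2 * math.pi * 1000 * ti) for ti in t]
noise = [rng.gauss(0.0, 1.0) for _ in t]
noisy, scaled_noise = mix_at_snr(tone, noise, 0.0)

# Synthetic stand-ins for four channel outputs inside the 0.2-6 kHz band.
channels = [
    [amp * math.sin(2 * math.pi * freq * ti) for ti in t]
    for freq, amp in ((200, 1.0), (1000, 0.5), (3000, 0.25), (6000, 0.125))
]
heatmap = channels_to_heatmap(channels, frame_len=400)
```

At 0 dB the scaled noise carries the same average power as the signal, which is why that condition is a demanding test case. The resulting `heatmap` is a plain list of rows (one per channel) that could be rendered as an image or passed to a classifier after conversion to an array.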
About the journal:
Contributions are invited on novel achievements in all fields of measurement and instrumentation science and technology. Authors are encouraged to submit novel material, whose ultimate goal is an advancement in the state of the art of: measurement and metrology fundamentals, sensors, measurement instruments, measurement and estimation techniques, measurement data processing and fusion algorithms, evaluation procedures and methodologies for plants and industrial processes, performance analysis of systems, processes and algorithms, mathematical models for measurement-oriented purposes, distributed measurement systems in a connected world.