Vector-Quantized Zero-Delay Deep Autoencoders for the Compression of Electrical Stimulation Patterns of Cochlear Implants using STOI

Reemt Hinrichs, Felix Ortmann, Jörn Ostermann
Published in: 2022 IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBES)
Publication date: 2022-12-07
DOI: 10.1109/IECBES54088.2022.10079466 (https://doi.org/10.1109/IECBES54088.2022.10079466)
Citations: 1

Abstract

Cochlear implants (CIs) are battery-powered, surgically implanted hearing aids capable of restoring a sense of hearing in people suffering from moderate to profound hearing loss. Wireless transmission of audio to or from the signal processors of cochlear implants can be used to improve the speech understanding and sound localization of CI users. Data compression algorithms can be used to conserve battery power in this wireless transmission. However, very low latency is a strict requirement, which severely limits the available source coding algorithms. Previously, coding of the electrical stimulation patterns of CIs, instead of the audio, was proposed to optimize the trade-off between bit rate, latency and quality. In this work, a zero-delay deep autoencoder (DAE) for coding the electrical stimulation patterns of CIs is proposed. By combining, for the first time, Bayesian optimization with numerically approximated gradients of a non-differentiable speech intelligibility measure for CIs, the short-time objective intelligibility measure (STOI), an optimized DAE architecture was found and trained that achieved equal or superior speech understanding at zero delay, outperforming well-known audio codecs. The DAE achieved reference vocoder STOI scores at 13.5 kbit/s, compared to 33.6 kbit/s for Opus and 24.5 kbit/s for AMR-WB.
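The abstract mentions numerically approximated gradients of a measure that cannot be differentiated analytically. The page does not describe how the authors compute them, so the following is only a generic sketch of one common approach, central finite differences on a black-box score. The function names, the step size `eps`, and the toy score (which is a stand-in and not STOI) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def finite_difference_gradient(
    score: Callable[[Sequence[float]], float],
    x: Sequence[float],
    eps: float = 1e-3,
) -> List[float]:
    """Estimate d(score)/dx by central differences, treating score as a black box."""
    grad = []
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += eps   # perturb one coordinate upwards
        minus[i] -= eps  # and downwards
        grad.append((score(plus) - score(minus)) / (2.0 * eps))
    return grad


# Hypothetical stand-in for an intelligibility score (NOT STOI):
# negative squared error between a decoded frame and a reference frame.
REFERENCE = [0.2, 0.5, 0.1]


def toy_score(decoded: Sequence[float]) -> float:
    return -sum((d - r) ** 2 for d, r in zip(decoded, REFERENCE))


decoded = [0.0, 0.0, 0.0]
grad = finite_difference_gradient(toy_score, decoded)
print(grad)
```

For this toy score the analytic gradient at the origin is twice the reference vector, which gives a simple way to check the estimate. In a training setup, an estimate of this kind could in principle be passed back through the decoder by ordinary backpropagation, though whether the paper does so is not stated here.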