Vector-Quantized Zero-Delay Deep Autoencoders for the Compression of Electrical Stimulation Patterns of Cochlear Implants using STOI

2022 IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBES), Pub Date: 2022-12-07, DOI: 10.1109/IECBES54088.2022.10079466

Reemt Hinrichs, Felix Ortmann, Jörn Ostermann


Citations: 1

Abstract

Cochlear implants (CIs) are battery-powered, surgically implanted hearing aids capable of restoring a sense of hearing in people suffering from moderate to profound hearing loss. Wireless transmission of audio from or to the signal processors of cochlear implants can be used to improve speech understanding and localization of CI users. Data compression algorithms can be used to conserve battery power in this wireless transmission. However, very low latency is a strict requirement, severely limiting the available source coding algorithms. Previously, instead of coding the audio, coding of the electrical stimulation patterns of CIs was proposed to optimize the trade-off between bit rate, latency, and quality. In this work, a zero-delay deep autoencoder (DAE) for the coding of the electrical stimulation patterns of CIs is proposed. Combining, for the first time, Bayesian optimization with numerically approximated gradients of a non-differentiable speech intelligibility measure for CIs, the short-time objective intelligibility measure (STOI), an optimized DAE architecture was found and trained that achieved equal or superior speech understanding at zero delay, outperforming well-known audio codecs. The DAE achieved reference vocoder STOI scores at 13.5 kbit/s, compared to 33.6 kbit/s for Opus and 24.5 kbit/s for AMR-WB.
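The abstract does not say how the gradients of the non-differentiable measure were approximated numerically. As a minimal sketch of the general idea, assuming central finite differences, the following treats a score function as a black box and estimates its gradient by perturbing one input at a time. The names `finite_difference_gradient` and `toy_score` are hypothetical, and `toy_score` is a simple quadratic stand-in, not STOI.

```python
from typing import Callable, List


def finite_difference_gradient(
    metric: Callable[[List[float]], float],
    x: List[float],
    eps: float = 1e-3,
) -> List[float]:
    """Estimate the gradient of a black-box metric with central differences.

    Assumption: central differences are one possible scheme; the source
    does not state which approximation the authors used.
    """
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps   # perturb coordinate i upwards
        x_minus[i] -= eps  # perturb coordinate i downwards
        grad.append((metric(x_plus) - metric(x_minus)) / (2.0 * eps))
    return grad


def toy_score(decoded: List[float]) -> float:
    """Hypothetical stand-in for an intelligibility score (NOT STOI).

    The score is highest (zero) when `decoded` equals a fixed target.
    """
    target = [0.2, 0.5, 0.8]
    return -sum((d - t) ** 2 for d, t in zip(decoded, target))


if __name__ == "__main__":
    # Gradient ascent on the black-box score using only function evaluations.
    decoded = [0.0, 0.0, 0.0]
    for _ in range(200):
        grad = finite_difference_gradient(toy_score, decoded)
        decoded = [d + 0.1 * g for d, g in zip(decoded, grad)]
    print([round(d, 3) for d in decoded])
```

Each gradient estimate costs two metric evaluations per input dimension, so in practice such a scheme is only affordable when the perturbed quantity is low-dimensional or the evaluations are batched.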



