Double-Talk Detection-Aided Residual Echo Suppression via Spectrogram Masking and Refinement

Acoustics (Basel, Switzerland) | IF 1.3 | JCR Q3 (Acoustics) | Pub Date: 2022-08-25 | DOI: 10.3390/acoustics4030039
Eran Shachar, I. Cohen, B. Berdugo
Citations: 0

Abstract

Double-Talk Detection-Aided Residual Echo Suppression via Spectrogram Masking and Refinement
Acoustic echo in full-duplex telecommunication systems is a common problem that may cause desired-speech quality degradation during double-talk periods. This problem is especially challenging in low signal-to-echo ratio (SER) scenarios, such as hands-free conversations over mobile phones when the loudspeaker volume is high. This paper proposes a two-stage deep-learning approach to residual echo suppression focused on the low SER scenario. The first stage consists of a speech spectrogram masking model integrated with a double-talk detector (DTD). The second stage consists of a spectrogram refinement model optimized for speech quality by minimizing a perceptual evaluation of speech quality (PESQ) related loss function. The proposed integration of DTD with the masking model outperforms several other configurations based on previous studies. We conduct an ablation study that shows the contribution of each part of the proposed system. We evaluate the proposed system in several SERs and demonstrate its efficiency in the challenging setting of a very low SER. Finally, the proposed approach outperforms competing methods in several residual echo suppression metrics. We conclude that the proposed system is well-suited for the task of low SER residual echo suppression.
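To make two of the quantities in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the authors' model: the paper's masking and double-talk detection are learned networks, whereas this example only shows how a signal-to-echo ratio in dB can be computed and how a mask in [0, 1] might scale a magnitude spectrogram. The function names `ser_db` and `apply_mask`, and the per-frame flag `near_end_active` standing in for a detector decision, are assumptions made for illustration.

```python
import numpy as np


def ser_db(near_end, echo):
    """Signal-to-echo ratio in dB: near-end speech energy over echo energy."""
    near_end = np.asarray(near_end, dtype=float)
    echo = np.asarray(echo, dtype=float)
    return 10.0 * np.log10(np.sum(near_end ** 2) / np.sum(echo ** 2))


def apply_mask(magnitude, mask, near_end_active):
    """Scale a magnitude spectrogram (freq x frames) by a mask in [0, 1].

    near_end_active is a hypothetical per-frame flag (True when near-end
    speech is present). Frames flagged False are zeroed. This hard gating
    is an illustrative assumption, not the integration used in the paper.
    """
    magnitude = np.asarray(magnitude, dtype=float)
    mask = np.clip(np.asarray(mask, dtype=float), 0.0, 1.0)
    gate = np.asarray(near_end_active, dtype=float)[np.newaxis, :]
    return magnitude * mask * gate


# Echo with 10x the amplitude of the near-end signal: a low-SER case.
speech = np.ones(4)
echo = 10.0 * np.ones(4)
print(ser_db(speech, echo))  # about -20 dB

# Two frequency bins, three frames; the middle frame has no near-end speech.
spec = np.ones((2, 3))
masked = apply_mask(spec, np.full((2, 3), 0.5), [True, False, True])
print(masked)
```

An echo ten times larger in amplitude than the desired speech corresponds to an SER of about -20 dB, which illustrates why the low-SER setting is hard: the signal to be preserved carries a small fraction of the total energy.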
Source journal
CiteScore: 3.70
Self-citation rate: 0.00%
Articles published: 0
Review time: 11 weeks