Integrating Data Priors to Weighted Prediction Error for Speech Dereverberation

Impact factor: 4.1 | Zone 2, Computer Science | JCR Q1, ACOUSTICS | IEEE/ACM Transactions on Audio, Speech, and Language Processing | Pub Date: 2024-08-19 | DOI: 10.1109/TASLP.2024.3440003
Ziye Yang;Wenxing Yang;Kai Xie;Jie Chen
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 3908-3923. Journal article. Full text: https://ieeexplore.ieee.org/document/10638806/
Citations: 0

Abstract

Speech dereverberation aims to alleviate the detrimental effects of late-reverberant components. While the weighted prediction error (WPE) method has shown superior performance in dereverberation, there is still room for further improvement in terms of performance and robustness in complex and noisy environments. Recent research has highlighted the effectiveness of integrating physics-based and data-driven methods, enhancing the performance of various signal processing tasks while maintaining interpretability. Motivated by these advancements, this paper presents a novel dereverberation framework for the single-source case, which incorporates data-driven methods for capturing speech priors within the WPE framework. The plug-and-play (PnP) framework, specifically the regularization by denoising (RED) strategy, is utilized to incorporate speech prior information learnt from data during the optimization problem solving iterations. Experimental results validate the effectiveness of the proposed approach.
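
The abstract names regularization by denoising (RED) as the mechanism that brings a prior into the iterations. The sketch below illustrates only the generic RED gradient step on a toy problem. It is not the paper's algorithm: the data term is a plain quadratic rather than the WPE objective, the denoiser is a moving average rather than a model learnt from speech data, and every function name and parameter value (`red_gradient_descent`, `lam`, `step`, `width`) is a hypothetical choice made for illustration.

```python
import numpy as np


def moving_average_denoiser(x, width=5):
    """Stand-in denoiser: a symmetric moving average (not a learnt speech prior)."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")


def red_gradient_descent(y, denoiser, lam=2.0, step=0.2, n_iter=200):
    """Minimise 0.5*||x - y||^2 + (lam/2) * x^T (x - D(x)) by gradient descent.

    Under the usual RED assumptions on the denoiser D, the gradient of the
    regulariser is lam * (x - D(x)), i.e. the scaled denoising residual.
    """
    x = y.copy()
    for _ in range(n_iter):
        data_grad = x - y                      # gradient of the toy data term
        prior_grad = lam * (x - denoiser(x))   # RED gradient: denoising residual
        x = x - step * (data_grad + prior_grad)
    return x


def mse(a, b):
    """Mean squared error between two equally shaped arrays."""
    return float(np.mean((a - b) ** 2))


# Toy demo: a smooth signal corrupted by white noise (assumed setup).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
clean = np.sin(2 * np.pi * t)
noisy = clean + 0.3 * rng.standard_normal(t.size)
estimate = red_gradient_descent(noisy, moving_average_denoiser)

print("noisy MSE:       ", mse(noisy, clean))
print("RED estimate MSE:", mse(estimate, clean))
```

In a physics-plus-data setting of the kind the abstract describes, the quadratic term would be replaced by the model-based objective and the moving average by a trained denoiser; the structure of the update (data gradient plus denoising residual) is the part this sketch is meant to show.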
Source journal
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Categories: ACOUSTICS; ENGINEERING, ELECTRICAL & ELECTRONIC
CiteScore: 11.30
Self-citation rate: 11.10%
Publication volume: 217
Journal description: The IEEE/ACM Transactions on Audio, Speech, and Language Processing covers audio, speech and language processing and the sciences that support them. In audio processing: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. In speech processing: areas such as speech analysis, synthesis, coding, speech and speaker recognition, speech production and perception, and speech enhancement. In language processing: speech and text analysis, understanding, generation, dialog management, translation, summarization, question answering and document indexing and retrieval, as well as general language modeling.
Latest articles in this journal
CLAPSep: Leveraging Contrastive Pre-Trained Model for Multi-Modal Query-Conditioned Target Sound Extraction
Enhancing Robustness of Speech Watermarking Using a Transformer-Based Framework Exploiting Acoustic Features
FTDKD: Frequency-Time Domain Knowledge Distillation for Low-Quality Compressed Audio Deepfake Detection
ELSF: Entity-Level Slot Filling Framework for Joint Multiple Intent Detection and Slot Filling
Proper Error Estimation and Calibration for Attention-Based Encoder-Decoder Models