Room impulse response reshaping-based expectation–maximization in an underdetermined reverberant environment

Computer Speech and Language (IF 3.1; Zone 3, Computer Science; JCR Q2, Computer Science, Artificial Intelligence). Pub Date: 2024-05-14. DOI: 10.1016/j.csl.2024.101664
Yuan Xie, Tao Zou, Junjie Yang, Weijun Sun, Shengli Xie
Computer Speech and Language, vol. 88, Article 101664. Journal article; not open access. https://www.sciencedirect.com/science/article/pii/S0885230824000470
Citations: 0

Abstract

Source separation in an underdetermined reverberant environment is a challenging problem. The classical approach is based on the expectation–maximization (EM) algorithm. However, its performance is limited in highly reverberant environments, where separation becomes poor or fails altogether. To remove this limitation, a room impulse response (RIR) reshaping-based EM method is designed for source separation in an underdetermined reverberant environment. First, an RIR reshaping technique is designed to suppress the influence of audible echoes in the reverberant environment, improving the quality of the received signals. Then, a new mathematical model of the time-frequency mixing signals is established to reduce the approximation error of the model transformation caused by high reverberation. Furthermore, an improved EM method is proposed to update the learning rules of the model parameters in real time, and the sources are then separated using the estimators it provides. Experimental results on speech and music mixtures show that the proposed algorithm achieves better separation performance and much better robustness than popular EM methods.
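
The abstract does not give the model or the update equations, so the following is only a rough illustration of the kind of separation step that classical EM-based methods typically end with, not the paper's algorithm. It assumes the widely used local Gaussian model, in which each source image at one time-frequency bin is zero-mean complex Gaussian with covariance v_j R_j, and it recovers the source images with a multichannel Wiener filter. The function name `wiener_separate` and the variables `v` and `R` are hypothetical, and RIR reshaping and the paper's new mixing model are not represented.

```python
import numpy as np

def wiener_separate(x, v, R):
    """Multichannel Wiener estimate of source images at one time-frequency bin.

    Assumed (hypothetical) inputs, not taken from the paper:
      x: (I,) complex mixture observed at I microphones
      v: (J,) nonnegative source variances
      R: (J, I, I) Hermitian positive-definite spatial covariances
    Returns a (J, I) array holding the estimated image of each source.
    """
    # Covariance of each source image: v_j * R_j
    sigma_c = v[:, None, None] * R            # (J, I, I)
    # Mixture covariance is the sum over sources
    sigma_x = sigma_c.sum(axis=0)             # (I, I)
    # Wiener gain for each source: sigma_c_j @ inv(sigma_x)
    gains = sigma_c @ np.linalg.inv(sigma_x)  # (J, I, I)
    # Apply each gain matrix to the mixture vector
    return gains @ x                          # (J, I)

# Underdetermined toy case: 3 sources, 2 microphones, random parameters
rng = np.random.default_rng(0)
I, J = 2, 3
A = rng.standard_normal((J, I, I)) + 1j * rng.standard_normal((J, I, I))
R = A @ A.conj().transpose(0, 2, 1) + 1e-3 * np.eye(I)
v = rng.uniform(0.5, 2.0, size=J)
x = rng.standard_normal(I) + 1j * rng.standard_normal(I)
c_hat = wiener_separate(x, v, R)
```

Because the gains sum to the identity matrix, the estimated source images add back up to the mixture, which is a quick sanity check on any implementation of this step. In an EM loop, `v` and `R` would be the parameters re-estimated at each iteration.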

Source journal
Computer Speech and Language
Category: Engineering & Technology, Computer Science: Artificial Intelligence
CiteScore: 11.30
Self-citation rate: 4.70%
Articles published: 80
Review time: 22.9 weeks
Journal description: Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language. The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.