Cascade algorithms for combined acoustic feedback cancelation and noise reduction

IF 2.4, Zone 3 (Computer Science), Journal on Audio Speech and Music Processing, Pub Date: 2023-09-21, DOI: 10.1186/s13636-023-00296-5

Santiago Ruiz, Toon van Waterschoot, Marc Moonen


Citations: 0


Cascade algorithms for combined acoustic feedback cancelation and noise reduction

Abstract

This paper presents three cascade algorithms for combined acoustic feedback cancelation (AFC) and noise reduction (NR) in speech applications. A prediction error method (PEM)-based adaptive feedback cancelation (PEM-based AFC) algorithm is used for the AFC stage, while a multichannel Wiener filter (MWF) is applied for the NR stage. A scenario with $M$ microphones and one loudspeaker is considered, without loss of generality. The first algorithm is the baseline algorithm, namely the cascade $M$-channel rank-1 MWF and PEM-AFC, where an NR stage is performed first using a rank-1 MWF, followed by a single-channel AFC stage using a PEM-based AFC algorithm. The second algorithm is the cascade $(M+1)$-channel rank-2 MWF and PEM-AFC, where again an NR stage is applied first, followed by a single-channel AFC stage. The novelty of this algorithm is to consider an $(M+1)$-channel data model in the MWF formulation with two different desired signals, i.e., the speech component in the reference microphone signal and the speech component in the loudspeaker signal, both defined by the speech source signal but not equal to each other. The two desired signal estimates are later used in a single-channel PEM-based AFC stage. The third algorithm is the cascade $M$-channel PEM-AFC and rank-1 MWF, where an $M$-channel AFC stage is performed first, followed by an $M$-channel NR stage. In cascade algorithms where NR is performed before AFC, the estimation of the feedback path is usually affected by the NR stage; it is shown here that performing a rank-2 approximation of the speech correlation matrix avoids this issue, so that the feedback path can be correctly estimated. The performance of the algorithms is assessed by means of closed-loop simulations. These show that, for the considered input signal-to-noise ratios (iSNRs), the cascade $(M+1)$-channel rank-2 MWF and PEM-AFC and the cascade $M$-channel PEM-AFC and rank-1 MWF algorithms outperform the cascade $M$-channel rank-1 MWF and PEM-AFC algorithm in terms of the added stable gain (ASG) and misadjustment (Mis), as well as in terms of perceptual metrics such as the short-time objective intelligibility (STOI), perceptual evaluation of speech quality (PESQ), and signal distortion (SD).
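The abstract names the added stable gain (ASG) and the misadjustment (Mis) as feedback-cancelation metrics but does not define them. The sketch below illustrates how such metrics are commonly computed in the AFC literature; it is not taken from the paper. It assumes a single feedback path modeled as an FIR filter, a forward path with flat magnitude response, and a maximum taken over all frequencies rather than only those satisfying the phase condition. All function names and the example coefficients are hypothetical.

```python
import cmath
import math


def misadjustment_db(f_true, f_est):
    """Normalized distance between true and estimated feedback path, in dB.

    Assumed definition: 20*log10(||f_true - f_est|| / ||f_true||).
    Undefined (log of zero) if the estimate is exact.
    """
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(f_true, f_est)))
    ref = math.sqrt(sum(a ** 2 for a in f_true))
    return 20.0 * math.log10(err / ref)


def freq_response(h, n_bins=512):
    """Frequency response of FIR filter h on n_bins points in [0, pi]."""
    response = []
    for k in range(n_bins):
        omega = math.pi * k / (n_bins - 1)
        response.append(
            sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(h))
        )
    return response


def max_stable_gain_db(f_true, f_est, n_bins=512):
    """Maximum stable gain, assuming a flat forward-path magnitude.

    Assumed definition: -20*log10(max over omega of |F - F_est|).
    """
    residual = [a - b for a, b in zip(f_true, f_est)]
    peak = max(abs(v) for v in freq_response(residual, n_bins))
    return -20.0 * math.log10(peak)


def added_stable_gain_db(f_true, f_est, n_bins=512):
    """Gain margin added by the canceler relative to no cancelation."""
    without_afc = max_stable_gain_db(f_true, [0.0] * len(f_true), n_bins)
    with_afc = max_stable_gain_db(f_true, f_est, n_bins)
    return with_afc - without_afc


# Hypothetical example: the estimate captures 90% of every tap, so the
# residual path is the true path scaled by 0.1.
f_true = [0.0, 0.5, -0.25, 0.1]
f_est = [0.9 * c for c in f_true]
mis = misadjustment_db(f_true, f_est)
asg = added_stable_gain_db(f_true, f_est)
```

In this toy case the residual feedback path is the true path scaled by 0.1, so the misadjustment is about -20 dB and the ASG about +20 dB. In a closed-loop simulation these quantities would typically be tracked over time as the adaptive filter converges.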


Source journal

Journal on Audio Speech and Music Processing (Engineering: Electrical and Electronic Engineering)

CiteScore: 4.10

Self-citation rate: 4.20%

Journal description: The aim of “EURASIP Journal on Audio, Speech, and Music Processing” is to bring together researchers, scientists and engineers working on the theory and applications of the processing of various audio signals, with a specific focus on speech and music. EURASIP Journal on Audio, Speech, and Music Processing will be an interdisciplinary journal for the dissemination of all basic and applied aspects of speech communication and audio processes.