Exploiting Non-Negative Matrix Factorization for Binaural Sound Localization in the Presence of Directional Interference

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Pub Date: 2021-06-06
DOI: 10.1109/ICASSP39728.2021.9414233

Ingvi Örnolfsson, T. Dau, Ning Ma, T. May


Citations: 3

Abstract

This study presents a novel solution to the problem of binaural localization of a speaker in the presence of interfering directional noise and reverberation. Using a state-of-the-art binaural localization algorithm based on a deep neural network (DNN), we propose adding a source separation stage based on non-negative matrix factorization (NMF) to improve the localization performance in conditions with interfering sources. The separation stage is coupled with the localization stage and is optimized with respect to a broad range of different acoustic conditions, emphasizing a robust and generalizable solution. The machine listening system is shown to greatly benefit from the NMF-based separation stage at low target-to-masker ratios (TMRs) for a variety of noise types, especially for non-stationary noise. It is also demonstrated that training the NMF algorithm on anechoic speech provides better performance than using reverberant speech, and that optimizing the source separation stage using a localization metric rather than a source separation metric substantially increases the system performance.
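The abstract does not give the details of the NMF separation stage, so the following is only a minimal sketch of one common way such a stage can be built, not the authors' implementation. It assumes magnitude spectrograms as input, Euclidean-cost multiplicative updates, a speech dictionary learned beforehand (for instance from anechoic speech, as the abstract suggests), noise atoms learned from the mixture itself, and a soft ratio mask as output. The function names `learn_dictionary` and `separate_speech`, the dictionary sizes, and the iteration counts are all hypothetical choices made for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero in the updates


def learn_dictionary(V, rank, n_iter=200, seed=0):
    """Learn a non-negative dictionary W such that V is approximately W @ H.

    V is a non-negative (n_freq, n_frames) magnitude spectrogram.
    Uses the standard multiplicative updates for the Euclidean cost.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W


def separate_speech(V_mix, W_speech, n_noise=8, n_iter=200, seed=0):
    """Estimate the speech part of a mixture spectrogram V_mix.

    The speech atoms in W_speech stay fixed; n_noise additional atoms
    (an assumed, illustrative number) are adapted to the mixture.
    Returns the masked spectrogram and the soft mask.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V_mix.shape
    n_speech = W_speech.shape[1]
    W = np.hstack([W_speech, rng.random((n_freq, n_noise)) + EPS])
    H = rng.random((n_speech + n_noise, n_frames)) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V_mix) / (W.T @ W @ H + EPS)
        # Update every column, then copy back only the noise columns
        # so that the pre-trained speech atoms remain unchanged.
        W_new = W * (V_mix @ H.T) / (W @ H @ H.T + EPS)
        W[:, n_speech:] = W_new[:, n_speech:]
    speech_part = W[:, :n_speech] @ H[:n_speech]
    mask = speech_part / (W @ H + EPS)  # soft mask with values in [0, 1]
    return mask * V_mix, mask
```

In a binaural system, a mask of this kind would presumably be applied to the left- and right-ear signals before they reach the DNN localizer; how the paper couples the two stages, and which cost function and dictionary sizes it uses, cannot be determined from the abstract alone.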


