Exploiting Non-Negative Matrix Factorization for Binaural Sound Localization in the Presence of Directional Interference

Ingvi Örnolfsson, T. Dau, Ning Ma, T. May
{"title":"Exploiting Non-Negative Matrix Factorization for Binaural Sound Localization in the Presence of Directional Interference","authors":"Ingvi Örnolfsson, T. Dau, Ning Ma, T. May","doi":"10.1109/ICASSP39728.2021.9414233","DOIUrl":null,"url":null,"abstract":"This study presents a novel solution to the problem of binaural localization of a speaker in the presence of interfering directional noise and reverberation. Using a state-of-the-art binaural localization algorithm based on a deep neural network (DNN), we propose adding a source separation stage based on non-negative matrix factorization (NMF) to improve the localization performance in conditions with interfering sources. The separation stage is coupled with the localization stage and is optimized with respect to a broad range of different acoustic conditions, emphasizing a robust and generalizable solution. The machine listening system is shown to greatly benefit from the NMF-based separation stage at low target-to-masker ratios (TMRs) for a variety of noise types, especially for non-stationary noise. It is also demonstrated that training the NMF algorithm on anechoic speech provides better performance than using reverberant speech, and that optimizing the source separation stage using a localization metric rather than a source separation metric substantially increases the system performance.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP39728.2021.9414233","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

This study presents a novel solution to the problem of binaural localization of a speaker in the presence of interfering directional noise and reverberation. Using a state-of-the-art binaural localization algorithm based on a deep neural network (DNN), we propose adding a source separation stage based on non-negative matrix factorization (NMF) to improve the localization performance in conditions with interfering sources. The separation stage is coupled with the localization stage and is optimized with respect to a broad range of different acoustic conditions, emphasizing a robust and generalizable solution. The machine listening system is shown to greatly benefit from the NMF-based separation stage at low target-to-masker ratios (TMRs) for a variety of noise types, especially for non-stationary noise. It is also demonstrated that training the NMF algorithm on anechoic speech provides better performance than using reverberant speech, and that optimizing the source separation stage using a localization metric rather than a source separation metric substantially increases the system performance.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
利用非负矩阵分解法进行定向干扰下双耳声音定位
本文提出了一种新的方法来解决存在方向性噪声和混响干扰的扬声器双耳定位问题。利用基于深度神经网络(DNN)的最先进的双耳定位算法,我们提出了一个基于非负矩阵分解(NMF)的源分离阶段,以提高在有干扰源条件下的定位性能。分离阶段与定位阶段相结合,并针对广泛的不同声学条件进行了优化,强调了鲁棒性和可推广的解决方案。研究表明,对于各种类型的噪声,特别是非平稳噪声,在低目标掩蔽比(TMRs)下,基于nmf的分离阶段对机器侦听系统有很大的好处。研究还表明,在无回声语音上训练NMF算法比使用混响语音提供更好的性能,并且使用定位度量而不是源分离度量来优化源分离阶段大大提高了系统性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Subspace Oddity - Optimization on Product of Stiefel Manifolds for EEG Data Recognition of Dynamic Hand Gesture Based on Mm-Wave Fmcw Radar Micro-Doppler Signatures Multi-Decoder Dprnn: Source Separation for Variable Number of Speakers Topic-Aware Dialogue Generation with Two-Hop Based Graph Attention On The Accuracy Limit of Joint Time-Delay/Doppler/Acceleration Estimation with a Band-Limited Signal
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1