Direction-of-arrival and power spectral density estimation using a single directional microphone and group-sparse optimization

IF 2.4 · Zone 3 (Computer Science) · Journal on Audio Speech and Music Processing · Pub Date: 2023-10-04 · DOI: 10.1186/s13636-023-00304-8
Elisa Tengan, Thomas Dietzen, Filip Elvander, Toon van Waterschoot

Abstract

This paper proposes two approaches for estimating the direction of arrival (DOA) and power spectral density (PSD) of stationary point sources using a single, rotating, directional microphone. Both build on a method previously presented by the authors, in which point-source DOAs are estimated by using a broadband signal model and solving a group-sparse optimization problem, where the number of observations made by the rotating directional microphone can be lower than the number of candidate DOAs in an angular grid. The DOA estimation is followed by estimation of the sources' PSDs through the solution of an overdetermined least squares problem. The first approach adds a nonnegativity constraint on the residual noise term when solving the group-sparse optimization problem and is referred to as the Group Lasso Least Squares (GL-LS) approach. The second approach, in addition to the new nonnegativity constraint, employs a narrowband signal model when building the linear system of equations used to formulate the group-sparse optimization problem, so that the DOAs and PSDs can be jointly estimated by iterative, group-wise reweighting. This is referred to as the Group-Lasso with $l_1$-reweighting (GL-L1) approach. Both approaches are implemented using the alternating direction method of multipliers (ADMM), and their performance is evaluated through simulations covering different setup conditions, ranging from different types of model mismatch to variations in the acoustic scene and microphone directivity pattern. The results show that, when there is a microphone response mismatch between the observed data and the signal model used, the additional nonnegativity constraint on the residual noise can improve the DOA estimation for GL-LS and the PSD estimation for GL-L1. Moreover, the GL-L1 approach can have an advantage over GL-LS in DOA estimation performance in scenarios with low SNR or where multiple sources are located close to each other. Finally, the least squares PSD re-estimation step is shown to be beneficial in most scenarios, such that GL-LS outperformed GL-L1 in terms of PSD estimation errors.
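
The two-stage idea behind GL-LS (group-sparse selection of candidate DOAs, then a least squares refit of the selected groups) can be illustrated with a minimal sketch. This is not the authors' implementation: a random Gaussian matrix `A` stands in for the microphone signal model, each group of unknowns stands in for one candidate DOA, and the function names, problem sizes, and the parameters `lam` and `rho` are all assumptions chosen for illustration. The nonnegativity constraint on the residual noise term and the reweighting used in GL-L1 are not included.

```python
import numpy as np


def group_soft_threshold(v, group_size, t):
    """Shrink the l2 norm of each consecutive block of `group_size` entries by t."""
    blocks = v.reshape(-1, group_size)
    norms = np.linalg.norm(blocks, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return (scale * blocks).ravel()


def group_lasso_admm(A, y, group_size, lam, rho=1.0, n_iter=1000):
    """Minimise 0.5*||y - A x||^2 + lam * sum_g ||x_g||_2 with scaled-form ADMM."""
    n = A.shape[1]
    z = np.zeros(n)
    u = np.zeros(n)
    # The x-update solves the same linear system each iteration, so factor once.
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    Aty = A.T @ y
    for _ in range(n_iter):
        rhs = Aty + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))       # quadratic step
        z = group_soft_threshold(x + u, group_size, lam / rho)  # group-sparsity step
        u = u + x - z                                           # dual update
    return z


def ls_reestimate(A, y, support, group_size):
    """Refit only the selected groups by ordinary least squares."""
    cols = np.concatenate(
        [np.arange(g * group_size, (g + 1) * group_size) for g in support]
    )
    x_hat = np.zeros(A.shape[1])
    x_hat[cols] = np.linalg.lstsq(A[:, cols], y, rcond=None)[0]
    return x_hat


def demo(seed=0):
    """Toy problem: 30 observations, 12 candidate groups of 3 unknowns each."""
    rng = np.random.default_rng(seed)
    n_groups, group_size, n_obs = 12, 3, 30
    A = rng.standard_normal((n_obs, n_groups * group_size)) / np.sqrt(n_obs)
    x_true = np.zeros(n_groups * group_size)
    for g in (2, 7):  # two active groups (hypothetical source directions)
        x_true[g * group_size:(g + 1) * group_size] = rng.uniform(1.0, 2.0, group_size)
    y = A @ x_true + 0.005 * rng.standard_normal(n_obs)

    z = group_lasso_admm(A, y, group_size, lam=0.05)
    norms = np.linalg.norm(z.reshape(n_groups, group_size), axis=1)
    # Keep groups whose norm is a sizeable fraction of the largest one.
    support = [int(g) for g in np.flatnonzero(norms > 0.2 * norms.max())]
    x_hat = ls_reestimate(A, y, support, group_size)
    return support, x_hat, x_true


if __name__ == "__main__":
    support, x_hat, x_true = demo()
    print("selected groups:", support)
    print("max abs error after LS refit:", np.max(np.abs(x_hat - x_true)))
```

In this sketch the group-lasso stage only decides which groups are active; the refit removes the shrinkage that the penalty introduces, and it is overdetermined because the selected columns are far fewer than the observations. This mirrors the role the abstract assigns to the least squares PSD re-estimation step, under the toy assumptions stated above.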
Source journal
Journal on Audio Speech and Music Processing (Engineering: Electrical and Electronic Engineering)
CiteScore: 4.10
Self-citation rate: 4.20%
Articles published: 28
About the journal: The aim of "EURASIP Journal on Audio, Speech, and Music Processing" is to bring together researchers, scientists and engineers working on the theory and applications of the processing of various audio signals, with a specific focus on speech and music. EURASIP Journal on Audio, Speech, and Music Processing will be an interdisciplinary journal for the dissemination of all basic and applied aspects of speech communication and audio processes.