Theoretical Analysis of Maclaurin Expansion Based Linear Differential Microphone Arrays and Improved Solutions

IF 5.1 2区计算机科学 Q1 ACOUSTICS IEEE/ACM Transactions on Audio, Speech, and Language Processing Pub Date : 2024-08-07 DOI:10.1109/TASLP.2024.3439994

Jinfu Wang;Feiran Yang;Xiaoqing Hu;Jun Yang

{"title":"Theoretical Analysis of Maclaurin Expansion Based Linear Differential Microphone Arrays and Improved Solutions","authors":"Jinfu Wang;Feiran Yang;Xiaoqing Hu;Jun Yang","doi":"10.1109/TASLP.2024.3439994","DOIUrl":null,"url":null,"abstract":"Linear differential microphone arrays (LDMAs) are becoming popular due to their potentially high directional gain and frequency-invariant beampattern. By increasing the number of microphones, the Maclaurin expansion-based LDMAs address the inherently poor robustness problem of the conventional LDMA at low frequencies. However, this method encounters severe beampattern distortion and the deep nulls problem in the white noise gain (WNG) and the directivity factor (DF) at high frequencies as the number of microphones increases. In this paper, we reveal that the severe beampattern distortion is attributed to the deviation term of the synthesized beampattern while the deep nulls problem in the WNG and the DF is attributed to the violation of the distortionless constraint in the desired direction. We then propose two new design methods to avoid the degraded performance of LDMAs. Compared to the Maclaurin series expansion-based method, the first method additionally imposes the distortionless constraint in the desired direction, and the deep nulls problem in the WNG and the DF can be avoided. The second method explicitly requires the response of the higher order spatial directivity pattern in the deviation term to be zero, and thus the beampattern distortion can be avoided. By choosing the frequency-wise parameter that determines the number of the considered higher order spatial directivity patterns, the second method enables a good trade-off between the WNG and the beampattern distortion. Simulations exemplify the superiority of the proposed method against existing methods in terms of the robustness and the beampattern distortion.","PeriodicalId":13332,"journal":{"name":"IEEE/ACM Transactions on Audio, Speech, and Language Processing","volume":"32 ","pages":"3811-3825"},"PeriodicalIF":5.1000,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE/ACM Transactions on Audio, Speech, and Language Processing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10629057/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ACOUSTICS","Score":null,"Total":0}

引用次数: 0

Abstract

Linear differential microphone arrays (LDMAs) are becoming popular due to their potentially high directional gain and frequency-invariant beampattern. By increasing the number of microphones, the Maclaurin expansion-based LDMAs address the inherently poor robustness problem of the conventional LDMA at low frequencies. However, this method encounters severe beampattern distortion and the deep nulls problem in the white noise gain (WNG) and the directivity factor (DF) at high frequencies as the number of microphones increases. In this paper, we reveal that the severe beampattern distortion is attributed to the deviation term of the synthesized beampattern while the deep nulls problem in the WNG and the DF is attributed to the violation of the distortionless constraint in the desired direction. We then propose two new design methods to avoid the degraded performance of LDMAs. Compared to the Maclaurin series expansion-based method, the first method additionally imposes the distortionless constraint in the desired direction, and the deep nulls problem in the WNG and the DF can be avoided. The second method explicitly requires the response of the higher order spatial directivity pattern in the deviation term to be zero, and thus the beampattern distortion can be avoided. By choosing the frequency-wise parameter that determines the number of the considered higher order spatial directivity patterns, the second method enables a good trade-off between the WNG and the beampattern distortion. Simulations exemplify the superiority of the proposed method against existing methods in terms of the robustness and the beampattern distortion.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

基于麦克劳林扩展的线性差分麦克风阵列的理论分析和改进方案

线性差分传声器阵列（LDMA）因其潜在的高指向性增益和频率不变的振型而越来越受欢迎。通过增加传声器数量，基于麦克劳林扩展的 LDMA 解决了传统 LDMA 在低频下固有的鲁棒性差的问题。然而，随着麦克风数量的增加，这种方法会遇到严重的贝型失真以及高频白噪声增益（WNG）和指向性因子（DF）的深空问题。本文揭示了严重的贝型失真归因于合成贝型的偏差项，而白噪增益（WNG）和指向性因子（DF）中的深空问题则归因于在所需方向上违反了无失真约束。我们随后提出了两种新的设计方法，以避免 LDMA 性能下降。与基于 Maclaurin 数列展开的方法相比，第一种方法在所需方向上额外施加了无失真约束，从而避免了 WNG 和 DF 中的深空问题。第二种方法明确要求偏差项中高阶空间指向性模式的响应为零，因此可以避免振型失真。通过选择决定所考虑的高阶空间指向性模式数量的频率参数，第二种方法可以在 WNG 和贝叶斯失真之间实现良好的权衡。仿真结果表明，与现有方法相比，所提出的方法在鲁棒性和振铃失真方面更具优势。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

IEEE/ACM Transactions on Audio, Speech, and Language Processing ACOUSTICS-ENGINEERING, ELECTRICAL & ELECTRONIC

CiteScore

11.30

自引率

11.10%

发文量

217

期刊介绍： The IEEE/ACM Transactions on Audio, Speech, and Language Processing covers audio, speech and language processing and the sciences that support them. In audio processing: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. In speech processing: areas such as speech analysis, synthesis, coding, speech and speaker recognition, speech production and perception, and speech enhancement. In language processing: speech and text analysis, understanding, generation, dialog management, translation, summarization, question answering and document indexing and retrieval, as well as general language modeling.

期刊最新文献

List of Reviewers IPDnet: A Universal Direct-Path IPD Estimation Network for Sound Source Localization MO-Transformer: Extract High-Level Relationship Between Words for Neural Machine Translation Online Neural Speaker Diarization With Target Speaker Tracking Blind Audio Bandwidth Extension: A Diffusion-Based Zero-Shot Approach