Room Acoustic Rendering Networks With Control of Scattering and Early Reflections

IEEE/ACM Transactions on Audio, Speech, and Language Processing (IF 5.1; CAS Zone 2, Computer Science; JCR Q1, Acoustics). Pub Date: 2024-08-02. DOI: 10.1109/TASLP.2024.3436702

Matteo Scerbo; Lauri Savioja; Enzo De Sena

Published in vol. 32, pp. 3745-3758. IEEE Xplore: https://ieeexplore.ieee.org/document/10620637/


Abstract

Room acoustic synthesis can be used in virtual reality (VR), augmented reality (AR) and gaming applications to enhance listeners' sense of immersion, realism and externalisation. A common approach is to use geometrical acoustics (GA) models to compute impulse responses at interactive speed, and fast convolution methods to apply those responses in real time. Alternatively, delay-network-based models can capture certain aspects of room acoustics at a significantly lower computational cost. To bridge the gap between these classes of models, recent work introduced delay network designs that approximate Acoustic Radiance Transfer (ART), a GA model that simulates the transfer of acoustic energy between discrete surface patches in an environment. This paper presents two key extensions of such designs. The first extension is a new physically based and stability-preserving design of the feedback matrices, enabling more accurate control of scattering and, more generally, of late reverberation properties. The second extension allows an arbitrary number of early reflections to be modeled with high accuracy, so the network can be scaled at will to trade computational cost against early-reverberation precision. The proposed extensions are compared to the baseline ART-approximating delay network as well as two reference GA models. The evaluation is based on objective measures of perceptually relevant features, including frequency-dependent reverberation times, echo density build-up, and early decay time. Results show that the proposed extensions yield a significant improvement over the baseline model, especially for non-convex geometries or unevenly distributed wall absorption, both scenarios of broad practical interest.
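To make the delay-network idea concrete, the following is a minimal Python sketch of a generic feedback delay network, together with a Schroeder-integration estimate of its reverberation time. It is not the ART-approximating design described in the abstract: the Householder feedback matrix, the delay lengths, the sample rate, and all function names are illustrative assumptions. It only shows why an orthogonal feedback matrix combined with per-line gains below one keeps the loop stable, and how a reverberation-time measure can be read from an impulse response.

```python
import numpy as np


def householder(n):
    """Orthogonal (lossless) n x n matrix: I - (2/n) * ones. Assumed choice."""
    return np.eye(n) - (2.0 / n) * np.ones((n, n))


def fdn_impulse_response(delays, t60, fs, n_samples):
    """Impulse response of a basic feedback delay network (illustrative only)."""
    delays = np.asarray(delays, dtype=int)
    n = len(delays)
    A = householder(n)
    # Per-line gain chosen so every line decays by 60 dB in t60 seconds.
    g = 10.0 ** (-3.0 * delays / (fs * t60))
    buffers = [np.zeros(d) for d in delays]
    idx = np.zeros(n, dtype=int)
    out = np.zeros(n_samples)
    for t in range(n_samples):
        x = 1.0 if t == 0 else 0.0
        # Read the oldest sample of every delay line.
        s = np.array([buffers[i][idx[i]] for i in range(n)])
        out[t] = s.sum()
        # Attenuate, mix through the feedback matrix, add the input.
        v = A @ (g * s) + x
        for i in range(n):
            buffers[i][idx[i]] = v[i]
            idx[i] = (idx[i] + 1) % delays[i]
    return out


def schroeder_t60(h, fs):
    """Estimate T60 from the -5 to -25 dB span of the Schroeder decay curve."""
    edc = np.cumsum(h[::-1] ** 2)[::-1]  # backward-integrated energy
    edc_db = 10.0 * np.log10(edc / edc[0] + 1e-300)
    t = np.arange(len(h)) / fs
    mask = (edc_db <= -5.0) & (edc_db >= -25.0)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # dB per second
    return -60.0 / slope


fs = 8000  # assumed sample rate in Hz
delays = [101, 127, 149, 173, 197, 211, 233, 257]  # assumed delays in samples
h = fdn_impulse_response(delays, t60=0.5, fs=fs, n_samples=fs)
estimated_t60 = schroeder_t60(h, fs)
print(f"target T60: 0.5 s, estimated T60: {estimated_t60:.2f} s")
```

Because the feedback matrix is orthogonal and every gain is below one, the loop matrix has spectral norm below one, so the response always decays. The designs in the paper replace this generic matrix with one derived from the room's geometry and surface properties.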



Source journal

IEEE/ACM Transactions on Audio, Speech, and Language Processing (Acoustics; Engineering, Electrical & Electronic)

CiteScore: 11.30
Self-citation rate: 11.10%
Articles published: 217

Journal description: The IEEE/ACM Transactions on Audio, Speech, and Language Processing covers audio, speech and language processing and the sciences that support them. In audio processing: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. In speech processing: areas such as speech analysis, synthesis, coding, speech and speaker recognition, speech production and perception, and speech enhancement. In language processing: speech and text analysis, understanding, generation, dialog management, translation, summarization, question answering and document indexing and retrieval, as well as general language modeling.
