Title: Optimal Modal Decomposition for Directionally Biased Sound Field Recording
Authors: Hao Gao; Junlong Ren; Jiazheng Cheng; Yong Shen
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 3424-3436
Publication date: 2024-06-28
Publication type: Journal Article
DOI: 10.1109/TASLP.2024.3420252
URL: https://ieeexplore.ieee.org/document/10577235/
Impact factor: 4.1 (JCR Q1, Acoustics)
Open access: No
Citations: 0
Abstract
Sound field recording aims to capture and preserve the sound field information in a specific region. Typically, the recorded sound field is decomposed into a superposition of modes. Spherical harmonics are the usual basis functions for this modal decomposition, and they are optimal for directionally unbiased recording. In many practical scenarios, however, the recording problem is directionally biased, and most conventional directionally biased modal decomposition methods are either non-optimal or limited in their applicability. This paper proposes an optimal modal decomposition for directionally biased sound field recording that minimizes the least-squares recording error. The optimization problem is formulated to account for both the sound wave distribution and the directional importance; it is then discretized and solved to obtain the optimal basis functions. The corresponding optimal encoding matrix for estimating the modal coefficients with a spherical microphone array is also derived. Simulations and experiments verify the proposed method, and the results indicate that it performs well for directionally biased sound field recording.
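The abstract does not give the paper's formulation, so the following is only a generic sketch of a discretized, directionally weighted least-squares mode. It is not the authors' method. It assumes that the sound field is a set of unit plane waves from sampled directions, that each direction carries a hypothetical importance weight, and that the best single mode is the dominant eigenvector of the weighted Gram matrix (found here by power iteration). All names, the sample points, the wavenumber, and the weighting rule are illustrative assumptions.

```python
import cmath
import math


def fibonacci_sphere(n):
    """Return n roughly uniformly spaced unit vectors on the sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for j in range(n):
        z = 1.0 - 2.0 * (j + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        dirs.append((r * math.cos(golden_angle * j), r * math.sin(golden_angle * j), z))
    return dirs


def weighted_gram(points, dirs, weights, k):
    """Build G = P W P^H, where P[i][j] is a unit plane wave from dirs[j] at points[i]."""
    P = [[cmath.exp(1j * k * sum(a * b for a, b in zip(x, d))) for d in dirs]
         for x in points]
    n = len(points)
    return [[sum(w * P[a][j] * P[b][j].conjugate() for j, w in enumerate(weights))
             for b in range(n)] for a in range(n)]


def captured_energy(G, u):
    """Return u^H G u: the weighted energy a unit-norm mode u captures."""
    n = len(u)
    return sum(u[a].conjugate() * G[a][b] * u[b]
               for a in range(n) for b in range(n)).real


def dominant_mode(G, iters=200):
    """Power iteration for the dominant eigenvector of a Hermitian PSD matrix."""
    n = len(G)
    u = [1.0 / math.sqrt(n)] * n  # start from the constant (omnidirectional) mode
    for _ in range(iters):
        v = [sum(G[a][b] * u[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(abs(c) ** 2 for c in v))
        u = [c / norm for c in v]
    return u


# Assumed setup: 1 kHz in air, seven sample points within 5 cm of the origin.
k = 2.0 * math.pi * 1000.0 / 343.0
r = 0.05
points = [(0, 0, 0), (r, 0, 0), (-r, 0, 0), (0, r, 0), (0, -r, 0), (0, 0, r), (0, 0, -r)]
dirs = fibonacci_sphere(200)
# Hypothetical directional importance: the upper hemisphere matters ten times more.
weights = [1.0 if d[2] > 0 else 0.1 for d in dirs]

G = weighted_gram(points, dirs, weights, k)
u = dominant_mode(G)
const = [1.0 / math.sqrt(len(points))] * len(points)

total = sum(G[a][a].real for a in range(len(points)))  # trace(G): total weighted energy
residual = total - captured_energy(G, u)               # weighted least-squares error
residual_const = total - captured_energy(G, const)
print("weighted error, dominant mode:", residual)
print("weighted error, constant mode:", residual_const)
```

For a rank-one approximation the weighted squared error equals trace(G) minus the captured energy, so the dominant eigenvector is the single mode with the smallest weighted error. Starting the iteration from the constant mode means the result is never worse than that baseline. Further modes would come from the subsequent eigenvectors.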
About the journal:
The IEEE/ACM Transactions on Audio, Speech, and Language Processing covers audio, speech and language processing and the sciences that support them. In audio processing: transducers, room acoustics, active sound control, human audition, analysis/synthesis/coding of music, and consumer audio. In speech processing: areas such as speech analysis, synthesis, coding, speech and speaker recognition, speech production and perception, and speech enhancement. In language processing: speech and text analysis, understanding, generation, dialog management, translation, summarization, question answering and document indexing and retrieval, as well as general language modeling.