
Journal of the Audio Engineering Society: Latest Articles

Effects of Torso Location and Rotation to HRTF
IF 1.1, Zone 4 (Engineering Technology), Q3 ACOUSTICS, Pub Date: 2024-07-18, DOI: 10.17743/jaes.2022.0152
Jaan Johansson, A. Mäkivirta, Matti Malinen
The significance of representing realistic torso orientation relative to the head in the head-related transfer function (HRTF) is studied in this work. Actual head position relative to the torso is found for 195 persons. The effect of the head position on the HRTF is studied by modifying the 3D model of a KEMAR head-and-torso simulator geometry by translating the head relative to the torso in the up-down and forward-backward directions and rotating the torso. The spectral difference is compared to that seen in the closest matching actual persons. The forward-backward location of the head has the strongest influence on the HRTF. The spectral difference between the fixed and rotated torso spectra can exceed a 1-dB limit for all sound arrival azimuth directions when the torso rotation exceeds 10°. The spectral difference decreases with increasing source elevation. A subjective listening test with personal HRTFs demonstrates that the spectral effects of torso rotation are audible as changes in sound color and location. The HRTF data in this work are found by calculating the sound field using the boundary element method and the 3D shape of the person acquired using photogrammetry.
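The 1-dB criterion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the spectral difference is computed, so the per-bin magnitude ratio used here is an assumption, and the function names `spectral_difference_db` and `exceeds_limit` are hypothetical.

```python
import numpy as np

def spectral_difference_db(h_ref, h_mod, n_fft=512):
    """Per-bin magnitude difference in dB between two impulse responses.

    Assumption: a plain magnitude ratio per FFT bin; the paper may use
    smoothing or band averaging that the abstract does not describe.
    """
    eps = 1e-12  # guards against log10(0) in empty bins
    mag_ref = np.abs(np.fft.rfft(h_ref, n_fft))
    mag_mod = np.abs(np.fft.rfft(h_mod, n_fft))
    return 20.0 * np.log10((mag_mod + eps) / (mag_ref + eps))

def exceeds_limit(diff_db, limit_db=1.0):
    """True if any bin deviates from the reference by more than limit_db."""
    return bool(np.max(np.abs(diff_db)) > limit_db)
```

Here `h_ref` would be the head-related impulse response for the fixed torso and `h_mod` the one for the rotated torso, both for the same source direction.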
Citations: 0
Perceptual Comparison of Dynamic Binaural Reproduction Methods for Sparse Head-Mounted Microphone Arrays
IF 1.1, Zone 4 (Engineering Technology), Q3 ACOUSTICS, Pub Date: 2024-07-18, DOI: 10.17743/jaes.2022.0140
Benjamin Stahl, Stefan Riedel
This paper presents results of a listening experiment evaluating three-degrees-of-freedom binaural reproduction of head-mounted microphone array signals. The methods are applied to an array of five microphones whose signals were simulated for static and dynamic array orientations. Methods under test involve scene-agnostic binaural reproduction methods as well as methods that have knowledge of (a subset of) source directions. The results of an instrumental evaluation reveal errors in the reproduction of interaural level and time differences for all scene-agnostic methods, which are smallest for the end-to-end magnitude-least-squares method. Additionally, the inherent localization robustness of the array under test and different simulated microphone arrays is investigated and discussed, which is of interest for a parametric reproduction method included in the study. In the listening experiment, the end-to-end magnitude-least-squares reproduction method outperforms other scene-agnostic approaches. Above all, linearly constrained beamformers using known source directions in combination with the end-to-end magnitude-least-squares method outcompete the scene-agnostic methods in perceived quality, especially for a rotating microphone array under anechoic conditions.
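The interaural level and time differences named in the abstract can be estimated in many ways. The following is a minimal broadband sketch, assuming a cross-correlation peak for the time difference and an energy ratio for the level difference; it is not the instrumental evaluation used in the paper, and `interaural_cues` is a hypothetical name.

```python
import numpy as np

def interaural_cues(left, right, fs):
    """Broadband ITD (seconds) and ILD (dB) from a pair of ear signals.

    Sign convention (an assumption): a positive ITD means the left-ear
    signal lags the right-ear signal.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Full cross-correlation; output index i corresponds to lag i - (len(right) - 1)
    xcorr = np.correlate(left, right, mode="full")
    lag = int(np.argmax(xcorr)) - (len(right) - 1)
    itd_s = lag / fs
    eps = 1e-12  # guards against division by zero for silent signals
    ild_db = 10.0 * np.log10((np.sum(left**2) + eps) / (np.sum(right**2) + eps))
    return itd_s, ild_db
```

Comparing these values between a reference binaural signal and a reproduced one gives a simple error measure per source direction.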
Citations: 0
Singer and Audience Evaluations of a Networked Immersive Audio Concert
IF 1.1, Zone 4 (Engineering Technology), Q3 ACOUSTICS, Pub Date: 2024-07-18, DOI: 10.17743/jaes.2022.0145
Patrick Cairns, Tomasz Rudzki, J. Cooper, Anthony Hunt, Kim Steele, Gerardo Acosta Martínez, Andrew Chadwick, Helena Daffern, Gavin Kearney
At the 2023 AES International Conference on Spatial and Immersive Audio, a networked immersive audio concert was performed. A vocal octet connected over the Internet between York and Huddersfield and provided a performance that was auralized in the acoustics of BBC Maida Vale Studio 2. A live audience in Huddersfield experienced the concert with local singers on stage, remote singers auralized alongside, and virtual acoustics rendered on a multichannel array. Another audience in York listened to the concert on headphones. An evaluation of the networked concert experience of the performers and audience is presented in this paper. Results demonstrate that a generally high-quality experience was delivered. Audience response to immersive audio rating items demonstrates a variance in experience. Several aspects of the evaluation context are identified as relevant to this rating variance and discussed as open challenges for audio engineers.
Citations: 0
Practical Implementation of Automated Next Generation Audio Production for Live Sports
IF 1.1, Zone 4 (Engineering Technology), Q3 ACOUSTICS, Pub Date: 2024-07-18, DOI: 10.17743/jaes.2022.0151
Aimée Moulson, Max Walley, Yannik Grewe, Rob Oldfield, Ben Shirley, Ulli Scuda
Producing a high-quality audio mix for a live sports production is a demanding task for mixing engineers. The management of many microphone signals and monitoring of various broadcast feeds mean engineers are often stretched, overseeing many tasks simultaneously. With the advancements in Next Generation Audio codecs providing many appealing features, such as interactivity and personalization to end users, consideration is needed so as not to create further work for production staff. Therefore, the authors propose a novel approach to live sports production by combining an object-based audio workflow with the efficiency benefits of automated mixing. This paper describes how a fully object-based workflow can be built from point of capture to audience playback with minimal changes for the production staff. This was achieved by integrating Next Generation Audio authoring from the point of production, streamlining the workflow, and thus removing the need for an additional authoring process later in the chain. As an exemplar, the authors applied this approach to a Premier League football match in a proof-of-concept trial.
Citations: 0
Spatial Sampling of Binaural Room Transfer Functions for Head-Tracked Personal Sound Zones
IF 1.1, Zone 4 (Engineering Technology), Q3 ACOUSTICS, Pub Date: 2024-07-18, DOI: 10.17743/jaes.2022.0144
Yue Qiao, Jessica Luo, Edgar Choueiri
The spatial sampling of binaural room transfer functions that vary with listener movements, as required for rendering personal sound zone (PSZ) with head tracking, was experimentally investigated regarding its dependencies on various factors. Through measurements of the binaural room transfer functions in a practical PSZ system with either translational or rotational movements of one of the two mannequin listeners, the PSZ filters were generated along the measurement grid and then spatially downsampled to different resolutions, at which the isolation performance of the system was numerically simulated. It was found that the spatial sampling resolution generally depends on factors such as the moving listener’s position, frequency band of the rendered audio, and perturbation caused by the other listener. More specifically, the required sampling resolution is inversely proportional to the distance either between two listeners or between the moving listener and the loudspeakers and is proportional to the frequency of the rendered audio. The perturbation caused by the other listener may impair both the isolation performance and filter robustness against movements. Furthermore, two crossover frequencies were found to exist in the system, which divide the frequency band into three sub-bands, each with a distinctive requirement for spatial sampling.
Citations: 0
Practitioners' Perspectives on Spatial Audio: Insights into Dolby Atmos and Binaural Mixes in Popular Music
IF 1.1, Zone 4 (Engineering Technology), Q3 ACOUSTICS, Pub Date: 2024-07-18, DOI: 10.17743/jaes.2022.0153
Christopher Dewey, Austin Moore, Hyunkook Lee
This paper presents the practitioners’ perspective on mixing popular music in spatial audio, particularly Dolby Atmos and the binaural mixes generated by the Dolby and Apple renderers. It presents the results of a dual-stage study, which utilized focus groups with eight professional music producers and a questionnaire completed by 140 practitioners. Analysis revealed the continued influence of stereo approaches on mix engineers, partly due to its historical dominance as a production platform and consumers’ continued use of headphones. It was also found that core elements of popular music productions, such as snare drums, tom-tom drums, kick drums, bass guitars, main guitars, and vocals, were less likely to have binaural processing applied compared with other sources. It was also shown there were perceived differences in the suitability of spatial audio mixing for specific genres, with electronic dance music, jazz, pop, classical, and world music rated as the most suitable. Regarding the binaural renderers, there was less user satisfaction with the Apple device compared with Dolby’s, and this dissatisfaction manifested mainly in the need for more user control. Finally, mix engineers were very aware of the importance of their mixes translating to smaller speaker systems and headphone playback, in particular.
Citations: 0
Comparison of Non-Parametric Interpolation Techniques for Sparsely Measured Binaural Room Impulse Responses
IF 1.1, Zone 4 (Engineering Technology), Q3 ACOUSTICS, Pub Date: 2024-07-18, DOI: 10.17743/jaes.2022.0150
David Bau, Hendrik Himmelein, Christoph Pörschmann
This study investigates different interpolation techniques for spatially upsampling Binaural Room Impulse Responses (BRIRs) measured on a sparse grid of view orientations. In this context, the authors recently presented the Spherical Array Interpolation by Time Alignment (SARITA) method for interpolating spherical microphone array signals with a limited number of microphones, which is adapted for the spatial upsampling of sparse BRIR datasets in the present work. SARITA is compared with two existing nonparametric BRIR-interpolation methods and naive linear interpolation. The study provides a technical and perceptual analysis of the interpolation performance. The results show the suitability of all interpolation methods apart from linear interpolation to achieving a realistic auralization, even for very sparse BRIR sets. For angular resolutions of 30° and real-world stimuli, most participants could not distinguish SARITA from an artifact-free reference.
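The naive linear interpolation baseline mentioned in the abstract can be sketched as a sample-wise weighted sum of two measured BRIRs. This is only the baseline, not SARITA or any of the other compared methods, and the function name `linear_interp_brir` and its parameters are hypothetical.

```python
import numpy as np

def linear_interp_brir(brir_a, brir_b, angle_a, angle_b, target):
    """Sample-wise linear interpolation between two BRIRs.

    brir_a and brir_b are impulse responses measured at orientations
    angle_a and angle_b (degrees); target lies between the two.
    """
    w = (target - angle_a) / (angle_b - angle_a)  # 0 at angle_a, 1 at angle_b
    return (1.0 - w) * np.asarray(brir_a, dtype=float) + w * np.asarray(brir_b, dtype=float)

# Two impulses with different arrival times, standing in for direct sound
a = np.zeros(8)
a[2] = 1.0
b = np.zeros(8)
b[4] = 1.0
midpoint = linear_interp_brir(a, b, 0.0, 30.0, 15.0)
# midpoint holds two half-amplitude impulses at samples 2 and 4,
# not a single impulse at sample 3
```

The example shows why the sample-wise sum struggles on sparse grids: arrival-time differences between neighboring measurements turn into a doubled impulse, which acts as a comb filter. Aligning the responses in time before interpolating, as the name SARITA suggests, is one way to avoid this.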
Citations: 0
Letting Pulsars Sing: Sonification With Granular Synthesis
IF 1.4, Zone 4 (Engineering Technology), Q1 Arts and Humanities, Pub Date: 2024-05-03, DOI: 10.17743/jaes.2022.0147
Mara Helmuth
An astronomy sonification project has been initiated to create sound and music from the data of pulsars in space. Pulsars are formed when some stars burn out all of their fuel and emit electromagnetic radiation, which reaches Earth periodically as the pulsar rotates. Each pulsar has unique characteristics. The source of the data is the online Pulsar Catalogue from the Australia Telescope National Facility. The first result is a stereo fixed media composition, From Orion to Cassiopeia, which reveals a sweep of much of the Milky Way, displaying audio for many of the known pulsars. Galactic longitude, rotation speed, pulse width, mean flux density, age, and distance are mapped to granular synthesis parameters. Sound event duration, amplitude, amount of reverberation, grain rate, grain duration, grain frequency, and panning are controlled by the data. The piece was created with the new SGRAN2() instrument in the RTcmix music programming language.
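A data-to-parameter mapping of the kind described in the abstract can be sketched with a clamped linear rescaling. The abstract does not state which catalog field drives which synthesis parameter, so the pairings below (galactic longitude to panning, pulse width to grain duration) and all value ranges are purely hypothetical, and the sketch does not call RTcmix or SGRAN2().

```python
def map_range(value, in_lo, in_hi, out_lo, out_hi):
    """Linearly rescale value from [in_lo, in_hi] to [out_lo, out_hi], clamped."""
    t = (value - in_lo) / (in_hi - in_lo)
    t = min(1.0, max(0.0, t))  # clamp so outliers stay inside the output range
    return out_lo + t * (out_hi - out_lo)

def pulsar_to_grain_params(galactic_longitude_deg, pulse_width_ms):
    """Hypothetical mapping from two catalog fields to two grain parameters."""
    return {
        # 0..360 degrees of longitude spread across the stereo field (0 = left, 1 = right)
        "pan": map_range(galactic_longitude_deg, 0.0, 360.0, 0.0, 1.0),
        # assumed pulse-width range of 1..500 ms mapped to 10..200 ms grains
        "grain_duration_s": map_range(pulse_width_ms, 1.0, 500.0, 0.01, 0.2),
    }
```

A logarithmic rescaling may suit fields that span several orders of magnitude, such as age or distance; the linear form is used here only to keep the sketch short.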
Citations: 0
Speech, Nonspeech Audio, and Visual Interruptions of a Tracking Task: A Replication and Extension of Nees and Sampsell (2021)
IF 1.4, Zone 4 (Engineering Technology), Q1 Arts and Humanities, Pub Date: 2024-05-03, DOI: 10.17743/jaes.2022.0142
Michael A. Nees, Claire Liu, Krista Bogan
Interruptions from technology, such as alerts from mobile communication devices, are a pervasive aspect of modern life. Interruptions can be detrimental to performance of the ongoing, interrupted task. Designers often can choose whether interruptions are delivered as visual or auditory alerts. Contradictory theories have emerged regarding whether auditory or visual alerts are more harmful to performance of ongoing visual tasks. Multiple Resources Theory predicts better overall performance with auditory alerts, but Auditory Preemption Theory predicts better overall performance with visual alerts. Nees and Sampsell previously found that multitasking was superior with nonspeech auditory alerts as compared to visual alerts. In the current experiment, their methods were replicated and extended to include a speech auditory alerts condition. Performance of the ongoing tracking task was worse with interruption from visual alerts, and perceived workload also was highest in this condition. Reaction time to alerts was fastest with visual alerts. There also was converging evidence to suggest that performance with speech alerts was superior to performance with nonspeech tonal alerts. The current experiment replicated the results of Nees and Sampsell and extended their findings to speech alert sounds. As in their study, the pattern of findings here supports Multiple Resources Theory over Auditory Preemption Theory.
Citations: 0
A Natural Sonification Mapping for Handwriting
IF 1.4 | Zone 4, Engineering and Technology | Q1 Arts and Humanities | Pub Date: 2024-05-03 | DOI: 10.17743/jaes.2022.0148
Katharina Groß-Vogt, Noah Rachdi, Matthias Frank
The sonification of handwriting has been shown to be effective in various learning tasks. In this paper, the authors investigate the sound design used for handwriting interaction based on a simple and cost-efficient prototype. The authentic interaction sound is compared with physically informed sonification designs that employ either natural or inverted mapping. In an experiment, participants copied text and drawings. The authors found simple measures of the structure-borne audio signal that showed how participants were affected in their movements, but only when drawing. In contrast, participants rated the sound features differently only for writing. The authentic interaction sound generally scored best, followed by a natural sonification mapping.
Citations: 0