"Object-Based Six-Degrees-of-Freedom Rendering of Sound Scenes Captured with Multiple Ambisonic Receivers"
L. McCormack, A. Politis, Thomas McKenzie, C. Hold, V. Pulkki
Journal of the Audio Engineering Society, published May 11, 2022. doi:10.17743/jaes.2022.0010
Abstract: This article proposes a system for object-based six-degrees-of-freedom (6DoF) rendering of spatial sound scenes that are captured using a distributed arrangement of multiple Ambisonic receivers. The approach is based on first identifying and tracking the positions of sound sources within the scene, followed by the isolation of their signals through the use of beamformers. These sound objects are subsequently spatialized over the target playback setup, with respect to both the head orientation and position of the listener. The diffuse ambience of the scene is rendered separately by first spatially subtracting the source signals from the receivers located nearest to the listener position. The resultant residual Ambisonic signals are then spatialized, decorrelated, and summed together with suitable interpolation weights. The proposed system is evaluated through an in situ listening test conducted in 6DoF virtual reality, whereby real-world sound sources are compared with the auralization achieved through the proposed rendering method. The results of 15 participants suggest that, in comparison to a linear interpolation-based alternative, the proposed object-based approach is perceived as being more realistic.
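In the ambience-rendering step described above, the residual Ambisonic signals of the receivers nearest the listener are summed with interpolation weights. The abstract does not specify the weighting scheme, so the sketch below assumes a simple normalized inverse-distance weighting; the function name and parameters are illustrative, not taken from the paper.

```python
import numpy as np

def interpolation_weights(listener_pos, receiver_positions, eps=1e-6):
    """Normalized inverse-distance weights for mixing the residual
    ambience signals of nearby receivers (illustrative; the paper's
    exact weighting scheme is not reproduced here)."""
    d = np.linalg.norm(
        np.asarray(receiver_positions, dtype=float) - np.asarray(listener_pos, dtype=float),
        axis=1,
    )
    w = 1.0 / (d + eps)        # closer receivers contribute more
    return w / w.sum()          # weights sum to one

# A listener nearest the first of three receivers gets the largest weight.
w = interpolation_weights([0.5, 0.0, 0.0],
                          [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
```

Each receiver's decorrelated residual stream would then be scaled by its weight before summation, so the ambience cross-fades smoothly as the listener moves between receivers.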
"Managing the Live-Sound Audio Engineer's Most Essential Critical Listening Tool"
Stephen Compton
Journal of the Audio Engineering Society, published May 11, 2022. doi:10.17743/jaes.2021.0065
Abstract: Critical listening is the live-sound audio engineer's most essential tool for informed sonic assessment. In producing a cohesive mix that fulfills an event's aims, audio engineers affect the experience and well-being of all live-sound participants. This study compares the results from a 2020 international audio engineer survey with published research. The findings demonstrate that although engineers recognize, in theory, that their hearing is their most essential critical listening tool, in practice many have not found ways to manage their hearing and effectively optimize their assessment ability. Many engineers with impeded or impaired hearing continue to mix, believing that any negative impact on participants is minimal or nonexistent. The live-sound experience and participant health and well-being are improved by promoting and acting on appropriate hearing management practices.
"A Review of Literature in Critical Listening Education"
Stephane Elmosnino
Journal of the Audio Engineering Society, published May 11, 2022. doi:10.17743/jaes.2022.0004
Abstract: This paper reviews the literature on critical listening education. Broadly speaking, academic research in this field is often limited to qualitative descriptions of curriculum and studies on the effectiveness of technical ear training. Furthermore, audio engineering textbooks often treat critical listening as secondary to technical concepts. To provide a basis for the development of curriculum and training, this paper investigates both academic and non-academic work in the field. Consequently, a range of common curriculum topics is identified as focus areas in current practice. Moreover, this paper uncovers pedagogical best practices for training sequence and the use of sounds/sight within instruction. A range of specific instructional activities, such as technical ear training, is also explored, thus providing insights into training in this field. Beyond a direct benefit to pedagogues, it is hoped that this review of the literature can provide a starting point for research in critical listening education.
"Disembodied Timbres: A Study on Semantically Prompted FM Synthesis"
B. Hayes, C. Saitis, György Fazekas
Journal of the Audio Engineering Society, published May 11, 2022. doi:10.17743/jaes.2022.0006
Abstract: Disembodied electronic sounds constitute a large part of the modern auditory lexicon, but research into timbre perception has focused mostly on the tones of conventional acoustic musical instruments. It is unclear whether insights from these studies generalize to electronic sounds, nor is it obvious how they relate to the creation of such sounds. This work presents an experiment on the semantic associations of sounds produced by FM synthesis, with the aim of identifying whether existing models of timbre semantics are appropriate for such sounds. In a novel experimental paradigm, experienced sound designers responded to semantic prompts by programming a synthesizer and then provided semantic ratings of the sounds they created. Exploratory factor analysis revealed a five-dimensional semantic space. The first two factors mapped well to the concepts of luminance, texture, and mass. The remaining three factors did not have clear parallels, but correlation analysis with acoustic descriptors suggested an acoustical relationship to luminance and texture. The results suggest that further inquiry into the timbres of disembodied electronic sounds, their synthesis, and their semantic associations would be worthwhile and could benefit research into auditory perception and cognition, as well as synthesis control and audio engineering.
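The FM synthesis studied in this paper follows the classic two-operator formulation, in which a modulator sinusoid is added to the phase of a carrier sinusoid and the modulation index controls spectral richness. A minimal sketch follows; the function name and parameter values are illustrative, not drawn from the paper's stimuli.

```python
import numpy as np

def fm_tone(fc, fm, index, dur=1.0, sr=48000):
    """Two-operator FM: a carrier at fc whose phase is modulated by a
    sinusoid at fm, with modulation index `index` (in radians).
    Larger indices spread energy into more sidebands around fc."""
    t = np.arange(int(dur * sr)) / sr
    return np.sin(2 * np.pi * fc * t + index * np.sin(2 * np.pi * fm * t))

# An integer carrier-to-modulator ratio (here 2:1) yields harmonic partials.
tone = fm_tone(fc=440.0, fm=220.0, index=3.0)
```

Sidebands appear at fc ± k·fm, so sound designers working to a semantic prompt can steer brightness and roughness with just the index and the frequency ratio.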
"Design and Evaluation of Electric Vehicle Sound Using Granular Synthesis"
May Jorella S. Lazaro, Sungho Kim, Minsik Choi, Ki-Saeng Kim, Dongchul Park, Soyoun Moon, M. Yun
Journal of the Audio Engineering Society, published May 11, 2022. doi:10.17743/jaes.2021.0062
"Comparison of Different Methods to Measure Acoustical Impedance of Horns"
Alexander Voishvillo, Balázs Kákonyi, B. Mclaughlin
Journal of the Audio Engineering Society, published May 11, 2022. doi:10.17743/jaes.2022.0003
"Perceptual Impact on Localization Quality Evaluations of Common Pre-Processing for Non-Individual Head-Related Transfer Functions"
Areti Andreopoulou, B. Katz
Journal of the Audio Engineering Society, published May 11, 2022. doi:10.17743/jaes.2022.0008
Abstract: This article investigates the impact of two commonly used Head-Related Transfer Function (HRTF) processing/modeling methods on the perceived spatial accuracy of binaural data by monitoring changes in user ratings of non-individualized HRTFs. The evaluated techniques are minimum-phase approximation and Infinite-Impulse-Response (IIR) modeling. The study is based on the hypothesis that user assessments should remain roughly unchanged as long as the range of signal variations between processed and unprocessed (reference) HRTFs lies within ranges previously reported as perceptually insignificant. Objective assessments of the degree of spectral variation between reference and processed data, computed using the Spectral Distortion metric, showed no evident perceptually relevant variations in the minimum-phase data and spectral differences only marginally exceeding the established thresholds for the IIR data, implying perceptual equivalence of spatial impression in the tested corpus. Nevertheless, analysis of user responses in the perceptual study strongly indicated that variations introduced by the tested HRTF processing methods can lead to inversions in quality assessment, resulting in the perceptual rejection of HRTFs previously rated as the "most appropriate," or, alternatively, in the preference of datasets previously dismissed as "unfit." The effect appears more apparent for IIR processing and is equally evident across the evaluated horizontal and median planes.
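Minimum-phase approximation discards an HRTF's excess phase while preserving its magnitude response, and the Spectral Distortion metric is conventionally the RMS log-magnitude difference in dB across frequency. The sketch below assumes the standard real-cepstrum construction and an unweighted full-band SD; function names are illustrative, and the authors' exact frequency range and weighting are not reproduced.

```python
import numpy as np

def minimum_phase_version(h, n_fft=None):
    """Real-cepstrum construction of a minimum-phase filter sharing the
    magnitude response of h. Assumes an even n_fft (illustrative sketch)."""
    n = n_fft or len(h)
    H = np.fft.fft(h, n)
    # Real cepstrum of the log-magnitude spectrum (clipped to avoid log(0)).
    cep = np.fft.ifft(np.log(np.maximum(np.abs(H), 1e-12))).real
    # Fold the anticausal half onto the causal half.
    fold = np.zeros(n)
    fold[0] = cep[0]
    fold[1:n // 2] = 2.0 * cep[1:n // 2]
    fold[n // 2] = cep[n // 2]
    # Exponentiating the folded cepstrum yields the minimum-phase spectrum.
    return np.fft.ifft(np.exp(np.fft.fft(fold))).real

def spectral_distortion(H_ref, H_test):
    """RMS log-magnitude difference in dB between two complex spectra."""
    sd = 20.0 * np.log10(np.abs(H_test) / np.abs(H_ref))
    return float(np.sqrt(np.mean(sd ** 2)))
```

Because the construction preserves magnitude exactly (up to cepstral aliasing), the SD between an HRTF and its minimum-phase version is near zero; the perceptual differences the study reports therefore stem from the discarded phase, which the SD metric does not see.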
"On the Differences in Preferred Headphone Response for Spatial and Stereo Content"
Isaac Engel, D. Alon, Kevin Scheumann, Jeff Crukley, Ravish Mehra
Journal of the Audio Engineering Society, published May 11, 2022. doi:10.17743/jaes.2022.0005
Abstract: When reproducing spatial audio over headphones, ensuring that the headphones have a flat frequency response is important for producing an accurate rendering. However, previous studies suggest that, when reproducing nonspatial content such as stereo music, the headphone response should resemble that of a loudspeaker system in a listening room (e.g., the so-called Harman target). It is not yet clear whether a pair of headphones calibrated in such a way would also be preferred by listeners for spatial audio reproduction. This study investigates how listeners' preferences regarding headphone frequency response differ between stereo and spatial audio content, rendered using individual binaural room impulse responses. Three listening tests, evaluating seven different target headphone responses, two headphones, and two reproduction bandwidths, are presented, with over 20 listeners per test. Results suggest that a flat headphone response is preferred when listening to spatial audio content, whereas the Harman target was preferred for stereo content. This effect was stronger when user-specific equalization was used and was not significantly affected by the choice of headphone or reproduction bandwidth.
{"title":"Variable Resonator Cap for User-Definable Microphone High Frequency Response","authors":"Benjamin Grigg","doi":"10.17743/jaes.2021.0063","DOIUrl":"https://doi.org/10.17743/jaes.2021.0063","url":null,"abstract":"","PeriodicalId":50008,"journal":{"name":"Journal of the Audio Engineering Society","volume":" ","pages":""},"PeriodicalIF":1.4,"publicationDate":"2022-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48303202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}