Understanding the initial signature of noise-induced auditory damage remains a significant priority. Animal models suggest the cochlear base is particularly vulnerable to noise, raising the possibility that early-stage noise exposure could be linked to basal cochlear dysfunction, even when thresholds at 0.25-8 kHz are normal. To investigate this in humans, we conducted a meta-analysis following a systematic review, examining the association between noise exposure and hearing in frequencies from 9 to 20 kHz as a marker for basal cochlear dysfunction. The systematic review and meta-analysis followed PRISMA guidelines and the PICOS framework. Studies on noise exposure and hearing in the 9 to 20 kHz region in adults with clinically normal audiograms were identified through searches of five electronic databases (e.g., PubMed). Cohorts from 30 studies, comprising approximately 2,500 participants, were systematically reviewed. Meta-analysis was conducted on 23 studies using a random-effects model for occupational and recreational noise exposure. Analysis showed a significant positive association between occupational noise and hearing thresholds, with medium effect sizes at 9 and 11.2 kHz and large effect sizes at 10, 12, 14, and 16 kHz. However, the association with recreational noise was less consistent, with significant effects only at 12, 12.5, and 16 kHz. Egger's test indicated some publication bias, specifically at 10 kHz. Findings suggest thresholds above 8 kHz may indicate early noise exposure effects, even when lower-frequency (≤8 kHz) thresholds remain normal. Longitudinal studies incorporating noise dosimetry are crucial to establish causality and further support the clinical utility of extended high-frequency testing.
{"title":"Is Noise Exposure Associated With Impaired Extended High Frequency Hearing Despite a Normal Audiogram? A Systematic Review and Meta-Analysis.","authors":"Sajana Aryal, Monica Trevino, Hansapani Rodrigo, Srikanta Mishra","doi":"10.1177/23312165251343757","DOIUrl":"10.1177/23312165251343757","url":null,"abstract":"<p><p>Understanding the initial signature of noise-induced auditory damage remains a significant priority. Animal models suggest the cochlear base is particularly vulnerable to noise, raising the possibility that early-stage noise exposure could be linked to basal cochlear dysfunction, even when thresholds at 0.25-8 kHz are normal. To investigate this in humans, we conducted a meta-analysis following a systematic review, examining the association between noise exposure and hearing in frequencies from 9 to 20 kHz as a marker for basal cochlear dysfunction. Systematic review and meta-analysis followed PRISMA guidelines and the PICOS framework. Studies on noise exposure and hearing in the 9 to 20 kHz region in adults with clinically normal audiograms were included by searching five electronic databases (e.g., PubMed). Cohorts from 30 studies, comprising approximately 2,500 participants, were systematically reviewed. Meta-analysis was conducted on 23 studies using a random-effects model for occupational and recreational noise exposure. Analysis showed a significant positive association between occupational noise and hearing thresholds, with medium effect sizes at 9 and 11.2 kHz and large effect sizes at 10, 12, 14, and 16 kHz. However, the association with recreational noise was less consistent, with significant effects only at 12, 12.5, and 16 kHz. Egger's test indicated some publication bias, specifically at 10 kHz. Findings suggest thresholds above 8 kHz may indicate early noise exposure effects, even when lower-frequency (≤8 kHz) thresholds remain normal. Longitudinal studies incorporating noise dosimetry are crucial to establish causality and further support the clinical utility of extended high-frequency testing.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"29 ","pages":"23312165251343757"},"PeriodicalIF":3.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12084714/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144081423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01. Epub Date: 2025-08-14. DOI: 10.1177/23312165251367630
Federica Bianchi, Sindri Jonsson, Torben Christiansen, Elaine Hoi Ning Ng
Although multitasking is a common everyday activity, it is often challenging. The aim of this study was to evaluate the effect of noise attenuation during an audio-visual dual task and to investigate cognitive resource allocation over time via pupillometry. Twenty-six normal-hearing participants performed a dual task consisting of a primary speech recognition task and a secondary visual reaction-time task, as well as a visual-only task. Four conditions were tested in the dual task: two speech levels (60 and 64 dB SPL) and two noise conditions (No Attenuation, with noise at 70 dB SPL; Attenuation, with noise attenuated by passive damping). Elevated pupillary responses in the No Attenuation condition relative to the Attenuation and visual-only conditions indicated that participants allocated additional resources to the primary task during playback of the first part of the sentence, while reaction time on the secondary task increased significantly relative to the visual-only task. In the Attenuation condition, participants performed the secondary task with a reaction time similar to the visual-only task (no dual-task cost), while pupillary responses revealed allocation of resources to the primary task after completion of the secondary task. These findings reveal that the temporal dynamics of cognitive resource allocation between the primary and secondary tasks were affected by the level of background noise in the primary task. This study demonstrates that noise attenuation, as offered for example by audio devices, frees up cognitive resources in noisy listening environments and may help improve performance and decrease dual-task costs during multitasking.
{"title":"Pupillary Responses During a Dual Task: Effect of Noise Attenuation on the Timing of Cognitive Resource Allocation.","authors":"Federica Bianchi, Sindri Jonsson, Torben Christiansen, Elaine Hoi Ning Ng","doi":"10.1177/23312165251367630","DOIUrl":"10.1177/23312165251367630","url":null,"abstract":"<p><p>Although multitasking is a common everyday activity, it is often challenging. The aim of this study was to evaluate the effect of noise attenuation during an audio-visual dual task and investigate cognitive resource allocation over time via pupillometry. Twenty-six normal hearing participants performed a dual task consisting of a primary speech recognition task and a secondary visual reaction-time task, as well as a visual-only task. Four conditions were tested in the dual task: two speech levels (60- and 64-dB SPL) and two noise conditions (<i>No Attenuation</i> with noise at 70 dB SPL<i>; Attenuation</i> condition with noise attenuated by passive damping). Elevated pupillary responses for the N<i>o Attenuation</i> condition relative to the A<i>ttenuation</i> and visual-only conditions indicated that participants allocated additional resources on the primary task during the playback of the first part of the sentence, while reaction time to the secondary task increased significantly relative to the visual-only task. In the A<i>ttenuation</i> condition, participants performed the secondary task with a similar reaction time relative to the visual-only task (no dual-task cost), while pupillary responses revealed allocation of resources on the primary task after completion of the secondary task. These findings reveal that the temporal dynamics of cognitive resource allocation between primary and secondary task were affected by the level of background noise in the primary task. This study demonstrates that noise attenuation, as offered for example by audio devices, frees up cognitive resources in noisy listening environments and may be beneficial to improve performance and decrease dual-task costs during multitasking.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"29 ","pages":"23312165251367630"},"PeriodicalIF":3.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12357024/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144849442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01. Epub Date: 2025-06-25. DOI: 10.1177/23312165251345572
Susan E Voss, Aaron K Remenschneider, Rebecca M Farrar, Soomin Myoung, Nicholas J Horton
This study provides a comprehensive analysis of ear canal geometry from 0.7 to 91 years, based on high-resolution computed tomography scans of 221 ears. Quantified features include cross-sectional areas along the canal's length, total canal length, curvature, and key anatomical landmarks such as the first and second bends and the cartilage-to-bone transition. Significant developmental changes occur during the first 10 years of life, with adult-like characteristics emerging between ages 10 and 15 years, likely coinciding with puberty. Substantial interindividual variability is observed across all ages, particularly in the canal area. The canal becomes fully cartilaginous at and lateral to the second bend by 0.7 years, with further growth occurring only in the bony segment thereafter. These anatomical findings have important implications for audiologic threshold assessments, wideband acoustic immittance measures, age-appropriate hearing aid fitting schedules, and surgical planning, particularly in pediatric populations where anatomical variation is greatest.
{"title":"Comprehensive Measurements and Analyses of Ear Canal Geometry From Late Infancy Through Late Adulthood: Age-Related Variations and Implications for Basic Science and Audiological Measurements.","authors":"Susan E Voss, Aaron K Remenschneider, Rebecca M Farrar, Soomin Myoung, Nicholas J Horton","doi":"10.1177/23312165251345572","DOIUrl":"10.1177/23312165251345572","url":null,"abstract":"<p><p>This study provides a comprehensive analysis of ear canal geometry from 0.7 to 91 years, based on high-resolution computed tomography scans of 221 ears. Quantified features include cross-sectional areas along the canal's length, total canal length, curvature, and key anatomical landmarks such as the first and second bends and the cartilage-to-bone transition. Significant developmental changes occur during the first 10 years of life, with adult-like characteristics emerging between ages 10 and 15 years, likely coinciding with puberty. Substantial interindividual variability is observed across all ages, particularly in the canal area. The canal becomes fully cartilaginous at and lateral to the second bend by 0.7 years, with further growth occurring only in the bony segment thereafter. These anatomical findings have important implications for audiologic threshold assessments, wideband acoustic immitance measures, age-appropriate hearing aid fitting schedules, and surgical planning, particularly in pediatric populations where anatomical variation is greatest.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"29 ","pages":"23312165251345572"},"PeriodicalIF":2.6,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12198549/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144486732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01. Epub Date: 2025-05-30. DOI: 10.1177/23312165251347131
Björn Herrmann
Speech-comprehension difficulties are common among older people. Standard speech tests do not fully capture such difficulties because the tests poorly resemble the context-rich, story-like nature of ongoing conversation and are typically available only in a country's dominant/official language (e.g., English), leading to inaccurate scores for native speakers of other languages. Assessments of naturalistic, story-like speech in multiple languages require accurate, time-efficient scoring. The current research leverages modern large language models (LLMs), with native English speakers and native speakers of 10 other languages, to automate both the generation of high-quality spoken stories and the scoring of speech recall in different languages. Participants listened to and freely recalled short stories (in quiet/clear speech and in babble noise) in their native language. Scoring speech recall with LLM text-embeddings and with LLM prompt engineering, combined with semantic similarity analyses, revealed sensitivity to known effects of temporal order, primacy/recency, and background noise, and high similarity of recall scores across languages. The work overcomes limitations associated with simple speech materials and with testing restricted to closed groups of native speakers, because recall data of varying length and detail can be mapped across languages with high accuracy. The full automation of speech generation and recall scoring provides an important step toward comprehension assessments of naturalistic speech with clinical applicability.
{"title":"Language-agnostic, Automated Assessment of Listeners' Speech Recall Using Large Language Models.","authors":"Björn Herrmann","doi":"10.1177/23312165251347131","DOIUrl":"10.1177/23312165251347131","url":null,"abstract":"<p><p>Speech-comprehension difficulties are common among older people. Standard speech tests do not fully capture such difficulties because the tests poorly resemble the context-rich, story-like nature of ongoing conversation and are typically available only in a country's dominant/official language (e.g., English), leading to inaccurate scores for native speakers of other languages. Assessments for naturalistic, story speech in multiple languages require accurate, time-efficient scoring. The current research leverages modern large language models (LLMs) in native English speakers and native speakers of 10 other languages to automate the generation of high-quality, spoken stories and scoring of speech recall in different languages. Participants listened to and freely recalled short stories (in quiet/clear and in babble noise) in their native language. Large language model text-embeddings and LLM prompt engineering with semantic similarity analyses to score speech recall revealed sensitivity to known effects of temporal order, primacy/recency, and background noise, and high similarity of recall scores across languages. The work overcomes limitations associated with simple speech materials and testing of closed native-speaker groups because recall data of varying length and details can be mapped across languages with high accuracy. The full automation of speech generation and recall scoring provides an important step toward comprehension assessments of naturalistic speech with clinical applicability.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"29 ","pages":"23312165251347131"},"PeriodicalIF":2.6,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12125525/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144192395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01. Epub Date: 2025-11-13. DOI: 10.1177/23312165251389585
Nick Sommerhalder, Zbyněk Bureš, Oliver Profant, Tobias Kleinjung, Patrick Neff, Martin Meyer
Adults with chronic subjective tinnitus often struggle with speech recognition in challenging listening environments. While most research demonstrates deficits in speech recognition among individuals with tinnitus, studies focusing on older adults remain scarce. Besides speech recognition deficits, tinnitus has been linked to diminished cognitive performance, particularly in executive functions, yet its associations with specific cognitive domains in ageing populations are not fully understood. Our previous study of younger adults found that individuals with tinnitus exhibit deficits in speech recognition and interference control. Building on this, we hypothesized that these deficits are also present for older adults. We conducted a cross-sectional study of older adults (aged 60-79), 32 with tinnitus and 31 controls matched for age, gender, education, and approximately matched for hearing loss. Participants underwent audiometric, speech recognition, and cognitive tasks. The tinnitus participants performed more poorly in speech-in-noise and gated speech tasks, whereas no group differences were observed in the other suprathreshold auditory tasks. With regard to cognition, individuals with tinnitus showed reduced interference control, emotional interference, cognitive flexibility, and verbal working memory, correlating with tinnitus distress and loudness. It is concluded that tinnitus-related deficits persist and even worsen with age. Our results suggest that altered central mechanisms contribute to speech recognition difficulties in older adults with tinnitus.
{"title":"Association of Tinnitus With Speech Recognition and Executive Functions in Older Adults.","authors":"Nick Sommerhalder, Zbyněk Bureš, Oliver Profant, Tobias Kleinjung, Patrick Neff, Martin Meyer","doi":"10.1177/23312165251389585","DOIUrl":"10.1177/23312165251389585","url":null,"abstract":"<p><p>Adults with chronic subjective tinnitus often struggle with speech recognition in challenging listening environments. While most research demonstrates deficits in speech recognition among individuals with tinnitus, studies focusing on older adults remain scarce. Besides speech recognition deficits, tinnitus has been linked to diminished cognitive performance, particularly in executive functions, yet its associations with specific cognitive domains in ageing populations are not fully understood. Our previous study of younger adults found that individuals with tinnitus exhibit deficits in speech recognition and interference control. Building on this, we hypothesized that these deficits are also present for older adults. We conducted a cross-sectional study of older adults (aged 60-79), 32 with tinnitus and 31 controls matched for age, gender, education, and approximately matched for hearing loss. Participants underwent audiometric, speech recognition, and cognitive tasks. The tinnitus participants performed more poorly in speech-in-noise and gated speech tasks, whereas no group differences were observed in the other suprathreshold auditory tasks. With regard to cognition, individuals with tinnitus showed reduced interference control, emotional interference, cognitive flexibility, and verbal working memory, correlating with tinnitus distress and loudness. It is concluded that tinnitus-related deficits persist and even worsen with age. Our results suggest that altered central mechanisms contribute to speech recognition difficulties in older adults with tinnitus.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"29 ","pages":"23312165251389585"},"PeriodicalIF":3.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12615926/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145514780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01. Epub Date: 2025-11-24. DOI: 10.1177/23312165251397373
E Sebastian Lelo de Larrea-Mancera, Tess K Koerner, William J Bologna, Sara Momtaz, Katherine N Menon, Audrey Carrillo, Eric C Hoover, G Christopher Stecker, Frederick J Gallun, Aaron R Seitz
Previous research has demonstrated that remote testing of suprathreshold auditory function using distributed technologies can produce results that closely match those obtained in laboratory settings with specialized, calibrated equipment. This work has facilitated the validation of various behavioral measures in remote settings that provide valuable insights into auditory function. In the current study, we sought to address whether a broad battery of auditory assessments could explain variance in self-report of hearing handicap. To address this, we used a portable psychophysics assessment tool along with an online recruitment tool (Prolific) to collect auditory task data from participants with (n = 84) and without (n = 108) self-reported hearing difficulty. Results indicate that several measures of auditory processing differentiate participants with and without self-reported hearing difficulty. In addition, we report the factor structure of the test battery to clarify the underlying constructs and the extent to which they individually or jointly inform hearing function. Relationships between measures of auditory processing were found to be largely consistent with a hypothesized construct model that guided task selection. Overall, this study advances our understanding of the relationship between auditory and cognitive processing in those with and without subjective hearing difficulty. More broadly, these results indicate promise that these measures can be used in larger scale research studies in remote settings and have potential to contribute to telehealth approaches to better address people's hearing needs.
{"title":"At-Home Auditory Assessment Using Portable Automated Rapid Testing (PART) to Understand Self-Reported Hearing Difficulties.","authors":"E Sebastian Lelo de Larrea-Mancera, Tess K Koerner, William J Bologna, Sara Momtaz, Katherine N Menon, Audrey Carrillo, Eric C Hoover, G Christopher Stecker, Frederick J Gallun, Aaron R Seitz","doi":"10.1177/23312165251397373","DOIUrl":"10.1177/23312165251397373","url":null,"abstract":"<p><p>Previous research has demonstrated that remote testing of suprathreshold auditory function using distributed technologies can produce results that closely match those obtained in laboratory settings with specialized, calibrated equipment. This work has facilitated the validation of various behavioral measures in remote settings that provide valuable insights into auditory function. In the current study, we sought to address whether a broad battery of auditory assessments could explain variance in self-report of hearing handicap. To address this, we used a portable psychophysics assessment tool along with an online recruitment tool (Prolific) to collect auditory task data from participants with (<i>n</i> <i>=</i> 84) and without (<i>n</i> <i>=</i> 108) self-reported hearing difficulty. Results indicate several measures of auditory processing differentiate participants with and without self-reported hearing difficulty. In addition, we report the factor structure of the test battery to clarify the underlying constructs and the extent to which they individually or jointly inform hearing function. Relationships between measures of auditory processing were found to be largely consistent with a hypothesized construct model that guided task selection. Overall, this study advances our understanding of the relationship between auditory and cognitive processing in those with and without subjective hearing difficulty. More broadly, these results indicate promise that these measures can be used in larger scale research studies in remote settings and have potential to contribute to telehealth approaches to better address people's hearing needs.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"29 ","pages":"23312165251397373"},"PeriodicalIF":3.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12644446/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145597487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01. Epub Date: 2025-11-25. DOI: 10.1177/23312165251396644
Vahid Ashkanichenarlogh, Paula Folkeard, Susan Scollie, Volker Kühnel, Vijay Parsa
This study evaluated a deep-neural-network (DNN) denoising system using a model-based design, comparing it with adaptive filtering and beamforming across various noise types, SNRs, and hearing-aid fittings. A KEMAR manikin fitted with five audiograms was recorded in reverberant and non-reverberant rooms, yielding 1,152 recordings, from which speech intelligibility was estimated using the Hearing Aid Speech Perception Index (HASPI). Effects of processing strategy and acoustic factors were tested with a model-based within-device design that accounts for repeated recordings per device/program and fitting. Linear mixed model results showed that the DNN with beamforming outperformed conventional processing, with the strongest gains at 0 and +5 dB SNR, moderate benefits at -5 dB in low reverberation, and none in medium reverberation. Across SNRs and noise types, the DNN combined with beamforming yielded the highest predicted intelligibility, with benefits attenuated under moderate reverberation. Azimuth effects varied because estimates were derived from a better-ear metric on manikin recordings. Additionally, this paper reports comparisons using sound-quality metrics: the intrusive HASQI and the non-intrusive pMOS. Results indicated that model type interacted with processing and acoustic factors. HASQI and pMOS scores increased with SNR and were moderately correlated (r² ≈ 0.479), supporting the use of non-intrusive metrics for large-scale assessment. However, pMOS showed greater variability across hearing aid programs and environments, suggesting that non-intrusive models capture processing effects differently than intrusive metrics. These findings highlight the promise and limits of non-intrusive evaluation while emphasizing the benefit of combining deep learning with beamforming to improve intelligibility and quality.
{"title":"Objective Evaluation of a Deep Learning-Based Noise Reduction Algorithm for Hearing Aids Under Diverse Fitting and Listening Conditions.","authors":"Vahid Ashkanichenarlogh, Paula Folkeard, Susan Scollie, Volker Kühnel, Vijay Parsa","doi":"10.1177/23312165251396644","DOIUrl":"10.1177/23312165251396644","url":null,"abstract":"<p><p>This study evaluated a deep-neural-network denoising system using model-based design, comparing it with adaptive filtering and beamforming across various noise types, SNRs, and hearing-aid fittings. A KEMAR manikin fitted with five audiograms was recorded in reverberant and non-reverberant rooms, yielding 1,152 recordings. Speech intelligibility was estimated using the HASPI from 1,152 KEMAR manikin recordings. Effects of processing strategy and acoustic factors were tested with model-based within-device design that account for repeated recordings per device/program and fitting. Linear mixed model results showed that the DNN with beamforming outperformed conventional processing, with strongest gains at 0 and +5 dB SNR, moderate benefits at -5 dB in low reverberation, and none in medium reverberation. Across SNRs and noise types, the DNN combined with beamforming yielded the highest predicted intelligibility, with benefits attenuated under moderate reverberation. Azimuth effects varied; because estimates were derived from a better-ear metric on manikin recordings. Additionally, this paper reports comparisons using metrics of sound quality, for an intrusive metric (HASQI) and the pMOS non-intrusive metric. Results indicated that model type interacted with processing and acoustic factors. HASQI and pMOS scores increased with SNR and were moderately correlated (r² ≈ 0.479), supporting the use of non-intrusive metrics for large-scale assessment. However, pMOS showed greater variability across hearing aid programs and environments, suggesting non-intrusive models capture processing effects differently than intrusive metrics. These findings highlight the promise and limits of non-intrusive evaluation while emphasizing the benefit of combining deep learning with beamforming to improve intelligibility and quality.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"29 ","pages":"23312165251396644"},"PeriodicalIF":3.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12647563/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145606795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01. Epub Date: 2025-04-13. DOI: 10.1177/23312165251333528
John Kyle Cooper, Jonas Vanthornhout, Astrid van Wieringen, Tom Francart
Speech intelligibility in challenging listening environments relies on the integration of audiovisual cues. Measuring the effectiveness of audiovisual integration in these environments can be difficult due to their complexity. The Audiovisual True-to-Life Assessment of Auditory Rehabilitation (AVATAR) is a paradigm developed to provide an ecological environment that captures both the auditory and visual aspects of speech intelligibility measures. Previous research has shown that the benefit from audiovisual cues can be measured using behavioral (e.g., word recognition) and electrophysiological (e.g., neural tracking) measures. The current research examines whether, when using the AVATAR paradigm, electrophysiological measures of speech intelligibility yield outcomes similar to behavioral measures. We hypothesized that visual cues would enhance both the behavioral and electrophysiological scores as the signal-to-noise ratio (SNR) of the speech signal decreased. Twenty young (18-25 years old) participants (1 male and 19 female) with normal hearing took part in our study. For the behavioral experiment, we administered lists of sentences using an adaptive procedure to estimate a speech reception threshold (SRT). For the electrophysiological experiment, we administered 35 lists of sentences randomized across five SNR levels (silence, 0, -3, -6, and -9 dB) and two visual conditions (audio-only and audiovisual). We used a neural tracking decoder to measure reconstruction accuracies for each participant. Most participants had higher reconstruction accuracies in the audiovisual condition than in the audio-only condition at moderate to high levels of noise. We found that the electrophysiological measure may correlate with the behavioral measure of audiovisual benefit.
{"title":"Objectively Measuring Audiovisual Effects in Noise Using Virtual Human Speakers.","authors":"John Kyle Cooper, Jonas Vanthornhout, Astrid van Wieringen, Tom Francart","doi":"10.1177/23312165251333528","DOIUrl":"https://doi.org/10.1177/23312165251333528","url":null,"abstract":"<p><p>Speech intelligibility in challenging listening environments relies on the integration of audiovisual cues. Measuring the effectiveness of audiovisual integration in these challenging listening environments can be difficult due to the complexity of such environments. The Audiovisual True-to-Life Assessment of Auditory Rehabilitation (AVATAR) is a paradigm that was developed to provide an ecological environment to capture both the audio and visual aspects of speech intelligibility measures. Previous research has shown the benefit from audiovisual cues can be measured using behavioral (e.g., word recognition) and electrophysiological (e.g., neural tracking) measures. The current research examines, when using the AVATAR paradigm, if electrophysiological measures of speech intelligibility yield similar outcomes as behavioral measures. We hypothesized visual cues would enhance both the behavioral and electrophysiological scores as the signal-to-noise ratio (SNR) of the speech signal decreased. Twenty young (18-25 years old) participants (1 male and 19 female) with normal hearing participated in our study. For our behavioral experiment, we administered lists of sentences using an adaptive procedure to estimate a speech reception threshold (SRT). For our electrophysiological experiment, we administered 35 lists of sentences randomized across five SNR levels (silence, 0, -3, -6, and -9 dB) and two visual conditions (audio-only and audiovisual). We used a neural tracking decoder to measure the reconstruction accuracies for each participant. We observed most participants had higher reconstruction accuracies for the audiovisual condition compared to the audio-only condition in conditions with moderate to high levels of noise. We found the electrophysiological measure may correlate with the behavioral measure that shows audiovisual benefit.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"29 ","pages":"23312165251333528"},"PeriodicalIF":2.6,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12033406/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144043708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01. DOI: 10.1177/23312165241311721
Onn Wah Lee, Demi Gao, Tommy Peng, Julia Wunderlich, Darren Mao, Gautam Balasubramanian, Colette M McKay
This study used functional near-infrared spectroscopy (fNIRS) to measure aspects of the speech discrimination ability of sleeping infants. We examined the morphology of the fNIRS response to three different speech contrasts, namely "Tea/Ba," "Bee/Ba," and "Ga/Ba." Sixteen infants aged between 3 and 13 months were included in this study, and their fNIRS data were recorded during natural sleep. The stimuli were presented using a nonsilence baseline paradigm, in which repeated standard stimuli were presented between the novel stimulus blocks without any silence periods. The morphology of the fNIRS responses varied between speech contrasts. The data were fit with a model in which the responses were the sum of two independent and concurrent response mechanisms derived from previously published fNIRS detection responses. These independent components were an oxyhemoglobin (HbO)-positive early-latency response and an HbO-negative late-latency response, hypothesized to be related to an auditory canonical response and a brain arousal response, respectively. The model fit the data well, with a median goodness of fit of 81%. The data showed that both response components had later latency when the left ear was the test ear compared to the right ear (p < .05) and that the negative component, due to brain arousal, was smallest for the most subtle contrast, "Ga/Ba" (p = .003).
{"title":"Measuring Speech Discrimination Ability in Sleeping Infants Using fNIRS-A Proof of Principle.","authors":"Onn Wah Lee, Demi Gao, Tommy Peng, Julia Wunderlich, Darren Mao, Gautam Balasubramanian, Colette M McKay","doi":"10.1177/23312165241311721","DOIUrl":"10.1177/23312165241311721","url":null,"abstract":"<p><p>This study used functional near-infrared spectroscopy (fNIRS) to measure aspects of the speech discrimination ability of sleeping infants. We examined the morphology of the fNIRS response to three different speech contrasts, namely \"Tea/Ba,\" \"Bee/Ba,\" and \"Ga/Ba.\" Sixteen infants aged between 3 and 13 months old were included in this study and their fNIRS data were recorded during natural sleep. The stimuli were presented using a nonsilence baseline paradigm, where repeated standard stimuli were presented between the novel stimuli blocks without any silence periods. The morphology of fNIRS responses varied between speech contrasts. The data were fit with a model in which the responses were the sum of two independent and concurrent response mechanisms that were derived from previously published fNIRS detection responses. These independent components were an oxyhemoglobin (HbO)-positive early-latency response and an HbO-negative late latency response, hypothesized to be related to an auditory canonical response and a brain arousal response, respectively. The goodness of fit of the model with the data was high with median goodness of fit of 81%. The data showed that both response components had later latency when the left ear was the test ear (<i>p</i> < .05) compared to the right ear and that the negative component, due to brain arousal, was smallest for the most subtle contrast, \"Ga/Ba\" (<i>p</i> = .003).</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"29 ","pages":"23312165241311721"},"PeriodicalIF":2.6,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11758514/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143030151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01. Epub Date: 2025-08-11. DOI: 10.1177/23312165251365802
Ragini Sinha, Ann-Christin Scherer, Simon Doclo, Christian Rollwage, Jan Rennies
Speaker-conditioned target speaker extraction algorithms aim to extract the target speaker from a mixture of multiple speakers by using additional information about the target speaker. Previous studies have evaluated the performance of these algorithms using either instrumental measures or subjective assessments with normal-hearing or hearing-impaired listeners. Notably, a previous study employing a quasicausal algorithm reported significant intelligibility improvements for both normal-hearing and hearing-impaired listeners, while another study demonstrated that a fully causal algorithm could enhance speech intelligibility and reduce listening effort for normal-hearing listeners. Building on these findings, this study focuses on an in-depth subjective assessment of two fully causal deep neural network-based speaker-conditioned target speaker extraction algorithms with hearing-impaired listeners, both without hearing loss compensation (unaided) and with linear hearing loss compensation (aided). Three different subjective performance measurement methods were used to cover a broad range of listening conditions, namely paired comparison, speech recognition thresholds, and categorically scaled perceived listening effort. The subjective evaluation results from 15 hearing-impaired listeners showed that one algorithm significantly reduced listening effort and improved intelligibility compared to unprocessed stimuli and to the other algorithm. The data also suggest that hearing-impaired listeners experience a greater benefit than normal-hearing listeners in terms of listening effort (for both male and female interfering speakers) and speech recognition thresholds, especially in the presence of female interfering speakers, and that hearing loss compensation (linear amplification) is not required to obtain an algorithm benefit.
{"title":"Evaluation of Speaker-Conditioned Target Speaker Extraction Algorithms for Hearing-Impaired Listeners.","authors":"Ragini Sinha, Ann-Christin Scherer, Simon Doclo, Christian Rollwage, Jan Rennies","doi":"10.1177/23312165251365802","DOIUrl":"10.1177/23312165251365802","url":null,"abstract":"<p><p>Speaker-conditioned target speaker extraction algorithms aim at extracting the target speaker from a mixture of multiple speakers by using additional information about the target speaker. Previous studies have evaluated the performance of these algorithms using either instrumental measures or subjective assessments with normal-hearing listeners or with hearing-impaired listeners. Notably, a previous study employing a quasicausal algorithm reported significant intelligibility improvements for both normal-hearing and hearing-impaired listeners, while another study demonstrated that a fully causal algorithm could enhance speech intelligibility and reduce listening effort for normal-hearing listeners. Building on these findings, this study focuses on an in-depth subjective assessment of two fully causal deep neural network-based speaker-conditioned target speaker extraction algorithms with hearing-impaired listeners, both without hearing loss compensation (unaided) and with linear hearing loss compensation (aided). Three different subjective performance measurement methods were used to cover a broad range of listening conditions, namely paired comparison, speech recognition thresholds, and categorically scaled perceived listening effort. The subjective evaluation results with 15 hearing-impaired listeners showed that one algorithm significantly reduced listening effort and improved intelligibility compared to unprocessed stimuli and the other algorithm. The data also suggest that hearing-impaired listeners experience a greater benefit in terms of listening effort (for both male and female interfering speakers) and speech recognition thresholds, especially in the presence of female interfering speakers than normal-hearing listeners, and that hearing loss compensation (linear amplification) is not required to obtain an algorithm benefit.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"29 ","pages":"23312165251365802"},"PeriodicalIF":3.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12340209/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144817996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}