Note on the Dual-Task Paradigm and its Use to Measure Listening Effort
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241292215
Stefanie E Kuchinsky, Frederick J Gallun, Adrian K C Lee
People regularly communicate in complex environments, requiring them to flexibly shift their attention across multiple sources of sensory information. Increasing recruitment of the executive functions that support successful speech comprehension in these multitasking settings is thought to contribute to the sense of effort that listeners often experience. One common research method employed to quantify listening effort is the dual-task paradigm, in which individuals recognize speech and concurrently perform a secondary (often visual) task. Effort is operationalized as performance decrements on the secondary task as speech processing demands increase. However, recent reviews have noted critical inconsistencies in the results of dual-task experiments, likely in part due to how and when the two tasks place demands on a common set of mental resources and how flexibly individuals can allocate their attention to them. We propose that, to address this gap, the field first needs to look backward: better integrating theoretical models of resource capacity and allocation, as well as of task switching, that were historically developed outside of hearing research (viz., cognitive psychology and neuroscience). With this context in mind, we describe how dual-task experiments could be designed and interpreted so that they provide better and more robust insights into the mechanisms that contribute to effortful listening.
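As a rough illustration of how effort is operationalized in this paradigm, the sketch below computes a proportional dual-task cost from hypothetical secondary-task reaction times collected alone and during speech recognition at two demand levels; the values and variable names are illustrative assumptions, not data from the study.

```python
import numpy as np

# Hypothetical secondary-task data (visual reaction times, in ms) for one listener,
# measured alone (single task) and while recognizing speech (dual task) at two
# levels of speech processing demand. Values are made up for illustration.
rt_single = np.array([412, 395, 430, 401, 418])      # baseline, no speech task
rt_dual_easy = np.array([455, 470, 448, 462, 475])   # speech in quiet
rt_dual_hard = np.array([520, 545, 560, 530, 555])   # speech in noise

def dual_task_cost(rt_dual, rt_single):
    """Proportional slowing of the secondary task relative to baseline.

    Larger values are interpreted as greater listening effort, i.e., fewer
    resources left over for the secondary task.
    """
    return (rt_dual.mean() - rt_single.mean()) / rt_single.mean()

print(f"cost, easy condition: {dual_task_cost(rt_dual_easy, rt_single):.2%}")
print(f"cost, hard condition: {dual_task_cost(rt_dual_hard, rt_single):.2%}")
```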
Hearing Aid Amplification Improves Postural Control for Older Adults With Hearing Loss When Other Sensory Cues Are Impoverished
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241232219
L Behtani, D Paromov, K Moïn-Darbari, M S Houde, B A Bacon, M Maheu, T Leroux, F Champoux
Recent studies suggest that sound amplification via hearing aids can improve postural control in adults with hearing impairments. Unfortunately, only a few studies used well-defined posturography measures to assess balance in adults with hearing loss with and without their hearing aids. Of these, only two examined postural control specifically in the elderly with hearing loss. The present study examined the impact of hearing aid use on postural control during various sensory perturbations in older adults with age-related hearing loss. Thirty individuals with age-related hearing impairments and using hearing aids bilaterally were tested. Participants were asked to perform a modified clinical sensory integration in balance test on a force platform with and without hearing aids. The experiment was conducted in the presence of a broadband noise ranging from 0.1 to 4 kHz presented through a loudspeaker. As expected, hearing aid use had a beneficial impact on postural control, but only when visual and somatosensory inputs were both reduced. Data also suggest that hearing aid use decreases the dependence on somatosensory input for maintaining postural control. This finding can be of particular importance in older adults considering the reduction of tactile and proprioceptive sensitivity and acuity often associated with aging. These results provide an additional argument for encouraging early hearing aid fitting for people with hearing loss.
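For context, the masker described above (broadband noise from 0.1 to 4 kHz) can be approximated with a few lines of signal processing; the sample rate, duration, and filter order below are assumptions for illustration rather than parameters reported by the authors.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 16000          # sample rate (Hz); assumed for this sketch
dur = 5.0           # duration (s); assumed
rng = np.random.default_rng(0)

white = rng.standard_normal(int(fs * dur))

# Band-pass 0.1-4 kHz, matching the passband reported in the abstract.
sos = butter(4, [100, 4000], btype="bandpass", fs=fs, output="sos")
noise = sosfiltfilt(sos, white)

# Normalize to unit RMS so the presentation level can be set by the playback system.
noise /= np.sqrt(np.mean(noise**2))
```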
Relationships Between Speech, Spatial and Qualities of Hearing Short Form SSQ12 Item Scores and their Use in Guiding Rehabilitation for Cochlear Implant Recipients
Pub Date: 2024-01-01 | DOI: 10.1177/23312165231224643
Dianne J Mecklenburg, Petra L Graham, Chris J James
Cochlear implantation successfully improves hearing in most adult recipients. However, in rare cases, post-implant rehabilitation is required to maximize benefit. The primary aim of this investigation was to test whether self-reports by cochlear implant users indicate the need for post-implant rehabilitation. Listening performance was assessed with the Speech, Spatial and Qualities short-form SSQ12, which was self-administered via a web-based survey. Subjects included over 2000 adult bilateral or unilateral cochlear implant users with at least one year of experience. A novel application of regression tree analysis identified core SSQ12 items that serve as first steps in establishing a plan for further rehabilitation: items 1, 8, and 11, dealing with single-talker situations, loudness perception, and clarity, respectively. Further regression and classification tree analyses revealed that SSQ12 item scores were weakly related to age, degree of tinnitus, and use of bilateral versus unilateral implants. Conversely, SSQ12 scores were strongly associated with self-rated satisfaction and confidence in using the cochlear implant. SSQ12 total scores did not vary significantly over 1-9 or more years' experience. These findings suggest that the SSQ12 may be a useful tool to guide rehabilitation at any time after cochlear implantation. Identification of poor performance may have implications for timely management to improve outcomes through techniques such as device-fitting adjustments, counseling, active sound exposure, and spatial hearing training.
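As a sketch of the type of analysis described, the code below fits a regression tree that predicts an overall outcome from individual SSQ12 item scores and lists the items carrying the most weight; the synthetic data and the use of scikit-learn are illustrative assumptions, not the authors' analysis pipeline.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

# Synthetic stand-in data: 500 respondents x 12 SSQ12 items scored 0-10.
items = rng.uniform(0, 10, size=(500, 12))
# Hypothetical outcome loosely driven by items 1, 8, and 11 (0-indexed: 0, 7, 10).
outcome = (0.5 * items[:, 0] + 0.3 * items[:, 7] + 0.2 * items[:, 10]
           + rng.normal(0, 1, 500))

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(items, outcome)

# Items with the largest importance correspond to the earliest/strongest splits.
ranked = np.argsort(tree.feature_importances_)[::-1]
for idx in ranked[:3]:
    print(f"SSQ12 item {idx + 1}: importance {tree.feature_importances_[idx]:.2f}")
```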
Amplitude Compression for Preventing Rollover at Above-Conversational Speech Levels
Pub Date: 2024-01-01 | DOI: 10.1177/23312165231224597
Michal Fereczkowski, Raul H Sanchez-Lopez, Stine Christiansen, Tobias Neher
Hearing aids provide nonlinear amplification to improve speech audibility and loudness perception. While more audibility typically increases speech intelligibility at low levels, the same is not true for above-conversational levels, where decreases in intelligibility ("rollover") can occur. In a previous study, we found rollover in speech intelligibility measurements made in quiet for 35 out of 74 test ears with a hearing loss. Furthermore, we found rollover occurrence in quiet to be associated with poorer speech intelligibility in noise as measured with linear amplification. Here, we retested 16 participants with rollover with three amplitude-compression settings. Two were designed to prevent rollover by applying slow- or fast-acting compression with a 5:1 compression ratio around the "sweet spot," that is, the area in an individual performance-intensity function with high intelligibility and listening comfort. The third, reference setting used gains and compression ratios prescribed by the "National Acoustic Laboratories Non-Linear 1" rule. Speech intelligibility was assessed in quiet and in noise. Pairwise preference judgments were also collected. For speech levels of 70 dB SPL and above, slow-acting sweet-spot compression gave better intelligibility in quiet and noise than the reference setting. Additionally, the participants clearly preferred slow-acting sweet-spot compression over the other settings. At lower levels, the three settings gave comparable speech intelligibility, and the participants preferred the reference setting over both sweet-spot settings. Overall, these results suggest that, for listeners with rollover, slow-acting sweet-spot compression is beneficial at 70 dB SPL and above, while at lower levels clinically established gain targets are more suited.
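To make the compression settings concrete, here is a minimal sketch of a broadband compressor with a 5:1 ratio above a knee point and configurable attack/release times; the knee level, time constants, and dB-re-full-scale convention are illustrative assumptions, not the fitted sweet-spot parameters from the study.

```python
import numpy as np

def compress(signal, fs, knee_dbfs=-25.0, ratio=5.0, attack_ms=5.0, release_ms=300.0):
    """Minimal broadband dynamic range compressor (sketch).

    Levels above `knee_dbfs` (dB re full scale; the mapping to dB SPL depends on
    calibration and is not modeled here) are compressed with the given ratio.
    Short attack/release times approximate fast-acting compression, long ones
    slow-acting compression.
    """
    eps = 1e-12
    # One-pole envelope follower with separate attack and release coefficients.
    a_att = np.exp(-1.0 / (fs * attack_ms / 1000.0))
    a_rel = np.exp(-1.0 / (fs * release_ms / 1000.0))
    env = np.zeros_like(signal, dtype=float)
    level = 0.0
    for n, x in enumerate(np.abs(signal)):
        coeff = a_att if x > level else a_rel
        level = coeff * level + (1.0 - coeff) * x
        env[n] = level
    level_db = 20.0 * np.log10(env + eps)
    # Above the knee, output level grows by only 1 dB per `ratio` dB of input level.
    over = np.maximum(level_db - knee_dbfs, 0.0)
    gain_db = -over * (1.0 - 1.0 / ratio)
    return signal * 10.0 ** (gain_db / 20.0)
```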
Perceptual Consequences of Cochlear Deafferentation in Humans
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241239541
Naomi F Bramhall, Garnett P McMillan
Cochlear synaptopathy, a form of cochlear deafferentation, has been demonstrated in a number of animal species, including non-human primates. Both age and noise exposure contribute to synaptopathy in animal models, indicating that it may be a common type of auditory dysfunction in humans. Temporal bone and auditory physiological data suggest that age and occupational/military noise exposure also lead to synaptopathy in humans. The predicted perceptual consequences of synaptopathy include tinnitus, hyperacusis, and difficulty with speech-in-noise perception. However, confirming the perceptual impacts of this form of cochlear deafferentation presents a particular challenge because synaptopathy can only be confirmed through post-mortem temporal bone analysis and auditory perception is difficult to evaluate in animals. Animal data suggest that deafferentation leads to increased central gain, signs of tinnitus and abnormal loudness perception, and deficits in temporal processing and signal-in-noise detection. If equivalent changes occur in humans following deafferentation, this would be expected to increase the likelihood of developing tinnitus, hyperacusis, and difficulty with speech-in-noise perception. Physiological data from humans are consistent with the hypothesis that deafferentation is associated with increased central gain and a greater likelihood of tinnitus perception, while human data on the relationship between deafferentation and hyperacusis are extremely limited. Many human studies have investigated the relationship between physiological correlates of deafferentation and difficulty with speech-in-noise perception, with mixed findings. A non-linear relationship between deafferentation and speech perception may have contributed to the mixed results. When differences in sample characteristics and study measurements are considered, the findings may be more consistent.
Is Recognition of Speech in Noise Related to Memory Disruption Caused by Irrelevant Sound?
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241262517
Daniel Oberfeld, Katharina Staab, Florian Kattner, Wolfgang Ellermeier
Listeners with normal audiometric thresholds show substantial variability in their ability to understand speech in noise (SiN). These individual differences have been reported to be associated with a range of auditory and cognitive abilities. The present study addresses the association between SiN processing and the individual susceptibility of short-term memory to auditory distraction (i.e., the irrelevant sound effect [ISE]). In a sample of 67 young adult participants with normal audiometric thresholds, we measured speech recognition performance in a spatial listening task with two interfering talkers (speech-in-speech identification), audiometric thresholds, binaural sensitivity to the temporal fine structure (interaural phase differences [IPD]), serial memory with and without interfering talkers, and self-reported noise sensitivity. Speech-in-speech processing was not significantly associated with the ISE. The most important predictors of high speech-in-speech recognition performance were a large short-term memory span, low IPD thresholds, bilaterally symmetrical audiometric thresholds, and low individual noise sensitivity. Surprisingly, the susceptibility of short-term memory to irrelevant sound accounted for a substantially smaller amount of variance in speech-in-speech processing than the nondisrupted short-term memory capacity. The data confirm the role of binaural sensitivity to the temporal fine structure, although its association to SiN recognition was weaker than in some previous studies. The inverse association between self-reported noise sensitivity and SiN processing deserves further investigation.
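A rough sketch of the kind of predictor analysis summarized above: an ordinary least-squares fit relating a speech-in-speech recognition score to the candidate predictors; the synthetic data, variable names, and coefficient signs are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 67  # sample size matching the study; the data below are synthetic

memory_span = rng.normal(6.0, 1.0, n)          # items recalled (short-term memory)
ipd_threshold = rng.normal(30.0, 10.0, n)      # degrees; lower is better
threshold_asymmetry = rng.normal(5.0, 3.0, n)  # dB interaural threshold difference
noise_sensitivity = rng.normal(3.0, 1.0, n)    # questionnaire score
# Hypothetical speech reception threshold (lower = better recognition).
srt = (-1.5 * memory_span + 0.08 * ipd_threshold
       + 0.2 * threshold_asymmetry + 0.5 * noise_sensitivity
       + rng.normal(0, 1.0, n))

X = np.column_stack([np.ones(n), memory_span, ipd_threshold,
                     threshold_asymmetry, noise_sensitivity])
coefs, *_ = np.linalg.lstsq(X, srt, rcond=None)
for name, b in zip(["intercept", "memory span", "IPD threshold",
                    "threshold asymmetry", "noise sensitivity"], coefs):
    print(f"{name}: {b:+.2f}")
```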
A Phoneme-Scale Assessment of Multichannel Speech Enhancement Algorithms
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241292205
Nasser-Eddine Monir, Paul Magron, Romain Serizel
In the intricate acoustic landscapes where speech intelligibility is challenged by noise and reverberation, multichannel speech enhancement emerges as a promising solution for individuals with hearing loss. Such algorithms are commonly evaluated at the utterance scale. However, this approach overlooks the granular acoustic nuances revealed by phoneme-specific analysis, potentially obscuring key insights into their performance. This paper presents an in-depth phoneme-scale evaluation of three state-of-the-art multichannel speech enhancement algorithms: the filter-and-sum network, the minimum variance distortionless response, and Tango. These algorithms are evaluated extensively across different noise conditions and spatial setups, employing realistic acoustic simulations with measured room impulse responses and leveraging the diversity offered by multiple microphones in a binaural hearing setup. The study emphasizes fine-grained phoneme-scale analysis, revealing that while some phonemes, like plosives, are heavily impacted by environmental acoustics and difficult for the algorithms to handle, others, like nasals and sibilants, see substantial improvements after enhancement. These investigations demonstrate important improvements in phoneme clarity in noisy conditions, with insights that could drive the development of more personalized and phoneme-aware hearing aid technologies. Additionally, while this study provides extensive data on the physical metrics of processed speech, these physical metrics do not necessarily reflect human perception of speech, and the impact of the findings presented here would have to be investigated through listening tests.
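As a sketch of what a phoneme-scale evaluation can look like, the code below averages a frame-level quality metric within phoneme boundaries taken from a forced alignment; the metric, frame rate, and alignment format are illustrative assumptions, not the evaluation protocol used in the paper.

```python
import numpy as np
from collections import defaultdict

# Hypothetical forced-alignment segments: (phoneme label, start s, end s).
alignment = [("p", 0.00, 0.08), ("aa", 0.08, 0.21), ("n", 0.21, 0.30), ("s", 0.30, 0.42)]

frame_rate = 100.0  # frames per second of the metric track (assumed)
# Hypothetical frame-level improvement scores (e.g., per-frame SNR gain in dB).
frame_scores = np.random.default_rng(3).normal(3.0, 1.5, size=50)

per_phoneme = defaultdict(list)
for label, start, end in alignment:
    i0, i1 = int(start * frame_rate), int(end * frame_rate)
    per_phoneme[label].append(frame_scores[i0:i1].mean())

for label, scores in per_phoneme.items():
    print(f"{label}: mean improvement {np.mean(scores):.2f} dB")
```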
Factors Influencing Stream Segregation Based on Interaural Phase Difference Cues
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241293787
Nicholas R Haywood, David McAlpine, Deborah Vickers, Brian Roberts
Interaural time differences are often considered a weak cue for stream segregation. We investigated this claim with headphone-presented pure tones differing in a related form of interaural configuration, interaural phase differences (ΔIPD), and/or in frequency (ΔF). In experiment 1, sequences comprised 5 × ABA- repetitions (A and B = 80-ms tones, "-" = 160-ms silence), and listeners reported whether integration or segregation was heard. Envelope shape was varied but remained constant across all tones within a trial. Envelopes were either quasi-trapezoidal or had a fast attack and slow release (FA-SR) or vice versa (SA-FR). The FA-SR envelope caused more segregation than SA-FR in a task where only ΔIPD cues were present, but not in a corresponding ΔF-only task. In experiment 2, the interstimulus interval (ISI) between FA-SR tones was varied (0-60 ms). ΔF-based segregation decreased with increasing ISI, whereas ΔIPD-based segregation increased. This suggests that binaural temporal integration may limit segregation at short ISIs. In another task, ΔF and ΔIPD cues were presented alone or in combination. Here, ΔIPD-based segregation was greatly reduced, suggesting that ΔIPD-based segregation is highly sensitive to experimental context. Experiments 1 and 2 demonstrate that ΔIPD can promote segregation in optimized stimuli/tasks. Experiment 3 employed a task requiring integration for good performance. Listeners detected a delay on the final four B tones of an 8 × ABA- sequence. Although performance worsened with increasing ΔF, increasing ΔIPD had only a marginal impact. This suggests that, even in stimuli optimized for ΔIPD-based segregation, listeners remained mostly able to disregard ΔIPD when segregation was detrimental to performance.
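The stimulus structure is concrete enough to sketch in code: the snippet below builds one ABA- cycle from 80-ms tones and a 160-ms silence and repeats it five times, applying a ΔIPD to the B tones as a simple phase offset in one ear; the sample rate, tone frequencies, ramp shape, and phase value are illustrative assumptions.

```python
import numpy as np

fs = 44100                  # sample rate (Hz); assumed
tone_dur, gap_dur = 0.080, 0.160
f_a, f_b = 500.0, 600.0     # A and B tone frequencies (Hz); assumed, ΔF = 100 Hz
ipd = np.pi / 2             # interaural phase difference on the B tones (assumed)

def tone(freq, phase=0.0, ramp_ms=10.0):
    t = np.arange(int(fs * tone_dur)) / fs
    x = np.sin(2 * np.pi * freq * t + phase)
    n_ramp = int(fs * ramp_ms / 1000.0)
    ramp = 0.5 * (1 - np.cos(np.linspace(0, np.pi, n_ramp)))  # raised-cosine on/off
    x[:n_ramp] *= ramp
    x[-n_ramp:] *= ramp[::-1]
    return x

silence = np.zeros(int(fs * gap_dur))
# Left ear: A B A -; right ear: A (B with phase offset) A -  -> ΔIPD on B tones only.
cycle_left = np.concatenate([tone(f_a), tone(f_b), tone(f_a), silence])
cycle_right = np.concatenate([tone(f_a), tone(f_b, phase=ipd), tone(f_a), silence])
stimulus = np.stack([np.tile(cycle_left, 5), np.tile(cycle_right, 5)], axis=1)  # (N, 2)
```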
Performance and Reliability Evaluation of an Automated Bone-Conduction Audiometry Using Machine Learning
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241286456
Nicolas Wallaert, Antoine Perry, Hadrien Jean, Gwenaelle Creff, Benoit Godey, Nihaad Paraouty
To date, pure-tone audiometry remains the gold standard for clinical auditory testing. However, pure-tone audiometry is time-consuming and only provides a discrete estimate of hearing acuity. Here, we aim to address these two main drawbacks by developing a machine learning (ML)-based approach for fully automated bone-conduction (BC) audiometry tests with forehead vibrator placement. Study 1 examines the occlusion effects when the headphones are positioned on both ears during BC forehead testing. Study 2 describes the ML-based approach for BC audiometry, with automated contralateral masking rules, compensation for occlusion effects, and forehead-mastoid corrections. Next, the performance of ML-audiometry is examined in comparison to manual conventional BC audiometry with mastoid placement. Finally, Study 3 examines the test-retest reliability of ML-audiometry. Our results show no significant performance difference between automated ML-audiometry and manual conventional audiometry. High test-retest reliability is achieved with the automated ML-audiometry. Together, our findings demonstrate the performance and reliability of automated ML-based BC audiometry for both normal-hearing and hearing-impaired adult listeners with mild to severe hearing losses.
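A minimal sketch of the threshold bookkeeping such a procedure involves: applying forehead-to-mastoid and occlusion-effect corrections and flagging frequencies where contralateral masking is indicated; every correction value and the masking margin below are placeholder assumptions, not the study's calibrated rules.

```python
# Placeholder per-frequency corrections in dB. Signs and magnitudes depend on
# transducer calibration and coupling; they are NOT values taken from the study.
FOREHEAD_TO_MASTOID_DB = {500: -12, 1000: -9, 2000: -8, 4000: -6}
OCCLUSION_COMPENSATION_DB = {500: 10, 1000: 5, 2000: 0, 4000: 0}

def corrected_bc_threshold(raw_db, freq_hz, ears_covered=True):
    """Apply placeholder forehead-to-mastoid and occlusion corrections (sketch)."""
    corrected = raw_db + FOREHEAD_TO_MASTOID_DB[freq_hz]
    if ears_covered:  # headphones on both ears during forehead testing
        corrected += OCCLUSION_COMPENSATION_DB[freq_hz]
    return corrected

def needs_contralateral_masking(ac_test_ear_db, bc_db, margin_db=10):
    """Flag an air-bone gap large enough that the non-test ear could respond.

    Interaural attenuation for bone conduction is close to 0 dB, so masking is
    commonly indicated when the unmasked BC threshold is better than the test
    ear's air-conduction threshold by at least `margin_db`.
    """
    return (ac_test_ear_db - bc_db) >= margin_db

print(corrected_bc_threshold(35, 500))                               # -> 33 with these placeholders
print(needs_contralateral_masking(ac_test_ear_db=45, bc_db=20))      # -> True
```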
ADT Network: A Novel Nonlinear Method for Decoding Speech Envelopes From EEG Signals
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241282872
Ruixiang Liu, Chang Liu, Dan Cui, Huan Zhang, Xinmeng Xu, Yuxin Duan, Yihu Chao, Xianzheng Sha, Limin Sun, Xiulan Ma, Shuo Li, Shijie Chang
Decoding speech envelopes from electroencephalogram (EEG) signals holds potential as a research tool for objectively assessing auditory processing, which could contribute to future developments in hearing loss diagnosis. However, current methods struggle to achieve both high accuracy and interpretability. We propose a deep learning model called the auditory decoding transformer (ADT) network for speech envelope reconstruction from EEG signals to address these issues. The ADT network uses spatio-temporal convolution for feature extraction, followed by a transformer decoder to decode the speech envelopes. Through anticausal masking, the ADT considers only the current and future EEG features to match the natural relationship of speech and EEG. Performance evaluation shows that the ADT network achieves average reconstruction scores of 0.168 and 0.167 on the SparrKULee and DTU datasets, respectively, rivaling those of other nonlinear models. Furthermore, by visualizing the weights of the spatio-temporal convolution layer as time-domain filters and brain topographies, combined with an ablation study of the temporal convolution kernels, we analyze the behavioral patterns of the ADT network in decoding speech envelopes. The results indicate that low- (0.5-8 Hz) and high-frequency (14-32 Hz) EEG signals are more critical for envelope reconstruction and that the active brain regions are primarily distributed bilaterally in the auditory cortex, consistent with previous research. Visualization of attention scores further validated previous research. In summary, the ADT network balances high performance and interpretability, making it a promising tool for studying neural speech envelope tracking.
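The distinctive element here is the anticausal mask, which restricts each output step to current and future EEG frames. Below is a minimal PyTorch sketch of such a mask applied in scaled dot-product attention; the tensor shapes and the single attention call stand in for the full ADT architecture and are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

T, d = 128, 64                      # time steps and feature dimension (assumed)
q = torch.randn(1, 1, T, d)         # queries from the decoder (batch, heads, T, d)
k = torch.randn(1, 1, T, d)         # keys from the EEG feature encoder
v = torch.randn(1, 1, T, d)         # values from the EEG feature encoder

# Anticausal mask: position i may attend to positions j >= i (current and future
# frames only), the mirror image of the usual causal mask.
anticausal = torch.triu(torch.ones(T, T, dtype=torch.bool))

out = F.scaled_dot_product_attention(q, k, v, attn_mask=anticausal)
print(out.shape)  # torch.Size([1, 1, 128, 64])
```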