This study aimed to investigate the role of hearing aid (HA) usage in language outcomes among preschool children aged 3-5 years with mild bilateral hearing loss (MBHL). The data were retrieved from a total of 52 children with MBHL and 30 children with normal hearing (NH). The associations between demographic and audiological factors and language outcomes were examined. Analyses of variance were conducted to compare the language abilities of HA users, non-HA users, and their NH peers. Furthermore, regression analyses were performed to identify significant predictors of language outcomes. Aided better-ear pure-tone average (BEPTA) was significantly correlated with language comprehension scores. Among children with MBHL, those who used HAs outperformed those who did not across all linguistic domains. The language skills of children with MBHL were comparable to those of their peers with NH. The degree of improvement in audibility, in terms of aided BEPTA, was a significant predictor of language comprehension. It is noteworthy that 50% of the parents expressed reluctance regarding HA use for their children with MBHL. The findings highlight the positive impact of HA usage on language development in this population. Professionals may therefore consider HAs as a viable treatment option for children with MBHL, especially when there is a potential risk of language delay due to hearing loss. It was also observed that 25% of the children with MBHL had late-onset hearing loss. Consequently, the implementation of preschool screening or a listening performance checklist is recommended to facilitate early detection.
{"title":"Impact of Hearing Aids on Language Outcomes in Preschool Children With Mild Bilateral Hearing Loss.","authors":"Yu-Chen Hung, Pei-Hsuan Ho, Pei-Hua Chen, Yi-Shin Tsai, Yi-Jui Li, Hung-Ching Lin","doi":"10.1177/23312165241256721","DOIUrl":"10.1177/23312165241256721","url":null,"abstract":"<p><p>This study aimed to investigate the role of hearing aid (HA) usage in language outcomes among preschool children aged 3-5 years with mild bilateral hearing loss (MBHL). The data were retrieved from a total of 52 children with MBHL and 30 children with normal hearing (NH). The association between demographical, audiological factors and language outcomes was examined. Analyses of variance were conducted to compare the language abilities of HA users, non-HA users, and their NH peers. Furthermore, regression analyses were performed to identify significant predictors of language outcomes. Aided better ear pure-tone average (BEPTA) was significantly correlated with language comprehension scores. Among children with MBHL, those who used HA outperformed the ones who did not use HA across all linguistic domains. The language skills of children with MBHL were comparable to those of their peers with NH. The degree of improvement in audibility in terms of aided BEPTA was a significant predictor of language comprehension. It is noteworthy that 50% of the parents expressed reluctance regarding HA use for their children with MBHL. The findings highlight the positive impact of HA usage on language development in this population. Professionals may therefore consider HAs as a viable treatment option for children with MBHL, especially when there is a potential risk of language delay due to hearing loss. It was observed that 25% of the children with MBHL had late-onset hearing loss. Consequently, the implementation of preschool screening or a listening performance checklist is recommended to facilitate early detection.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"28 ","pages":"23312165241256721"},"PeriodicalIF":2.7,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11113073/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141076740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Easy as 1-2-3: Development and Evaluation of a Simple yet Valid Audiogram-Classification System.
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241260041
Larry E Humes, David A Zapala
Almost since the inception of the modern-day electroacoustic audiometer a century ago, the results of pure-tone audiometry have been characterized by an audiogram. For almost as many years, clinicians and researchers have sought ways to distill the volume and complexity of information on the audiogram. Commonly used approaches have made use of pure-tone averages (PTAs) for various frequency ranges, with the PTA for 500, 1000, 2000, and 4000 Hz (PTA4) being the most widely used for the categorization of hearing loss severity. Here, a three-digit triad is proposed as a single-number summary of not only the severity, but also the configuration and bilateral symmetry of the hearing loss. Each digit in the triad ranges from 0 to 9, increasing as the pure-tone hearing threshold level (HTL) increases from a range of optimal hearing (<10 dB Hearing Level; HL) to complete hearing loss (≥90 dB HL). Each digit also represents a different frequency region of the audiogram, proceeding from left to right as: (Low, L) PTA for 500, 1000, and 2000 Hz; (Center, C) PTA for 3000, 4000, and 6000 Hz; and (High, H) HTL at 8000 Hz. This LCH Triad audiogram-classification system is evaluated using a large United States (U.S.) national dataset (N = 8,795) of adults 20 to 80+ years of age and two large clinical datasets totaling 8,254 adults covering a similar age range. Its ability to capture variations in hearing function was found to be superior to that of the widely used PTA4.
{"title":"Easy as 1-2-3: Development and Evaluation of a Simple yet Valid Audiogram-Classification System.","authors":"Larry E Humes, David A Zapala","doi":"10.1177/23312165241260041","DOIUrl":"10.1177/23312165241260041","url":null,"abstract":"<p><p>Almost since the inception of the modern-day electroacoustic audiometer a century ago the results of pure-tone audiometry have been characterized by an audiogram. For almost as many years, clinicians and researchers have sought ways to distill the volume and complexity of information on the audiogram. Commonly used approaches have made use of pure-tone averages (PTAs) for various frequency ranges with the PTA for 500, 1000, 2000 and 4000 Hz (PTA4) being the most widely used for the categorization of hearing loss severity. Here, a three-digit triad is proposed as a single-number summary of not only the severity, but also the configuration and bilateral symmetry of the hearing loss. Each digit in the triad ranges from 0 to 9, increasing as the level of the pure-tone hearing threshold level (HTL) increases from a range of optimal hearing (< 10 dB Hearing Level; HL) to complete hearing loss (≥ 90 dB HL). Each digit also represents a different frequency region of the audiogram proceeding from left to right as: (Low, L) PTA for 500, 1000, and 2000 Hz; (Center, C) PTA for 3000, 4000 and 6000 Hz; and (High, H) HTL at 8000 Hz. This LCH Triad audiogram-classification system is evaluated using a large United States (U.S.) national dataset (N = 8,795) from adults 20 to 80 + years of age and two large clinical datasets totaling 8,254 adults covering a similar age range. Its ability to capture variations in hearing function was found to be superior to that of the widely used PTA4.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"28 ","pages":"23312165241260041"},"PeriodicalIF":2.7,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11179497/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141318660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Right-Ear Advantage in Static and Dynamic Cocktail-Party Situations.
Pub Date: 2024-01-01 | DOI: 10.1177/23312165231215916
Moritz Wächtler, Pascale Sandmann, Hartmut Meister
When presenting two competing speech stimuli, one to each ear, a right-ear advantage (REA) can often be observed, reflected in better speech recognition compared to the left ear. Considering the left-hemispheric dominance for language, the REA has been explained by superior contralateral pathways (structural models) and language-induced shifts of attention to the right (attentional models). There is some evidence that the REA becomes more pronounced as cognitive load increases. Hence, it is interesting to investigate the REA in static (constant target talker) and dynamic (target changing pseudo-randomly) cocktail-party situations, as the latter is associated with a higher cognitive load than the former. Furthermore, previous research suggests an increasing REA when listening becomes more perceptually challenging. The present study examined the REA by using virtual acoustics to simulate static and dynamic cocktail-party situations, with three spatially separated talkers uttering concurrent matrix sentences. Sentences were presented at low sound pressure levels or processed with a noise vocoder to increase perceptual load. Sixteen young normal-hearing adults participated in the study. The REA was assessed by means of word recognition scores and a detailed error analysis. Word recognition revealed a greater REA for the dynamic than for the static situations, compatible with the view that an increase in cognitive load results in a heightened REA. The REA also depended on the type of perceptual load, as indicated by a higher REA for vocoded than for low-level stimuli. The results of the error analysis support both structural and attentional models of the REA.
{"title":"The Right-Ear Advantage in Static and Dynamic Cocktail-Party Situations.","authors":"Moritz Wächtler, Pascale Sandmann, Hartmut Meister","doi":"10.1177/23312165231215916","DOIUrl":"10.1177/23312165231215916","url":null,"abstract":"<p><p>When presenting two competing speech stimuli, one to each ear, a right-ear advantage (REA) can often be observed, reflected in better speech recognition compared to the left ear. Considering the left-hemispheric dominance for language, the REA has been explained by superior contralateral pathways (structural models) and language-induced shifts of attention to the right (attentional models). There is some evidence that the REA becomes more pronounced, as cognitive load increases. Hence, it is interesting to investigate the REA in static (constant target talker) and dynamic (target changing pseudo-randomly) cocktail-party situations, as the latter is associated with a higher cognitive load than the former. Furthermore, previous research suggests an increasing REA, when listening becomes more perceptually challenging. The present study examined the REA by using virtual acoustics to simulate static and dynamic cocktail-party situations, with three spatially separated talkers uttering concurrent matrix sentences. Sentences were presented at low sound pressure levels or processed with a noise vocoder to increase perceptual load. Sixteen young normal-hearing adults participated in the study. The REA was assessed by means of word recognition scores and a detailed error analysis. Word recognition revealed a greater REA for the dynamic than for the static situations, compatible with the view that an increase in cognitive load results in a heightened REA. Also, the REA depended on the type of perceptual load, as indicated by a higher REA associated with vocoded compared to low-level stimuli. The results of the error analysis support both structural and attentional models of the REA.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"28 ","pages":"23312165231215916"},"PeriodicalIF":2.6,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10826403/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139570355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Head and Eye Movements Reveal Compensatory Strategies for Acute Binaural Deficits During Sound Localization.
Pub Date: 2024-01-01 | DOI: 10.1177/23312165231217910
Robel Z Alemu, Blake C Papsin, Robert V Harrison, Al Blakeman, Karen A Gordon
The present study aimed to define the use of head and eye movements during sound localization in children and adults in order to: (1) assess effects of stationary versus moving sound and (2) define effects of binaural cues degraded through acute monaural ear plugging. Thirty-three youth (mean age = 12.9 years) and seventeen adults (mean age = 24.6 years) with typical hearing were recruited and asked to localize white noise anywhere within a horizontal arc from -60° (left) to +60° (right) azimuth in two conditions (typical binaural and right ear plugged). In each trial, sound was presented at an initial stationary position (L1) and then while moving at ∼4°/s until reaching a second position (L2). Sound moved in five conditions (±40°, ±20°, or 0°). Participants adjusted a laser pointer to indicate the L1 and L2 positions. Unrestricted head and eye movements were collected with gyroscopic sensors on the head and eye-tracking glasses, respectively. Results confirmed that accurate localization of both stationary and moving sound is disrupted by acute monaural ear plugging. Eye movements preceded head movements for sound localization in normal binaural listening, and head movements were larger than eye movements during monaural plugging. Head movements favored the unplugged left ear when stationary sounds were presented in the right hemifield and during sound motion in both hemifields, regardless of the movement direction. Disrupted binaural cues have greater effects on the localization of moving than of stationary sound. Head movements reveal preferential use of the better-hearing ear, and relatively stable eye positions likely reflect normal vestibulo-ocular reflexes.
{"title":"Head and Eye Movements Reveal Compensatory Strategies for Acute Binaural Deficits During Sound Localization.","authors":"Robel Z Alemu, Blake C Papsin, Robert V Harrison, Al Blakeman, Karen A Gordon","doi":"10.1177/23312165231217910","DOIUrl":"10.1177/23312165231217910","url":null,"abstract":"<p><p>The present study aimed to define use of head and eye movements during sound localization in children and adults to: (1) assess effects of stationary versus moving sound and (2) define effects of binaural cues degraded through acute monaural ear plugging. Thirty-three youth (<i>M</i><sub>Age </sub>= 12.9 years) and seventeen adults (<i>M</i><sub>Age </sub>= 24.6 years) with typical hearing were recruited and asked to localize white noise anywhere within a horizontal arc from -60° (left) to +60° (right) azimuth in two conditions (typical binaural and right ear plugged). In each trial, sound was presented at an initial stationary position (L1) and then while moving at ∼4°/s until reaching a second position (L2). Sound moved in five conditions (±40°, ±20°, or 0°). Participants adjusted a laser pointer to indicate L1 and L2 positions. Unrestricted head and eye movements were collected with gyroscopic sensors on the head and eye-tracking glasses, respectively. Results confirmed that accurate sound localization of both stationary and moving sound is disrupted by acute monaural ear plugging. Eye movements preceded head movements for sound localization in normal binaural listening and head movements were larger than eye movements during monaural plugging. Head movements favored the unplugged left ear when stationary sounds were presented in the right hemifield and during sound motion in both hemifields regardless of the movement direction. Disrupted binaural cues have greater effects on localization of moving than stationary sound. Head movements reveal preferential use of the better-hearing ear and relatively stable eye positions likely reflect normal vestibular-ocular reflexes.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"28 ","pages":"23312165231217910"},"PeriodicalIF":2.7,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10832417/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139651917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Combining Cardiovascular and Pupil Features Using k-Nearest Neighbor Classifiers to Assess Task Demand, Social Context, and Sentence Accuracy During Listening.
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241232551
Bethany Plain, Hidde Pielage, Sophia E Kramer, Michael Richter, Gabrielle H Saunders, Niek J Versfeld, Adriana A Zekveld, Tanveer A Bhuiyan
In daily life, both acoustic factors and social context can affect listening effort investment. In laboratory settings, information about listening effort has been deduced from pupil and cardiovascular responses independently. The extent to which these measures can jointly predict listening-related factors is unknown. Here we combined pupil and cardiovascular features to predict acoustic and contextual aspects of speech perception. Data were collected from 29 adults with hearing loss (mean age = 64.6 years, SD = 9.2). Participants performed a speech perception task at two individualized signal-to-noise ratios (corresponding to 50% and 80% of sentences correct) and in two social contexts (the presence and absence of two observers). Seven features were extracted per trial: baseline pupil size, peak pupil dilation, mean pupil dilation, interbeat interval, blood volume pulse amplitude, pre-ejection period, and pulse arrival time. These features were used to train k-nearest neighbor classifiers to predict task demand, social context, and sentence accuracy. K-fold cross-validation on the group-level data revealed above-chance classification accuracies: task demand, 64.4%; social context, 78.3%; and sentence accuracy, 55.1%. However, classification accuracies diminished when the classifiers were trained and tested on data from different participants. Individually trained classifiers (one per participant) performed better than group-level classifiers: 71.7% (SD = 10.2) for task demand, 88.0% (SD = 7.5) for social context, and 60.0% (SD = 13.1) for sentence accuracy. We demonstrated that classifiers trained on group-level physiological data to predict aspects of speech perception generalized poorly to novel participants. Individually calibrated classifiers hold more promise for future applications.
{"title":"Combining Cardiovascular and Pupil Features Using k-Nearest Neighbor Classifiers to Assess Task Demand, Social Context, and Sentence Accuracy During Listening.","authors":"Bethany Plain, Hidde Pielage, Sophia E Kramer, Michael Richter, Gabrielle H Saunders, Niek J Versfeld, Adriana A Zekveld, Tanveer A Bhuiyan","doi":"10.1177/23312165241232551","DOIUrl":"10.1177/23312165241232551","url":null,"abstract":"<p><p>In daily life, both acoustic factors and social context can affect listening effort investment. In laboratory settings, information about listening effort has been deduced from pupil and cardiovascular responses independently. The extent to which these measures can jointly predict listening-related factors is unknown. Here we combined pupil and cardiovascular features to predict acoustic and contextual aspects of speech perception. Data were collected from 29 adults (mean = 64.6 years, SD = 9.2) with hearing loss. Participants performed a speech perception task at two individualized signal-to-noise ratios (corresponding to 50% and 80% of sentences correct) and in two social contexts (the presence and absence of two observers). Seven features were extracted per trial: baseline pupil size, peak pupil dilation, mean pupil dilation, interbeat interval, blood volume pulse amplitude, pre-ejection period and pulse arrival time. These features were used to train k-nearest neighbor classifiers to predict task demand, social context and sentence accuracy. The k-fold cross validation on the group-level data revealed above-chance classification accuracies: task demand, 64.4%; social context, 78.3%; and sentence accuracy, 55.1%. However, classification accuracies diminished when the classifiers were trained and tested on data from different participants. Individually trained classifiers (one per participant) performed better than group-level classifiers: 71.7% (SD = 10.2) for task demand, 88.0% (SD = 7.5) for social context, and 60.0% (SD = 13.1) for sentence accuracy. We demonstrated that classifiers trained on group-level physiological data to predict aspects of speech perception generalized poorly to novel participants. Individually calibrated classifiers hold more promise for future applications.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"28 ","pages":"23312165241232551"},"PeriodicalIF":2.7,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10981225/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140319548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Editorial: Cochlear Implants and Music.
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241231685
Deborah A Vickers, Brian C J Moore
{"title":"Editorial: Cochlear Implants and Music.","authors":"Deborah A Vickers, Brian C J Moore","doi":"10.1177/23312165241231685","DOIUrl":"10.1177/23312165241231685","url":null,"abstract":"","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"28 ","pages":"23312165241231685"},"PeriodicalIF":2.7,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10874149/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139742320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Measurement of Speech Recognition, Reaction Time, and Speech Rate and Their Relation to Self-Reported Listening Effort for Normal-Hearing and Hearing-Impaired Listeners Using Various Maskers.
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241276435
Inga Holube, Stefan Taesler, Saskia Ibelings, Martin Hansen, Jasper Ooster
In speech audiometry, the speech-recognition threshold (SRT) is usually established by adjusting the signal-to-noise ratio (SNR) until 50% of the words or sentences are repeated correctly. However, these conditions are rarely encountered in everyday situations. Therefore, for a group of 15 young participants with normal hearing and a group of 12 older participants with hearing impairment, speech-recognition scores were determined at the SRT and at four higher SNRs using several stationary and fluctuating maskers. Participants' verbal responses were recorded, and participants were asked to self-report their listening effort on a categorical scale (self-reported listening effort, SR-LE). The responses were analyzed using an Automatic Speech Recognizer (ASR) and compared to the results of a human examiner. The agreement between the corresponding speech-recognition scores yielded an intraclass correlation coefficient of r = .993. As expected, speech-recognition scores increased with increasing SNR and decreased with increasing SR-LE. However, differences between speech-recognition scores for fluctuating and stationary maskers were observed as a function of SNR, but not as a function of SR-LE. The verbal response time (VRT) and the response speech rate (RSR) of the listeners' responses were measured using the ASR. The participants with hearing impairment showed significantly lower RSRs and higher VRTs compared to the participants with normal hearing. These differences may be attributed to differences in age, hearing, or both. With increasing SR-LE, VRT increased and RSR decreased. The results show the possibility of deriving a behavioral measure, VRT, directly from participants' verbal responses during speech audiometry, as a proxy for SR-LE.
{"title":"Automated Measurement of Speech Recognition, Reaction Time, and Speech Rate and Their Relation to Self-Reported Listening Effort for Normal-Hearing and Hearing-Impaired Listeners Using various Maskers.","authors":"Inga Holube, Stefan Taesler, Saskia Ibelings, Martin Hansen, Jasper Ooster","doi":"10.1177/23312165241276435","DOIUrl":"10.1177/23312165241276435","url":null,"abstract":"<p><p>In speech audiometry, the speech-recognition threshold (SRT) is usually established by adjusting the signal-to-noise ratio (SNR) until 50% of the words or sentences are repeated correctly. However, these conditions are rarely encountered in everyday situations. Therefore, for a group of 15 young participants with normal hearing and a group of 12 older participants with hearing impairment, speech-recognition scores were determined at SRT and at four higher SNRs using several stationary and fluctuating maskers. Participants' verbal responses were recorded, and participants were asked to self-report their listening effort on a categorical scale (self-reported listening effort, SR-LE). The responses were analyzed using an Automatic Speech Recognizer (ASR) and compared to the results of a human examiner. An intraclass correlation coefficient of <i>r </i>= .993 for the agreement between their corresponding speech-recognition scores was observed. As expected, speech-recognition scores increased with increasing SNR and decreased with increasing SR-LE. However, differences between speech-recognition scores for fluctuating and stationary maskers were observed as a function of SNR, but not as a function of SR-LE. The verbal response time (VRT) and the response speech rate (RSR) of the listeners' responses were measured using an ASR. The participants with hearing impairment showed significantly lower RSRs and higher VRTs compared to the participants with normal hearing. These differences may be attributed to differences in age, hearing, or both. With increasing SR-LE, VRT increased and RSR decreased. The results show the possibility of deriving a behavioral measure, VRT, measured directly from participants' verbal responses during speech audiometry, as a proxy for SR-LE.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"28 ","pages":"23312165241276435"},"PeriodicalIF":2.6,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11421406/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142299020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Speech Audiometry: Can It Work Using Open-Source Pre-Trained Kaldi-NL Automatic Speech Recognition?
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241229057
Gloria Araiza-Illan, Luke Meyer, Khiet P Truong, Deniz Başkent
The digits-in-noise (DIN) test is a practical speech audiometry tool for hearing screening of populations of varying ages and hearing status. The test is usually conducted by a human supervisor (e.g., a clinician), who scores the responses spoken by the listener, or online, where software scores the responses entered by the listener. The test presents 24 digit triplets in an adaptive staircase procedure, resulting in a speech reception threshold (SRT). We propose an alternative automated DIN test setup that can evaluate spoken responses while being conducted without a human supervisor, using the open-source automatic speech recognition toolkit Kaldi-NL. Thirty self-reported normal-hearing Dutch adults (19-64 years) completed one DIN + Kaldi-NL test. Their spoken responses were recorded and used for evaluating the transcript of decoded responses produced by Kaldi-NL. Study 1 evaluated the Kaldi-NL performance through its word error rate (WER), the percentage of summed decoding errors for the digits found in the transcript relative to the total number of digits present in the spoken responses. The average WER across participants was 5.0% (range 0-48%, SD = 8.8%), with decoding errors in, on average, three triplets per participant. Study 2 analyzed the effect that triplets with decoding errors from Kaldi-NL had on the DIN test output (SRT), using bootstrapping simulations. Previous research indicated 0.70 dB as the typical within-subject SRT variability for normal-hearing adults. Study 2 showed that up to four triplets with decoding errors produce SRT variations within this range, suggesting that our proposed setup could be feasible for clinical applications.
{"title":"Automated Speech Audiometry: Can It Work Using Open-Source Pre-Trained Kaldi-NL Automatic Speech Recognition?","authors":"Gloria Araiza-Illan, Luke Meyer, Khiet P Truong, Deniz Başkent","doi":"10.1177/23312165241229057","DOIUrl":"10.1177/23312165241229057","url":null,"abstract":"<p><p>A practical speech audiometry tool is the digits-in-noise (DIN) test for hearing screening of populations of varying ages and hearing status. The test is usually conducted by a human supervisor (e.g., clinician), who scores the responses spoken by the listener, or online, where software scores the responses entered by the listener. The test has 24-digit triplets presented in an adaptive staircase procedure, resulting in a speech reception threshold (SRT). We propose an alternative automated DIN test setup that can evaluate spoken responses whilst conducted without a human supervisor, using the open-source automatic speech recognition toolkit, Kaldi-NL. Thirty self-reported normal-hearing Dutch adults (19-64 years) completed one DIN + Kaldi-NL test. Their spoken responses were recorded and used for evaluating the transcript of decoded responses by Kaldi-NL. Study 1 evaluated the Kaldi-NL performance through its word error rate (WER), percentage of summed decoding errors regarding only digits found in the transcript compared to the total number of digits present in the spoken responses. Average WER across participants was 5.0% (range 0-48%, SD = 8.8%), with average decoding errors in three triplets per participant. Study 2 analyzed the effect that triplets with decoding errors from Kaldi-NL had on the DIN test output (SRT), using bootstrapping simulations. Previous research indicated 0.70 dB as the typical within-subject SRT variability for normal-hearing adults. Study 2 showed that up to four triplets with decoding errors produce SRT variations within this range, suggesting that our proposed setup could be feasible for clinical applications.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"28 ","pages":"23312165241229057"},"PeriodicalIF":2.7,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10943752/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140132882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Review of Binaural Processing With Asymmetrical Hearing Outcomes in Patients With Bilateral Cochlear Implants.
Pub Date: 2024-01-01 | DOI: 10.1177/23312165241229880
Sean R Anderson, Emily Burg, Lukas Suveg, Ruth Y Litovsky
Bilateral cochlear implants (BiCIs) result in several benefits, including improvements in speech understanding in noise and sound source localization. However, the benefit bilateral implants provide among recipients varies considerably across individuals. Here we consider one of the reasons for this variability: difference in hearing function between the two ears, that is, interaural asymmetry. Thus far, investigations of interaural asymmetry have been highly specialized within various research areas. The goal of this review is to integrate these studies in one place, motivating future research in the area of interaural asymmetry. We first consider bottom-up processing, where binaural cues are represented using excitation-inhibition of signals from the left ear and right ear, varying with the location of the sound in space, and represented by the lateral superior olive in the auditory brainstem. We then consider top-down processing via predictive coding, which assumes that perception stems from expectations based on context and prior sensory experience, represented by cascading series of cortical circuits. An internal, perceptual model is maintained and updated in light of incoming sensory input. Together, we hope that this amalgamation of physiological, behavioral, and modeling studies will help bridge gaps in the field of binaural hearing and promote a clearer understanding of the implications of interaural asymmetry for future research on optimal patient interventions.
{"title":"Review of Binaural Processing With Asymmetrical Hearing Outcomes in Patients With Bilateral Cochlear Implants.","authors":"Sean R Anderson, Emily Burg, Lukas Suveg, Ruth Y Litovsky","doi":"10.1177/23312165241229880","DOIUrl":"10.1177/23312165241229880","url":null,"abstract":"<p><p>Bilateral cochlear implants (BiCIs) result in several benefits, including improvements in speech understanding in noise and sound source localization. However, the benefit bilateral implants provide among recipients varies considerably across individuals. Here we consider one of the reasons for this variability: difference in hearing function between the two ears, that is, interaural asymmetry. Thus far, investigations of interaural asymmetry have been highly specialized within various research areas. The goal of this review is to integrate these studies in one place, motivating future research in the area of interaural asymmetry. We first consider bottom-up processing, where binaural cues are represented using excitation-inhibition of signals from the left ear and right ear, varying with the location of the sound in space, and represented by the lateral superior olive in the auditory brainstem. We then consider top-down processing via predictive coding, which assumes that perception stems from expectations based on context and prior sensory experience, represented by cascading series of cortical circuits. An internal, perceptual model is maintained and updated in light of incoming sensory input. Together, we hope that this amalgamation of physiological, behavioral, and modeling studies will help bridge gaps in the field of binaural hearing and promote a clearer understanding of the implications of interaural asymmetry for future research on optimal patient interventions.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"28 ","pages":"23312165241229880"},"PeriodicalIF":2.7,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10976506/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140307503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cochlear implant (CI) users, even with substantial speech comprehension, generally have poor sensitivity to pitch information (or fundamental frequency, F0). This insensitivity is often attributed to limited spectral and temporal resolution in the CI signals. However, pitch sensitivity varies markedly among individuals, and some users exhibit fairly good sensitivity. This indicates that the CI signal contains sufficient information about F0, and that users' sensitivity is predominantly limited by other physiological conditions such as neuroplasticity or neural health. We estimated the upper limit of F0 information that a CI signal can convey by decoding F0 from simulated CI signals (multi-channel pulsatile signals) with a deep neural network model (referred to as the CI model). We varied the number of electrode channels and the pulse rate, which should respectively affect the spectral and temporal resolution of stimulus representations. The F0-estimation performance generally improved with an increasing number of channels and a higher pulse rate. For sounds presented under quiet conditions, the model performance was at best comparable to that of a control waveform model, which received raw-waveform inputs. Under conditions in which background noise was imposed, the performance of the CI model generally degraded by a greater degree than that of the waveform model. The pulse rate had a particularly large effect on predicted performance. These observations indicate that the CI signal contains some information for predicting F0, which is sufficient in particular for targets presented under quiet conditions. The temporal resolution (represented as pulse rate) plays a critical role in pitch representation under noisy conditions.
{"title":"Estimating Pitch Information From Simulated Cochlear Implant Signals With Deep Neural Networks.","authors":"Takanori Ashihara, Shigeto Furukawa, Makio Kashino","doi":"10.1177/23312165241298606","DOIUrl":"https://doi.org/10.1177/23312165241298606","url":null,"abstract":"<p><p>Cochlear implant (CI) users, even with substantial speech comprehension, generally have poor sensitivity to pitch information (or fundamental frequency, F0). This insensitivity is often attributed to limited spectral and temporal resolution in the CI signals. However, the pitch sensitivity markedly varies among individuals, and some users exhibit fairly good sensitivity. This indicates that the CI signal contains sufficient information about F0, and users' sensitivity is predominantly limited by other physiological conditions such as neuroplasticity or neural health. We estimated the upper limit of F0 information that a CI signal can convey by decoding F0 from simulated CI signals (multi-channel pulsatile signals) with a deep neural network model (referred to as the CI model). We varied the number of electrode channels and the pulse rate, which should respectively affect spectral and temporal resolutions of stimulus representations. The F0-estimation performance generally improved with increasing number of channels and pulse rate. For the sounds presented under quiet conditions, the model performance was at best comparable to that of a control waveform model, which received raw-waveform inputs. Under conditions in which background noise was imposed, the performance of the CI model generally degraded by a greater degree than that of the waveform model. The pulse rate had a particularly large effect on predicted performance. These observations indicate that the CI signal contains some information for predicting F0, which is particularly sufficient for targets under quiet conditions. The temporal resolution (represented as pulse rate) plays a critical role in pitch representation under noisy conditions.</p>","PeriodicalId":48678,"journal":{"name":"Trends in Hearing","volume":"28 ","pages":"23312165241298606"},"PeriodicalIF":2.6,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142683025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}