Joshua S Stohl, Stephen R Dennison, Robert D Wolford, Blake S Wilson
For cochlear implant recipients with deeply inserted electrode arrays, pitch confusions are more prevalent among apical electrodes than among middle and basal electrodes. It was hypothesized that changing phase duration (PD) affects pitch evoked by electrodes in the apex but not in the middle or base of the cochlea. Participants ranked the pitch of stimuli that varied in PD but were presented via the same electrode. Target electrodes included an apical, middle, and basal electrode. Longer PDs led to lower pitch ranks but only among apical electrodes. Increasing the PDs for apical electrodes may help to alleviate pitch confusions.
{"title":"The effect of cochlear implant pulse phase duration on stimulus rankings according to pitch.","authors":"Joshua S Stohl, Stephen R Dennison, Robert D Wolford, Blake S Wilson","doi":"10.1121/10.0042321","DOIUrl":"https://doi.org/10.1121/10.0042321","url":null,"abstract":"<p><p>For cochlear implant recipients with deeply inserted electrode arrays, pitch confusions are more prevalent among apical electrodes than among middle and basal electrodes. It was hypothesized that changing phase duration (PD) affects pitch evoked by electrodes in the apex but not in the middle or base of the cochlea. Participants ranked the pitch of stimuli that varied in PD but were presented via the same electrode. Target electrodes included an apical, middle, and basal electrode. Longer PDs led to lower pitch ranks but only among apical electrodes. Increasing the PDs for apical electrodes may help to alleviate pitch confusions.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"6 2","pages":""},"PeriodicalIF":1.4,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146108652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Bayes factor discriminant is constructed for active sonar detection of a scattering body in an underwater refractive environment about which there is some depth uncertainty. The scenarios of interest here are associated with relatively high-frequency broadband waveforms and with reception along vertical arrays. The approach properly accounts for environmental information regarding the refractive media, as well as surface and volume reverberation models or in situ observations of the same. Uncertainty, both in the reverberation field and the scatterers' depth, is incorporated through proper marginalization rather than maximization as in more conventional generalized likelihood ratio tests. Bayes factor active sonar (BFAS) yields a set of time-varying quadratic forms in beam-delay space, optimally balancing uncertainty in the object of interest with reverberation and noise subspaces in the minimum average risk sense. By utilizing waveguide information, BFAS combines multi-path arrivals, optimally attenuating reverberation subspaces while preserving the target subspace, thereby effectively increasing signal-to-reverberation plus noise ratios despite uncertainty in target depth. Depth-invariant modes are leveraged to provide a valuable expansion of the discriminating information of the BFAS, thereby providing lower bounds on the performance of the BFAS. These bounds illustrate that even under depth uncertainty, the BFAS outperforms a single specular arrival detector with perfect knowledge of the scattering body's depth. Performance across various refractive and shallow-water environments is demonstrated, lending credence to the multi-path combining approach.
{"title":"A Bayes factor high-frequency broadband active sonar discriminant expansion via depth invariant modes.","authors":"Paul J Gendron, Kenneth T Bowers","doi":"10.1121/10.0042317","DOIUrl":"https://doi.org/10.1121/10.0042317","url":null,"abstract":"<p><p>A Bayes factor discriminant is constructed for active sonar detection of a scattering body in an underwater refractive environment about which there is some depth uncertainty. The scenarios of interest here are associated with relatively high-frequency broadband waveforms and with reception along vertical arrays. The approach properly accounts for environmental information regarding the refractive media, as well as surface and volume reverberation models or in situ observations of the same. Uncertainty, both in the reverberation field and the scatterers' depth, is incorporated through proper marginalization rather than maximization as in more conventional generalized likelihood ratio tests. Bayes factor active sonar (BFAS) yields a set of time-varying quadratic forms in beam-delay space, optimally balancing uncertainty in the object of interest with reverberation and noise subspaces in the minimum average risk sense. By utilizing waveguide information, BFAS combines multi-path arrivals, optimally attenuating reverberation subspaces while preserving the target subspace, thereby effectively increasing signal-to-reverberation plus noise ratios despite uncertainty in target depth. Depth-invariant modes are leveraged to provide a valuable expansion of the discriminating information of the BFAS, thereby providing lower bounds on the performance of the BFAS. These bounds illustrate that even under depth uncertainty, the BFAS outperforms a single specular arrival detector with perfect knowledge of the scattering body's depth. Performance across various refractive and shallow-water environments is demonstrated, lending credence to the multi-path combining approach.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"6 2","pages":""},"PeriodicalIF":1.4,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146127858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
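As a generic illustration of the marginalization-versus-maximization distinction drawn in the sonar abstract above, the two test statistics can be written side by side. The symbols are assumptions chosen for this sketch and are not the paper's notation: $x$ is the received data, $z$ the unknown scatterer depth with prior $\pi(z)$, and $H_1$, $H_0$ the target-present and target-absent hypotheses.

```latex
\[
  B(x) \;=\; \frac{\displaystyle\int p(x \mid H_1, z)\,\pi(z)\,\mathrm{d}z}{p(x \mid H_0)},
  \qquad
  \Lambda_{\mathrm{GLRT}}(x) \;=\; \frac{\displaystyle\max_{z}\, p(x \mid H_1, z)}{p(x \mid H_0)} .
\]
```

The Bayes factor averages the target-present likelihood over all plausible depths, whereas the generalized likelihood ratio test substitutes the single best-fitting depth. The abstract also describes marginalizing over reverberation uncertainty, which this simplified form omits.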
Freely available online emotional stimuli allow researchers to conduct emotion perception research without recording their own stimuli and to compare findings easily across studies; however, the acoustic properties of these stimuli, specifically the prosodic cues, are frequently unreported. Prosodic cues are important for a listener to distinguish among a talker's emotional tones. Thus, understanding how these cues differ within an emotional stimuli database allows emotion perception researchers to interpret findings with more nuance. This paper analyzes the prosodic cues (fundamental frequency, duration, and loudness) of the speech stimuli in The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS).
{"title":"Acoustic analyses of the RAVDESS corpus of emotional stimuli.","authors":"Devon P Major, Monita Chatterjee","doi":"10.1121/10.0042364","DOIUrl":"https://doi.org/10.1121/10.0042364","url":null,"abstract":"<p><p>Freely available online emotional stimuli allow researchers to conduct emotional perception research without needing to record their own and easily compare findings across studies; however, the acoustic properties, specifically the prosodic cues, are frequently unreported. Prosodic cues are important for a listener to contrast between the talker's emotional tone. Thus, understanding how these cues differ among an emotional stimuli database allows for a nuanced interpretation of findings for emotion perception researchers. This paper analyzes the prosodic cues (fundamental frequency, duration, and loudness) of the speech stimuli in The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS).</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"6 2","pages":""},"PeriodicalIF":1.4,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Clinical imaging resolution is often insufficient for accurate ultrasound simulation in personalised treatment planning. Naive image upsampling algorithms introduce artifacts that reduce simulation accuracy. We develop a Gaussian smoothing and mesh-based upsampling method and compare it against linear and nearest-neighbour interpolation, evaluating upsampled image fidelity and ultrasound simulation accuracy on an ex vivo high-resolution human cervical vertebra volume obtained with hierarchical phase-contrast tomography. The mesh-based method significantly and consistently reduced L2 and L∞ errors and reduced focal errors with inconsistent significance. Medical image upsampling for ultrasound simulation should be completed with edge-aware methods to improve accuracy.
{"title":"Medical image upsampling algorithm to reduce errors in grid-based ultrasound simulation.","authors":"Donny Liangpu Liu, Rui Xu","doi":"10.1121/10.0042350","DOIUrl":"https://doi.org/10.1121/10.0042350","url":null,"abstract":"<p><p>Clinical imaging resolution is often insufficient for accurate ultrasound simulation in personalised treatment planning. Naive image upsampling algorithms introduce artifacts that reduce simulation accuracy. We develop a Gaussian smoothing and mesh-based upsampling method and evaluate upsampled image fidelity and ultrasound simulation accuracy using an ex vivo high resolution human cervical vertebra volume obtained with hierarchical phase-contrast tomography, against linear and nearest neighbour interpolation. The mesh-based method significantly and consistently reduced L2 and L∞ errors and reduced focal errors with inconsistent significance. Medical image upsampling for ultrasound simulation should be completed with edge-aware methods to improve accuracy.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"6 2","pages":""},"PeriodicalIF":1.4,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146108687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
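The upsampling abstract above compares interpolation schemes using L2 and L∞ errors. A minimal sketch of how such a comparison can be set up is given below, assuming a hypothetical one-dimensional sound-speed profile and relative error definitions chosen for illustration. It covers only the two baseline schemes (nearest-neighbour and linear) and does not implement the paper's Gaussian smoothing and mesh-based method.

```python
import numpy as np


def upsample(coarse_x, coarse_v, fine_x, method):
    """Resample coarse samples onto a finer grid with a baseline scheme."""
    if method == "linear":
        return np.interp(fine_x, coarse_x, coarse_v)
    if method == "nearest":
        # Index of the closest coarse sample for every fine-grid point.
        idx = np.abs(fine_x[:, None] - coarse_x[None, :]).argmin(axis=1)
        return coarse_v[idx]
    raise ValueError(f"unknown method: {method}")


def l2_error(approx, ref):
    """Relative L2 error (illustrative definition)."""
    return np.linalg.norm(approx - ref) / np.linalg.norm(ref)


def linf_error(approx, ref):
    """Relative L-infinity error (illustrative definition)."""
    return np.max(np.abs(approx - ref)) / np.max(np.abs(ref))


# Hypothetical high-resolution reference: a smooth transition in sound
# speed (m/s) from a soft-tissue-like value to a bone-like value.
fine_x = np.linspace(0.0, 1.0, 401)
reference = 1500.0 + 500.0 * (1.0 + np.tanh((fine_x - 0.5) / 0.05))

# Simulated low-resolution acquisition: keep every 20th sample.
coarse_x = fine_x[::20]
coarse_v = reference[::20]

errors = {}
for method in ("nearest", "linear"):
    upsampled = upsample(coarse_x, coarse_v, fine_x, method)
    errors[method] = {
        "l2": l2_error(upsampled, reference),
        "linf": linf_error(upsampled, reference),
    }

for method, err in errors.items():
    print(f"{method:8s} L2 = {err['l2']:.4e}  Linf = {err['linf']:.4e}")
```

On a smooth profile like this one, linear interpolation gives smaller errors than nearest-neighbour. Sharp tissue boundaries are where such baseline schemes tend to struggle, which is the case the abstract's recommendation of edge-aware methods addresses.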
Xianhui Wang, Chao-Yang Lee, Yu Zhang, Seth Wiener
How the dual function of fundamental frequency (F0), namely talker separation and word distinction, affects Mandarin word recognition in a cocktail party scenario is investigated. A robust benefit of talker F0 separation is observed: Target recognition was more accurate with different-sex talkers (85%) than same-sex talkers (48%). The effect of word-F0 was modulated by lexical status: Real-word tonal minimal pairs lowered accuracy relative to the baseline (average 4% decrease), whereas nonword tonal minimal pairs did not compromise performance. Thus, tone language listeners leverage talker-F0 differences just as non-tone language listeners do, but the advantage is constrained by the lexical role of F0.
{"title":"Processing multi-talker speech in a tone language: Dumplings interfere with sleep at a cocktail party.","authors":"Xianhui Wang, Chao-Yang Lee, Yu Zhang, Seth Wiener","doi":"10.1121/10.0042461","DOIUrl":"https://doi.org/10.1121/10.0042461","url":null,"abstract":"<p><p>How the dual function of fundamental frequency (F0)-talker separation and word distinction-affects Mandarin word recognition in a cocktail party scenario is investigated. A robust benefit of talker F0 separation is observed: Target recognition was more accurate with different-sex talkers (85%) than same-sex talkers (48%). The effect of word-F0 was modulated by lexical status: Real-word tonal minimal pairs lowered accuracy relative to the baseline (average 4% decrease), whereas nonword tonal minimal pairs did not compromise performance. Thus, tone language listeners leverage talker-F0 differences just as non-tone language listeners do, but the advantage is constrained by the lexical role of F0.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"6 2","pages":""},"PeriodicalIF":1.4,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146151327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel Fogerty, Amin Edraki, Wai-Yip Chan, Jesper Jensen
Bone-conducted speech (BCS) signals have reduced intelligibility and quality due to a limited frequency response and speaker-specific distortion introduced by bone conduction pathways. However, BCS is isolated from the environment, which may offer noise-free communication in high-noise environments. Deep neural network speech models were designed to investigate the enhancement of BCS for new, unseen speakers. Personalization to the new speakers was examined through full-model fine-tuning and parameter-efficient adaptation. Listener subjective quality ratings and objective metrics of intelligibility and quality demonstrate significant enhancement of BCS for unseen speakers, with personalization using data-limited parameter-efficient model adaptation.
{"title":"Parameter efficient speaker adaptation for enhancement of bone-conducted speech.","authors":"Daniel Fogerty, Amin Edraki, Wai-Yip Chan, Jesper Jensen","doi":"10.1121/10.0042462","DOIUrl":"https://doi.org/10.1121/10.0042462","url":null,"abstract":"<p><p>Bone conducted speech (BCS) signals have reduced intelligibility and quality due to a limited frequency response and speaker-specific distortion introduced by bone conduction pathways. However, BCS is isolated from the environment, which may offer noise-free communication in high noise environments. Deep neural network speech models were designed to investigate the enhancement of BCS for new unseen speakers. Personalization to the new speakers was examined through full-model fine tuning and parameter-efficient adaptation. Listener subjective quality ratings and objective metrics of intelligibility and quality demonstrate significant enhancement of BCS for unseen speakers, with personalization using data-limited parameter-efficient model adaptation.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"6 2","pages":""},"PeriodicalIF":1.4,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146168223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A device was developed to simultaneously observe two signals when cancellous bone was irradiated with ultrasound: a piezoelectric signal generated in the bone and an ultrasound signal propagated through the bone. This device was based on the previously developed "piezoelectric cell (PE-cell)," which is an ultrasound sensor using a cancellous bone specimen as a piezoelectric element, to which a receiving device for the ultrasound signal was added. Using this device, both the piezoelectric and ultrasound signal waveforms could be simultaneously observed. From the observed waveforms, it was shown that the piezoelectric signal was related to the ultrasound signal.
{"title":"Development of a device for simultaneous observation of piezoelectric and ultrasound signals in cancellous bone.","authors":"Atsushi Hosokawa","doi":"10.1121/10.0042394","DOIUrl":"https://doi.org/10.1121/10.0042394","url":null,"abstract":"<p><p>A device was developed to simultaneously observe two signals when an ultrasound was irradiated to cancellous bone: a piezoelectric signal generated in the bone and an ultrasound signal propagated through the bone. This device was based on the previously developed \"piezoelectric cell (PE-cell),\" which is an ultrasound sensor using a cancellous bone specimen as a piezoelectric element, and a receiving device for the ultrasound signal was added. Using this device, both the piezoelectric and ultrasound signal waveforms could be simultaneously observed. From the observed waveforms, it was shown that the piezoelectric signal was related to the ultrasound signal.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"6 2","pages":""},"PeriodicalIF":1.4,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146127840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study investigated, with 40 young-adult Taiwan Mandarin listeners, the perception of clearly vs conversationally produced Mandarin fricatives in quiet and noisy conditions. Clear speech did not improve identification accuracy but consistently facilitated processing, as shown by shorter reaction times. Correlations with acoustic measures suggest that modifications in spectral variance, skewness, and relative amplitude are associated with this clear speech advantage. These findings underscore processing speed as a dimension of clear speech benefit and extend our understanding of clear speech effects to the full Mandarin fricative inventory.
{"title":"Clear speech effects on Mandarin fricative perception.","authors":"Yung-Hsiang Shawn Chang, Yu-Wen Chen","doi":"10.1121/10.0042407","DOIUrl":"https://doi.org/10.1121/10.0042407","url":null,"abstract":"<p><p>This study investigated, with 40 young-adult Taiwan Mandarin listeners, the perception of clearly vs conversationally produced Mandarin fricatives in quiet and noisy conditions. Clear speech did not improve identification accuracy but consistently facilitated processing, as shown by shorter reaction times. Correlations with acoustic measures suggest that modifications in spectral variance, skewness, and relative amplitude are associated with this clear speech advantage. These findings underscore processing speed as a dimension of clear speech benefit and extend our understanding of clear speech effects to the full Mandarin fricative inventory.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"6 2","pages":""},"PeriodicalIF":1.4,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146151389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Katerina A Tetzloff, Sarah E Yoho, Eric W Healy, Stephanie A Borrie
Contextual cues aid in speech perception, especially when the signal is degraded by speech disorders or background noise. This study examined whether different types of degradation affect how listeners use contextual predictability. Two groups of 50 listeners were tested across three conditions: dysarthric speech, neurotypical speech masked by noise, and dysarthric speech masked by noise. Listeners relied on semantic context similarly for dysarthric speech in quiet and neurotypical speech in noise (single degradations). However, when dysarthric speech was masked by noise (concurrent degradation), contextual benefit was greatly reduced. Findings highlight the communication burden noise adds for understanding dysarthric speech.
{"title":"Background noise inhibits listeners' use of contextual cues for dysarthric speech.","authors":"Katerina A Tetzloff, Sarah E Yoho, Eric W Healy, Stephanie A Borrie","doi":"10.1121/10.0042316","DOIUrl":"10.1121/10.0042316","url":null,"abstract":"<p><p>Contextual clues aid in speech perception, especially when the signal is degraded by speech disorders or background noise. This study examined whether different types of degradation affect how listeners use contextual predictability. Two groups of 50 listeners were tested across three conditions: dysarthric speech, neurotypical speech masked by noise, and dysarthric speech masked by noise. Listeners relied on semantic context similarly for dysarthric speech in quiet and neurotypical speech in noise (single degradations). However, when dysarthric speech was masked by noise (concurrent degradation), contextual benefit was greatly reduced. Findings highlight the communication burden noise adds for understanding dysarthric speech.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"6 2","pages":""},"PeriodicalIF":1.4,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12870364/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146108694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study addresses methodological issues in the modulation-masking paradigm, with a view to future clinical applications in assessing modulation filters. The aim of the paradigm is to determine a target-modulation threshold in a fixed-depth masker. Two experiments with naive normal-hearing listeners evaluated measurement efficiency and parameter settings. The first experiment examined whether providing a cue during practice improves detection task efficiency, finding no significant benefit. The second experiment compared two masker modulation depths (-5 and -10 dB) and found no significant difference in the derived modulation-tuning properties, suggesting that masker depth does not affect the estimated masking threshold pattern within this range.
{"title":"Effects of cues and modulation depth on the measurement of modulation masking pattern.","authors":"Kazuaki Honda, Shigeto Furukawa","doi":"10.1121/10.0042176","DOIUrl":"https://doi.org/10.1121/10.0042176","url":null,"abstract":"<p><p>This study addresses methodological issues in the modulation-masking paradigm for future clinical applications for assessing modulation filters. The aim of the paradigm is to determine a target-modulation threshold in a fixed-depth masker. Two experiments with naive normal-hearing listeners evaluated measurement efficiency and parameter settings. The first experiment examined whether providing a cue during practice improves detection task efficiency, finding no significant benefit. The second experiment compared two masker modulation depths (-5 and -10 dB) and found no significant difference in the derived modulation-tuning properties, suggesting that masker depth does not affect estimated masking threshold pattern within this range.</p>","PeriodicalId":73538,"journal":{"name":"JASA express letters","volume":"6 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145890669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}