JASA Express Letters: Latest Articles

Broadband surface acoustic wave attenuation in metals using chirp compression and dispersive interdigital transducers.
IF 1.4 Q3 ACOUSTICS Pub Date : 2025-09-01 DOI: 10.1121/10.0039237
Dame Fall, Marc Duquennoy, Nikolay Smagin, Zakariae Oumekloul, Mohammadi Ouaftouh

This study presents a non-destructive method for estimating surface acoustic wave attenuation, which is highly sensitive to microstructural features, especially at high frequencies. The method uses a single wideband dispersive interdigital transducer (IDT) that remotely emits acoustic waves at the sample's edge. Chirp compression of the temporal displacement response is achieved by correlating the excitation signal with the spatial configuration of the IDT's electrodes. This technique generates high-amplitude pulses with a sufficient signal-to-noise ratio, critical for enabling accurate attenuation estimation over a frequency range (15-70 MHz). Results from nickel and aluminum demonstrate the method's effectiveness for rapid material characterization.
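
As a rough illustration of the chirp-compression idea mentioned in this abstract, the sketch below compresses a linear chirp by cross-correlating a noisy received trace with the reference chirp (a generic matched filter). It is not the authors' procedure, which correlates the excitation with the IDT electrode layout. Only the 15-70 MHz band comes from the abstract; the sampling rate, chirp duration, delay, amplitude, noise level, and the function names `linear_chirp` and `compress` are assumptions made for this example.

```python
import numpy as np

def linear_chirp(f0, f1, duration, fs):
    """Linear frequency sweep from f0 to f1 (Hz) lasting `duration` seconds."""
    t = np.arange(int(round(duration * fs))) / fs
    k = (f1 - f0) / duration                      # sweep rate in Hz/s
    return np.sin(2 * np.pi * (f0 * t + 0.5 * k * t**2))

def compress(received, reference):
    """Cross-correlate `received` with `reference`; return (lags, correlation)."""
    corr = np.correlate(received, reference, mode="full")
    # In "full" mode the first output sample corresponds to lag -(len(reference) - 1).
    lags = np.arange(-(len(reference) - 1), len(received))
    return lags, corr

fs = 500e6                                        # assumed sampling rate
chirp = linear_chirp(15e6, 70e6, 2e-6, fs)        # band from the abstract, assumed duration

delay = 700                                       # assumed propagation delay in samples
received = np.zeros(4000)
received[delay:delay + len(chirp)] = 0.2 * chirp  # attenuated echo (assumed amplitude)
rng = np.random.default_rng(0)
received += 0.05 * rng.standard_normal(received.size)  # assumed additive noise

lags, corr = compress(received, chirp)
estimated_delay = int(lags[np.argmax(np.abs(corr))])
print(estimated_delay)
```

The correlation concentrates the energy of the long sweep into a short peak, which is why compressed pulses can stand well above the noise floor; in an attenuation measurement one would then compare spectra of such pulses at different propagation distances.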

Source: JASA Express Letters, vol. 5, no. 9. Publication type: Journal Article. Citations: 0
Influence of hearing aid processing on acoustic features associated with emotional speech: Acoustic analyses and perception by listeners with normal hearing.
IF 1.4 Q3 ACOUSTICS Pub Date : 2025-09-01 DOI: 10.1121/10.0039220
Frederic Marmel, Dina Lelic

Hearing aid (HA) processing can affect acoustic features linked with emotions, potentially making them less distinguishable. This study investigated whether HA processing, with both standard and short processing delays, affects emotion prediction from a set of acoustic features associated with speech emotions and how well these predictions align with perceived emotions. The findings indicated that anger and sadness are the easiest emotions to predict from acoustic features, while happiness and fear are the most accurately perceived emotions by listeners with normal hearing. HA processing, regardless of delay, does not seem to impair the predictability of emotions from acoustic features or the perception of these emotions.

Source: JASA Express Letters, vol. 5, no. 9. Publication type: Journal Article. Citations: 0
Psychoacoustic assessment of misophonia.
IF 1.4 Q3 ACOUSTICS Pub Date : 2025-09-01 DOI: 10.1121/10.0039238
Benjamin J Kirby, Alaina Cunningham, Olivia Montou Zant

Misophonia is a condition characterized by intense negative emotional reactions to trigger sounds and related stimuli. In this study, adult listeners (N = 15) with a self-reported history of misophonia symptoms and a control group without misophonia (N = 15) completed listening judgements of recorded misophonia trigger stimuli using a standard scale. Participants also completed an established questionnaire of misophonia symptoms, the Misophonia Questionnaire (MQ). Summed scores of the listening task were significantly correlated with overall MQ score. The misophonia group had significantly higher listening scores and MQ scores compared to controls. These findings indicate applications for psychoacoustic methods in the assessment of misophonia.

Source: JASA Express Letters, vol. 5, no. 9. Publication type: Journal Article. Citations: 0
Spectral and temporal information and presentation mode effects on individual speaker identification and listening effort.
IF 1.4 Q3 ACOUSTICS Pub Date : 2025-09-01 DOI: 10.1121/10.0039369
Jenna L Cramer, Ashley Reynard, Vanessa Torres, Jeremy J Donai

Identifying speakers of interest in an auditory scene is a fundamental task that facilitates effective communication. Little is known about the specific contributions of spectral and temporal detail required for identifying a specific speaker of interest by human listeners. This study investigated the relative contributions of spectral and temporal detail for identifying a speaker of interest and perceived effort in doing so. Results showed significant improvements in speaker identification and decreased effort ratings as spectral channels increased. Improved speaker identification performance with increased temporal filter cutoff from 20 Hz to 800 Hz was observed. These results have implications for speech signal processing by amplification devices and automated speaker recognition systems.

Source: JASA Express Letters, vol. 5, no. 9. Publication type: Journal Article. Citations: 0
Concentric fluid spheres: Scattering and radiation forces and the lowest monopole resonance of bubble shells.
IF 1.4 Q3 ACOUSTICS Pub Date : 2025-09-01 DOI: 10.1121/10.0039423
Philip L Marston

A prior solution for the scattering of traveling wave sound by concentric fluid spheres is recast using complex unimodular s-function notation, which is convenient for expressing partial wave amplitudes and radiation forces on spheres in standing waves. Viscous and thermal energy dissipation are neglected. The fluid core affects the low-frequency dynamics of the fluid shell. The lowest monopole mode of air-filled liquid shells in air is considered. The frequency is approximated by generalizing the analysis of the Minnaert resonance of an air bubble in water. This analysis is relevant to the acoustical scattering by and conditions for trapping of compound drops.
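
For orientation, the classical Minnaert resonance that this abstract takes as its starting point can be written as below. This is the standard textbook result for a gas bubble in a liquid, not the generalized shell formula derived in the paper; the symbols $a$ (bubble radius), $\gamma$ (ratio of specific heats of the gas), $P_0$ (ambient pressure), and $\rho$ (liquid density) are notation chosen for this note, and surface tension and dissipation are neglected.

```latex
f_M = \frac{1}{2\pi a}\sqrt{\frac{3\gamma P_0}{\rho}}
% Example with assumed values for an air bubble in water:
% \gamma = 1.4,\; P_0 = 101325\ \mathrm{Pa},\; \rho = 1000\ \mathrm{kg/m^3}
% gives f_M \, a \approx 3.3\ \mathrm{m/s}, i.e. about 3.3 kHz for a = 1 mm.
```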

Source: JASA Express Letters, vol. 5, no. 9. Publication type: Journal Article. Citations: 0
GEPPETO-OFC: An optimal feedback speech motor control model integrating biomechanical constraints and multisensory goal specification.
IF 1.4 Q3 ACOUSTICS Pub Date : 2025-09-01 DOI: 10.1121/10.0039197
Ny Tsiky Rakotomalala, Pierre Baraduc, Pascal Perrier

We present a speech motor control model that integrates optimal feedback control (OFC) for movement planning and execution with a biomechanical model of the vocal tract. The OFC model was designed to optimize a cost function that combines motor effort and the achievement of multisensory goal zones. We show that the model can account for various aspects of speech production: kinematic properties, coarticulation, and sensorimotor integration. Furthermore, we provide evidence that hearing, proprioception, and tactile feedback may play distinct roles in shaping speech trajectories.

Source: JASA Express Letters, vol. 5, no. 9. Publication type: Journal Article. Citations: 0
The effect of mounting conditions on the vibration and directivity patterns of the glockenspiel.
IF 1.4 Q3 ACOUSTICS Pub Date : 2025-09-01 DOI: 10.1121/10.0039258
Hanna M Pavill, Micah R Shepherd

The glockenspiel is a bright, resonant percussion instrument with a series of simple bars mounted next to each other in a frame. Its acoustic radiation remains underexplored, particularly in its full instrument configuration. This study investigates the acoustic radiation and vibrational behavior of a glockenspiel bar in different mounting conditions. Directivity measurements and the scanning laser Doppler vibrometer were used to compare a single bar in free-free, baffled, and full-instrument configurations. The results show that the mounting significantly alters radiation patterns of the bar, particularly at higher modes. Torsional modes exhibited greater deviation from free-free predictions than bending modes, especially in the full-instrument case. The findings highlight the importance of considering frame and structural interactions in modeling glockenspiel vibration and radiation.

Source: JASA Express Letters, vol. 5, no. 9. Publication type: Journal Article. Citations: 0
Passive localization of dual targets in deep-ocean direct-arrival zone using a horizontal line array.
IF 1.4 Q3 ACOUSTICS Pub Date : 2025-09-01 DOI: 10.1121/10.0039110
Xiongyi Yu, Feilong Zhu, Yonggang Guo, Dai Liu

The passive localization of dual targets composed of a surface ship and a submerged source located nearby beneath the ship is an intriguing problem. This study develops a passive localization method based on multipath arrival angles for dual targets, with similar source levels in the deep-ocean direct arrival zone, using a horizontal line array. Compared to the classical minimum variance distortionless response method, the sparse Bayesian learning method is used to improve resolution for multipath arrival angles under coherent signal conditions, enhancing both the effective range and localization accuracy. The effectiveness of the proposed method has been validated through simulation and experiment.

Source: JASA Express Letters, vol. 5, no. 9. Publication type: Journal Article. Citations: 0
Can speech foundation models effectively identify languages in low-resource multilingual aging populations?
IF 1.4 Q3 ACOUSTICS Pub Date : 2025-09-01 DOI: 10.1121/10.0039265
Aditya Kommineni, Rajat Hebbar, Sarah Petrosyan, Pranali Khobragade, Sudarsana Kadiri, Miguel Arce Rentería, Jinkook Lee, Shrikanth Narayanan

Speech foundation models (SFMs) achieve state-of-the-art results in many tasks, but their performance on elderly, multilingual speech remains underexplored. In this work, we investigate SFMs' ability to analyze multilingual speech from older adults using spoken language identification as a proxy task. We propose three key qualities for foundation models to serve multilingual aging populations: robustness to input duration, invariance to speaker demographics, and few-shot transferability in low-resource settings. Zero-shot evaluation indicates a noticeable performance drop for shorter inputs. We find that native speakers' speech consistently outperforms non-native speech across languages. Few-shot learning indicates better transferability in larger models.

Source: JASA Express Letters, vol. 5, no. 9. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12434620/pdf/ Citations: 0
Vocal economy in contemporary commercial music singers: A pilot study on twang-like voices.
IF 1.4 Q3 ACOUSTICS Pub Date : 2025-09-01 DOI: 10.1121/10.0039036
Marcelo Saldías O'Hrens, Víctor M Espinoza, Valentina Cruz, Melanie Garay, Josefa Reyes, Camilo Quezada, Pedro Cortez, Christian Castro, Jesús Parra, Anne-Maria Laukkanen

Modelling studies suggest that twang-like voice production with supralaryngeal constriction increases vocal economy. This has not been studied in contemporary commercial music (CCM) singers. This study explores the vocal economy of twang-like voices in CCM singers using the "quasi-output-cost ratio" (QOCR). Ten CCM singers sang the syllable [pa:] loudly, using neutral and twang-like voices at low and high pitches. QOCR, electroglottographic contact quotient, sound pressure level, air pressure, and inverse filtering measures were obtained. QOCR showed no significant differences between the voice types. Air pressure measures were significantly higher in twang-like voices, suggesting increased aerodynamic effort to compensate for supralaryngeal constriction. New tools for studying vocal economy in singing are warranted.

Source: JASA Express Letters, vol. 5, no. 9. Publication type: Journal Article. Citations: 0