
Journal of the Acoustical Society of America: Latest Articles

Empirical correction method for spatial averaging effect in ultrasonic device calibration: Enhancing the precision-sensitivity trade-off.
IF 2.3 | Zone 2, Physics and Astrophysics | Q2 ACOUSTICS | Pub Date: 2026-02-01 | DOI: 10.1121/10.0042356
Francisco Alves, Mário Santos, André Alvarenga, Lorena Petrella

The calibration of ultrasound diagnostic equipment is essential to ensure its effectiveness and safety. Calibrating acoustic fields using hydrophones involves measuring the pressure at the point of maximum pressure. Since the signal at the hydrophone output is proportional to the average pressure incident on its surface, when the active area of the hydrophone is larger than the area of the ultrasonic beam at the focus, the lower acoustic pressures surrounding the point of maximum pressure will cause an underestimation of this value. This phenomenon is referred to as the spatial averaging effect. The main limitation in the use of smaller hydrophones is the inherent reduction in sensitivity. The International Electrotechnical Commission standards provide methods to correct for the spatial averaging effect when the ratio of the -6 dB beam width to the hydrophone diameter (Rbh) is higher than 1.5. In this study, a novel method is presented for spatial averaging correction, developed using computational simulation. It consists of an empirical correction factor and extends corrections to Rbh values as low as 0.35, with errors below 3%, addressing the compromise between precision and sensitivity of the hydrophone. This method also generalizes to ultrasonic probes with varying characteristics.
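To make the size of the spatial averaging effect concrete, the following is a minimal sketch under stated assumptions, not the empirical correction factor from the paper. It assumes a Gaussian pressure profile at the focus and an ideal circular hydrophone with uniform sensitivity, centered on the beam axis; the function names are illustrative. Under those assumptions the measured-to-true peak ratio has a closed form in Rbh, which a ring-by-ring numerical integration cross-checks.

```python
import math


def averaging_ratio(rbh: float) -> float:
    """Measured-to-true peak pressure ratio for an idealized case.

    Assumptions (not from the paper): Gaussian pressure profile
    p(r) = exp(-r^2 / (2 sigma^2)), circular hydrophone with uniform
    sensitivity, centered on the beam axis. rbh is the ratio of the
    -6 dB beam width to the hydrophone diameter.
    """
    if rbh <= 0:
        raise ValueError("rbh must be positive")
    x = math.log(2.0) / rbh ** 2      # a^2 / (2 sigma^2) written in terms of Rbh
    return -math.expm1(-x) / x        # (1 - exp(-x)) / x


def averaging_ratio_numeric(rbh: float, rings: int = 2000) -> float:
    """Same quantity by midpoint integration over concentric rings."""
    width = 1.0                                   # -6 dB beam width, arbitrary units
    a = width / (2.0 * rbh)                       # hydrophone radius
    sigma2 = width ** 2 / (8.0 * math.log(2.0))   # from p(width / 2) = 0.5
    dr = a / rings
    total = 0.0
    for i in range(rings):
        r = (i + 0.5) * dr
        total += math.exp(-r * r / (2.0 * sigma2)) * 2.0 * math.pi * r * dr
    return total / (math.pi * a * a)


for rbh in (0.35, 1.0, 1.5, 3.0):
    print(f"Rbh = {rbh:4.2f}  measured/true = {averaging_ratio(rbh):.3f}")
```

In this idealized model the hydrophone reads roughly 86% of the true peak at Rbh = 1.5 and only about 18% at Rbh = 0.35, which illustrates why corrections become large below the range covered by the standards. Real beams and hydrophone responses are not Gaussian or uniform, so these numbers are indicative only.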

Journal of the Acoustical Society of America, vol. 159, no. 2, pp. 1027-1035. Journal Article.
Citations: 0
Validation of Count-the-Dots audiogram approaches to calculating speech intelligibility indices.
IF 2.3 | Zone 2, Physics and Astrophysics | Q2 ACOUSTICS | Pub Date: 2026-02-01 | DOI: 10.1121/10.0042425
Koenraad S Rhebergen, Chaslav V Pavlovic

This study evaluates Count-the-Dots audiogram approaches as a simplified, clinically viable way to closely estimate the American National Standards Institute (ANSI) S3.5-1997 Speech Intelligibility Index (SII) in quiet environments. We compared audibility calculations and predicted intelligibility scores between Count-the-Dots methods and multiple ANSI S3.5-1997 SII variants, using eight frequency band importance functions (BIFs) for 14,776 audiograms from the National Health and Nutrition Examination Survey dataset. Results showed that Count-the-Dots methods closely approximate the ANSI S3.5-1997 SII model as long as the speech levels and the BIF used for calculations were equivalent between the two methods. This held for both audibility calculations and speech intelligibility predictions. However, deviations occurred at higher speech levels [≥65 dB sound pressure level (SPL)] because of differences in how masking is modeled. Count-the-Dots audiogram approaches offer a clinically viable, intuitive alternative for counseling purposes in quiet settings, particularly at natural speech levels (about 55 dB SPL). However, for speech-in-noise conditions, high-level speech, or aided speech inputs, the ANSI S3.5-1997 SII remains the preferred model because of its more detailed acoustic modeling.
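The core idea shared by both approaches, an importance-weighted sum of band audibility, can be sketched as follows. This is a simplified illustration, not the ANSI S3.5-1997 procedure: it assumes a 30 dB speech dynamic range with peaks 15 dB above the band level, ignores spread of masking and the level distortion factor, and uses hypothetical band weights and levels chosen only for the example. Count-the-Dots audiograms are usually described as placing 100 dots on the audiogram form, so counting the audible dots approximates the same weighted sum.

```python
def band_audibility(speech_level_db: float, threshold_db: float) -> float:
    """Fraction of an assumed 30 dB speech range that lies above threshold.

    The range is taken as 15 dB above to 15 dB below the band speech level,
    and the result is clipped to [0, 1]. Both inputs must use the same
    band-level reference.
    """
    return min(1.0, max(0.0, (speech_level_db - threshold_db + 15.0) / 30.0))


def audibility_index(importance, speech_levels_db, thresholds_db) -> float:
    """Importance-weighted sum of band audibilities (quiet, no masking)."""
    if not (len(importance) == len(speech_levels_db) == len(thresholds_db)):
        raise ValueError("all inputs must have one entry per band")
    if abs(sum(importance) - 1.0) > 1e-9:
        raise ValueError("band importance weights must sum to 1")
    return sum(
        w * band_audibility(s, t)
        for w, s, t in zip(importance, speech_levels_db, thresholds_db)
    )


# Hypothetical four-band example (values are illustrative, not from the paper).
importance = [0.2, 0.3, 0.3, 0.2]
speech_levels_db = [50.0, 50.0, 45.0, 40.0]
thresholds_db = [20.0, 35.0, 50.0, 70.0]   # sloping high-frequency loss
index = audibility_index(importance, speech_levels_db, thresholds_db)
print(f"audibility index = {index:.2f}")
```

Because the weights and the speech levels enter the sum directly, two methods can only agree when both use the same BIF and the same speech level, which matches the condition stated in the abstract.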

Journal of the Acoustical Society of America, vol. 159, no. 2, pp. 1337-1347. Journal Article.
Citations: 0
How does a deep neural network look at lexical stress in English words?
IF 2.3 | Zone 2, Physics and Astrophysics | Q2 ACOUSTICS | Pub Date: 2026-02-01 | DOI: 10.1121/10.0042429
Itai Allouche, Itay Asael, Rotem Rousso, Vered Dassa, Ann Bradlow, Seung-Eun Kim, Matthew Goldrick, Joseph Keshet

Despite their success in speech processing, neural networks often operate as black boxes, prompting the following questions: What informs their decisions, and how can we interpret them? This work examines this issue in the context of lexical stress. A dataset of English disyllabic words was automatically constructed from read and spontaneous speech. Several convolutional neural network (CNN) architectures were trained to predict stress position from a spectrographic representation of disyllabic words lacking minimal stress pairs (e.g., initial stress WAllet, final stress exTEND), achieving up to 92% accuracy on held-out test data. Layerwise relevance propagation, a technique for neural network interpretability analysis, revealed that predictions for held-out minimal pairs (PROtest vs proTEST) were most strongly influenced by information in stressed versus unstressed syllables, particularly the spectral properties of stressed vowels. However, the classifiers also attended to information throughout the word. A feature-specific relevance analysis is proposed, and its results suggest that the best-performing classifier is strongly influenced by the stressed vowel's first and second formants, with some evidence that its pitch and third formant also contribute. These results reveal deep learning's ability to acquire distributed cues to stress from naturally occurring data, extending traditional phonetic work based around highly controlled stimuli.
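For readers unfamiliar with layerwise relevance propagation, the sketch below applies the epsilon rule to a toy two-layer dense ReLU network. It is an illustration of the general technique only: the authors' models are CNNs operating on spectrograms, and the weights, input, and function names here are made up for the example. Biases are omitted so that the total relevance at the input equals the output score.

```python
def dense(a, W):
    """Pre-activations z_k = sum_j a_j * W[j][k] (no bias term)."""
    return [sum(a[j] * W[j][k] for j in range(len(a))) for k in range(len(W[0]))]


def relu(z):
    return [max(0.0, v) for v in z]


def lrp_epsilon(a, W, relevance_out, eps=1e-9):
    """Redistribute relevance from a layer's outputs to its inputs.

    Each input j receives a share of R_k proportional to its contribution
    a_j * W[j][k] to the pre-activation z_k (epsilon rule).
    """
    z = dense(a, W)
    relevance_in = [0.0] * len(a)
    for k, zk in enumerate(z):
        denom = zk + (eps if zk >= 0 else -eps)   # stabilizer keeps the sign of z_k
        for j in range(len(a)):
            relevance_in[j] += a[j] * W[j][k] / denom * relevance_out[k]
    return relevance_in


# Toy network: 3 inputs -> 2 hidden ReLU units -> 1 output score.
x = [1.0, 2.0, 0.5]
W1 = [[0.5, -1.0], [0.25, 0.5], [-1.0, 1.0]]
W2 = [[2.0], [1.0]]

hidden = relu(dense(x, W1))
score = dense(hidden, W2)

r_hidden = lrp_epsilon(hidden, W2, score)   # start from the output score
r_input = lrp_epsilon(x, W1, r_hidden)
print("output score:", score[0])
print("input relevances:", [round(r, 6) for r in r_input])
```

The input relevances sum to the output score, so each value can be read as that input's signed share of the decision. In the study, the analogous quantities are assigned to time-frequency regions of the spectrogram.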

Journal of the Acoustical Society of America, vol. 159, no. 2, pp. 1348-1358. Journal Article.
Citations: 0
Iterative Born solver for the acoustic Helmholtz equation with heterogeneous sound speed and density.
IF 2.3 | Zone 2, Physics and Astrophysics | Q2 ACOUSTICS | Pub Date: 2026-02-01 | DOI: 10.1121/10.0042259
Antonio Stanziola, Simon R Arridge, Bradley E Treeby, Benjamin T Cox

Efficient numerical solution of the acoustic Helmholtz equation in heterogeneous media remains challenging, particularly for large-scale problems with spatially varying density, a limitation that restricts applications in biomedical acoustics and seismic imaging. This paper presents a fast iterative solver that extends the convergent Born series method [Osnabrugge, Leedumrongwatthanakun, and Vellekoop, J. Comput. Phys. 322, 113-124 (2016)] to handle arbitrary variations in sound speed, density, and absorption simultaneously. The approach reformulates the Helmholtz equation as a first-order system and applies the universal split-preconditioner from Vettenburg and Vellekoop [arXiv:2207.14222v2 (2022)], yielding a matrix-free algorithm that leverages fast Fourier transforms for computational efficiency. Unlike existing Born series methods, this solver accommodates heterogeneous density without requiring expensive matrix decompositions or preprocessing steps, making it suitable for large-scale three-dimensional problems with minimal memory overhead. The method provides forward and adjoint solutions, enabling its application to inverse problems. Accuracy is validated through comparison against an analytical solution, and the solver's practical utility is demonstrated through transcranial ultrasound simulations. The solver achieves convergence for strong scattering scenarios, offering a computationally efficient alternative to time-domain methods and matrix-based Helmholtz solvers for applications ranging from medical ultrasound treatment planning to seismic exploration.
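Why a plain Born series fails for strong scattering, and why a preconditioned iteration can still converge, can be seen in a scalar toy problem. The sketch below is only an analogy: scalars g, v, and s stand in for the Green's operator, the scattering potential, and the source, and the constant gamma stands in for the preconditioner. The update has the same algebraic form as a preconditioned fixed-point iteration, but none of the values or names come from the paper.

```python
def born_iterate(g, v, s, gamma=1.0, steps=50):
    """Iterate psi <- psi + gamma * (g*s + g*v*psi - psi), starting from 0.

    With gamma = 1 this is the plain Born (Neumann) series for
    psi = g*s + g*v*psi. It converges only if |g*v| < 1.
    With another gamma the error shrinks by |1 - gamma*(1 - g*v)| per step.
    """
    psi = 0.0
    for _ in range(steps):
        psi = psi + gamma * (g * s + g * v * psi - psi)
    return psi


g, v, s = 1.0, -3.0, 1.0          # "strong scattering": |g*v| = 3 > 1
exact = g * s / (1.0 - g * v)     # closed-form solution of the scalar equation

plain = born_iterate(g, v, s, gamma=1.0, steps=20)
damped = born_iterate(g, v, s, gamma=0.2, steps=50)

print("exact solution :", exact)
print("plain series   :", plain)    # grows without bound
print("preconditioned :", damped)   # settles on the exact value
```

In the operator setting the preconditioner is chosen so that the iteration operator is a contraction for any scattering strength, and the Green's operator is applied with fast Fourier transforms instead of being stored as a matrix.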

Journal of the Acoustical Society of America, vol. 159, no. 2, pp. 1457-1470. Journal Article.
Citations: 0
A simplified method of sound speed profiles for precise positioning of underwater dynamic targets.
IF 2.3 | Zone 2, Physics and Astrophysics | Q2 ACOUSTICS | Pub Date: 2026-02-01 | DOI: 10.1121/10.0042427
Baojin Li, Shuang Zhao, Zhenjie Wang, Shuqiang Xue, Yixu Liu

The accuracy of underwater acoustic positioning is seriously affected by the spatiotemporal variations of sound speed. For precise positioning of fixed points on the seafloor, simplified reference sound speed profiles (SSPs) are commonly adopted to correct for the influence of these variations. However, in underwater dynamic target positioning, a simplification that only considers acoustic ray tracing equivalence at fixed depths can lead to significant representativeness errors at other depths. To address this issue, we propose a criterion that minimizes acoustic ray tracing errors over the entire depth range and then employ a metaheuristic algorithm to solve the combinatorial optimization problem that this criterion involves. The results show that, compared with the maximum sound speed deviation method, the area difference method, and a genetic algorithm based on the minimum acoustic ray tracing error criterion at fixed depths, the SSP simplified by the proposed method exhibits higher geometric accuracy, acoustic ray tracing accuracy, and positioning accuracy over the entire depth range. The proposed method is suitable for underwater dynamic target positioning, especially in scenarios with significant depth variations.
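The representativeness error described above can be illustrated with a small ray tracing sketch. It assumes constant-speed layers and Snell's law, and it uses a hypothetical three-layer profile and a one-layer replacement matched only at the full depth; neither the profile nor the matching rule is taken from the paper.

```python
import math


def trace_ray(layers, launch_angle_deg):
    """Trace a ray through constant-speed layers using Snell's law.

    layers: list of (thickness_m, sound_speed_m_s), ordered top to bottom.
    launch_angle_deg: angle from the vertical in the first layer.
    Returns (travel_time_s, horizontal_range_m).
    """
    p = math.sin(math.radians(launch_angle_deg)) / layers[0][1]  # Snell invariant
    travel_time = 0.0
    horizontal = 0.0
    for thickness, speed in layers:
        sin_theta = p * speed
        if sin_theta >= 1.0:
            raise ValueError("ray turns before crossing this layer")
        cos_theta = math.sqrt(1.0 - sin_theta ** 2)
        travel_time += thickness / (speed * cos_theta)
        horizontal += thickness * sin_theta / cos_theta
    return travel_time, horizontal


# Hypothetical detailed profile and a one-layer simplification that matches
# the vertical travel time at the full depth of 300 m only.
detailed = [(100.0, 1520.0), (100.0, 1500.0), (100.0, 1490.0)]
t_bottom, _ = trace_ray(detailed, 0.0)
c_equivalent = 300.0 / t_bottom
simplified = [(300.0, c_equivalent)]

# Equivalent at the fixed depth (300 m) ...
t_simple_bottom, _ = trace_ray(simplified, 0.0)
# ... but not at a shallower target depth (100 m).
t_detailed_100, _ = trace_ray(detailed[:1], 0.0)
t_simple_100, _ = trace_ray([(100.0, c_equivalent)], 0.0)

print(f"time error at 300 m: {abs(t_bottom - t_simple_bottom) * 1e3:.4f} ms")
print(f"time error at 100 m: {abs(t_detailed_100 - t_simple_100) * 1e3:.4f} ms")
```

A target that moves between depths therefore sees an error that a fixed-depth criterion never penalizes, which is the motivation for minimizing the ray tracing error over the whole depth range.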

Journal of the Acoustical Society of America, vol. 159, no. 2, pp. 1446-1456. Journal Article.
Citations: 0
Time-varying partial loudness of noise burst sequences in stationary noise with a similar level.
IF 2.3 | Zone 2, Physics and Astrophysics | Q2 ACOUSTICS | Pub Date: 2026-02-01 | DOI: 10.1121/10.0042387
Josef Schlittenlacher, Agatha R Cox, Brian C J Moore

Loudness increases with increasing duration up to 200 ms after sound onset. This temporal integration is well documented in quiet but less understood in the presence of other sounds and for very short durations. The present study investigates the temporal integration of partial loudness for bursts of noise in the presence of equally intense background noise. Level differences required for equal loudness between a reference burst duration of 20 ms and target burst durations of 1, 2, 5, and 10 ms were obtained using a 1-up/1-down staircase procedure in the laboratory and online for burst repetition rates of 5, 10, and 20 Hz and for rectangular and Hann-shaped bursts. All results showed that the short-duration bursts were perceived as louder than expected from the temporal integration of energy. The difference was equivalent to a change in level of up to 6.7 dB and was larger for higher burst repetition rates. The difference was larger when using abrupt onsets and offsets for both target and reference than for bursts with a Hann window shape. Differences between experiments conducted in the laboratory and online were small (up to 1.2 dB) but were statistically significant.
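The baseline that the results are compared against, temporal integration of energy, can be written down directly. The sketch below is a worked calculation rather than a model from the paper: it assumes rectangular bursts, so that energy is proportional to intensity times duration, and the function name is illustrative.

```python
import math


def equal_energy_level_difference_db(target_ms: float, reference_ms: float = 20.0) -> float:
    """Level increase a shorter burst needs to carry the same energy.

    Assumes rectangular bursts, so energy is proportional to
    intensity * duration.
    """
    if target_ms <= 0 or reference_ms <= 0:
        raise ValueError("durations must be positive")
    return 10.0 * math.log10(reference_ms / target_ms)


for duration_ms in (1.0, 2.0, 5.0, 10.0):
    diff = equal_energy_level_difference_db(duration_ms)
    print(f"{duration_ms:4.0f} ms burst: +{diff:.1f} dB for equal energy")
```

Under pure energy integration a 1 ms burst would need about 13 dB more level than the 20 ms reference to sound equally loud. The abstract reports that listeners needed less than this prediction, by an amount equivalent to up to 6.7 dB.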

Journal of the Acoustical Society of America, vol. 159, no. 2, pp. 1048-1056. Journal Article.
Citations: 0
A modified robust multi-target tracking method with high clutter and underwater nonstationary measurement noise.
IF 2.3 | Zone 2, Physics and Astrophysics | Q2 ACOUSTICS | Pub Date: 2026-02-01 | DOI: 10.1121/10.0042218
Xianghao Hou, Yuxuan Chen, Weisi Hua, Xinyu Gu, Yixin Yang

Robust active tracking of multiple underwater targets in environments with strong clutter and nonstationary measurement noise is a key research topic in underwater acoustic signal and information processing. Under these conditions, existing random finite set (RFS) multi-target tracking algorithms suffer from serious contamination of the observation likelihood by clutter, low discrimination between targets and clutter, and poor tracking accuracy. To address these issues, this paper proposes a two-stage modified variational Bayesian delta-generalized labeled multi-Bernoulli multi-target tracking algorithm. First, in the delta-generalized labeled multi-Bernoulli filtering update stage, the method introduces the Sage-Husa (SH) estimation technique, based on the minimum residual criterion, to roughly correct the measurement noise covariance matrix. This effectively alleviates the contamination of the likelihood function by clutter in adaptive RFS and improves the discrimination between targets and clutter under complex noise conditions. Second, in the multi-target state estimation stage, the measurement noise covariance estimate is further optimized through a variational Bayesian framework, thereby achieving real-time correction of measurement noise caused by unknown underwater environments and significantly enhancing the robustness of underwater multi-target active tracking. Both simulation and experimental results show that the proposed algorithm significantly outperforms traditional and existing adaptive generalized labeled multi-Bernoulli methods in scenarios with strong clutter and nonstationary measurement noise.
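The first-stage idea, recursively re-estimating the measurement noise covariance from the filter's innovations, can be sketched in scalar form. This is the textbook Sage-Husa style update, not the modified minimum-residual variant proposed in the paper; the forgetting factor value, the function names, and the positivity floor are assumptions added for the example.

```python
def sage_husa_weight(k: int, b: float = 0.98) -> float:
    """Fading weight d_k = (1 - b) / (1 - b**(k + 1)) for step k = 0, 1, ...

    b is a forgetting factor in (0, 1). Values near 1 give long memory.
    """
    if not 0.0 < b < 1.0:
        raise ValueError("b must lie strictly between 0 and 1")
    return (1.0 - b) / (1.0 - b ** (k + 1))


def update_noise_variance(r_prev, innovation, predicted_var, d, floor=1e-12):
    """One scalar Sage-Husa style update of the measurement noise variance.

    innovation:    measurement minus predicted measurement
    predicted_var: H * P_pred * H^T, the predicted-state share of the
                   innovation variance
    floor:         assumed safeguard, since the raw update can go negative
    """
    r_new = (1.0 - d) * r_prev + d * (innovation ** 2 - predicted_var)
    return max(r_new, floor)


# Worked step with illustrative numbers.
r_estimate = update_noise_variance(r_prev=1.0, innovation=2.0, predicted_var=1.0, d=0.5)
print("updated noise variance:", r_estimate)
```

Large innovations inflate the noise estimate, which lowers the weight given to a measurement in the likelihood. That is the mechanism by which an adaptive estimate can reduce the influence of clutter and of bursts of nonstationary noise.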

Journal of the Acoustical Society of America, vol. 159, no. 2, pp. 1086-1104. Journal Article.
Citations: 0
Emotional and autonomic responses to natural sounds in listeners with cochlear implants.
IF 2.3 | Zone 2, Physics and Astrophysics | Q2 ACOUSTICS | Pub Date: 2026-02-01 | DOI: 10.1121/10.0042405
Prabuddha Bhatarai, Kelly N Jahn

This study characterized emotional responses to environmental sounds in 35 adults, including 18 cochlear implant (CI) users and 17 listeners with normal hearing (NH), using a comprehensive battery of self-report, behavioral, and autonomic measures. Changes in emotional reactions, pupil dilation, and skin conductance were assessed while participants listened to a series of emotionally evocative, naturally occurring sounds. The CI listeners exhibited a constricted range of emotional responses to the sounds, wherein they perceived pleasant and unpleasant sounds to be significantly less pleasant and less unpleasant, respectively, than the NH listeners. This reduced valence range was statistically associated with self-reported emotional deficits in daily life. Furthermore, the CI listeners exhibited significantly slower sound-evoked pupil dilations than the NH listeners, suggesting that they were slower to process the emotionally evocative sounds. These findings can support clinicians in identifying targets for counseling and rehabilitation to improve quality of life for adult CI listeners. The differences in emotional responses to naturalistic stimuli in CI listeners also highlight the need for future research to explore ecologically valid measures of assessment and rehabilitation.

Journal of the Acoustical Society of America, vol. 159, no. 2, pp. 1235-1246 (2026-02-01). https://doi.org/10.1121/10.0042405
Citations: 0
The temporal effects of auditory and visual immersion on speech level in virtual environments.
IF 2.3 Zone 2 (Physics and Astrophysics) Q2 ACOUSTICS Pub Date: 2026-01-01 DOI: 10.1121/10.0042240
Xinyi N Zhang, Arian Shamei, Florian Grond, Ingrid Verduyckt, Rachel E Bouserhal

Speech takes place in physical environments with visual and acoustic properties, yet how these elements and their interaction influence speech production is not fully understood. While a room's appearance can suggest its acoustics, it is unclear whether people adjust their speech based on this visual information. Previous research shows that higher reverberation leads to reduced speech level, but understanding of how auditory and visual information interact in this process remains limited. This study examined how audiovisual information affects speech level by immersing participants in virtual environments with varying reverberation and room visuals (hemi-anechoic room, classroom, and gymnasium) while completing speech tasks. Speech level was analyzed using generalized additive mixed-effects modeling to assess temporal changes during utterances across conditions. Results showed that visual information significantly influenced speech level, though not strictly in line with expected acoustics or perceived room size; auditory information had a stronger overall effect than visual information. Visual information had an earlier influence that diminished over time, whereas the auditory effect increased and plateaued. These findings contribute to the understanding of multisensory integration in speech control and have implications for enhancing vocal performance and supporting more naturalistic communication in virtual environments.

Journal of the Acoustical Society of America, vol. 159, no. 1, pp. 384-397 (2026-01-01). https://doi.org/10.1121/10.0042240
Citations: 0
PAMGuard: Application software for passive acoustic detection, classification, and localisation of animal sounds.
IF 2.3 Zone 2 (Physics and Astrophysics) Q2 ACOUSTICS Pub Date: 2026-01-01 DOI: 10.1121/10.0042245
Douglas Gillespie, Jamie Macaulay, Michael Oswald, Marie Roch

Detection, classification, and localisation of animal sounds are essential in many ecological studies, including density estimation and behavioural studies. Real-time acoustic processing can also be used in mitigation exercises, with the possibility of curtailing harmful human activities when animals are present. Animal vocalisations vary widely, and there is no single detection algorithm that can robustly detect all sound types. Human-in-the-loop analysis is often required to validate algorithm performance and deal with unexpected noise sources such as those often encountered in real-world situations. The PAMGuard software combines advanced automatic analysis algorithms, including AI methods, with interactive visual tools allowing users to develop efficient workflows for both real-time use and for processing archived datasets. A modular framework enables users to configure multiple detectors, classifiers, and localisers suitable for the equipment and species of interest in a particular application. Multiple detectors for different sound types can be run concurrently on the same data. An extensible "plug-in" interface also makes it possible for third parties to independently develop new modules to run within the software framework. Here, we describe the software's core functionality, illustrated using workflows for both real-time and offline use, and present an update on the latest features.

Journal of the Acoustical Society of America, vol. 159, no. 1, pp. 437-443 (2026-01-01). https://doi.org/10.1121/10.0042245
Citations: 0