Pub Date: 2026-01-02 DOI: 10.1016/j.apacoust.2025.111212
Tingting Wang, Marvin Borsdorf, Qiquan Zhang, Longting Xu, Xi Shao
Replayed speech poses a serious threat to automatic speaker verification (ASV) systems, particularly with the advent of high-quality recording and playback devices. Recent studies have explored graph Fourier transform (GFT)-based features and graph frequency cepstral coefficients for replay speech detection. However, these methods often suffer from unstable graph spectral differences between genuine and replayed speech, caused by the sensitivity of the graph Fourier basis (i.e., a GFT obtained by eigenvalue decomposition). This instability limits the effectiveness of graph frequency cepstral coefficient features for reliable replay attack detection. To address this, we propose constructing robust undirected graph topologies using an exponential function and a graph shift operator. Based on these graph topologies, we derive stable graph Laplacian matrices with singular value decomposition to define the logarithmic graph Fourier transform and the logarithmic joint graph Fourier transform. These enable extraction of enhanced (joint) graph frequency cepstral coefficient features for replay attack detection. Experimental results on the ASVspoof datasets show that ASV systems using our (joint) graph frequency cepstral coefficient features significantly outperform current state-of-the-art methods. We release our source code at: https://github.com/Wangfighting0015/GFT_project.
Going deeper with log-graph Fourier transform-based feature extraction for playback speech detection. Tingting Wang, Marvin Borsdorf, Qiquan Zhang, Longting Xu, Xi Shao. Applied Acoustics, vol. 245, Article 111212, published 2026-01-02. DOI: 10.1016/j.apacoust.2025.111212
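As a rough illustration of the graph Fourier transform that this abstract builds on, the following Python sketch constructs an undirected graph over the samples of one frame with an exponential edge weight, forms the combinatorial Laplacian, and projects the frame onto the Laplacian eigenvectors. The Gaussian-style kernel, the `sigma` parameter, the toy frame, and all function names are assumptions made for illustration. This is the textbook eigendecomposition-based GFT, not the authors' SVD-based logarithmic GFT, for which their repository is the reference.

```python
import numpy as np

def exponential_adjacency(frame, sigma=1.0):
    """Symmetric weights w_ij = exp(-(x_i - x_j)^2 / (2 sigma^2)) with a zero diagonal.

    The kernel is an assumed example of an exponential weighting function.
    """
    diff = frame[:, None] - frame[None, :]
    weights = np.exp(-(diff ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)
    return weights

def graph_fourier_transform(frame, sigma=1.0):
    """Return Laplacian eigenvalues, eigenvectors, and the GFT coefficients of the frame."""
    weights = exponential_adjacency(frame, sigma)
    laplacian = np.diag(weights.sum(axis=1)) - weights  # L = D - W
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)  # ascending graph frequencies
    spectrum = eigenvectors.T @ frame  # forward GFT
    return eigenvalues, eigenvectors, spectrum

# Toy frame (made-up values, not speech data).
frame = np.array([0.1, 0.4, -0.2, 0.3, 0.0, -0.5])
eigenvalues, eigenvectors, spectrum = graph_fourier_transform(frame)
reconstructed = eigenvectors @ spectrum  # inverse GFT
```

Because the eigenvector matrix of a symmetric Laplacian is orthogonal, the inverse transform recovers the frame, and the smallest graph frequency is zero up to rounding.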
This study evaluates multiple machine learning and deep learning approaches for forecasting A-weighted equivalent continuous sound levels (LAeq) and investigates spatiotemporal fluctuations in road traffic noise on Sahloul Road in Sousse City, Tunisia. Road traffic noise levels (dB) were measured at five urban sites with differing traffic intensities using a TES 1352A sound level meter, with data collected at both hourly and 15-minute intervals. Traffic composition was classified into motorcycles, light vehicles, and heavy vehicles using video-monitored traffic counts. Results show that heavy vehicles and total traffic volume have the strongest correlation with noise levels. Maximum noise occurred during the morning (07:00–09:00) and evening (16:00–18:00) hours and exceeded the World Health Organization (WHO) guideline by more than 50 % throughout the measurement campaign. Four algorithms (XGBoost, Random Forest, LightGBM, and LSTM) were applied to predict LAeq, using vehicle counts as indicators of traffic volume. On the hourly timescale, XGBoost performed best (R² = 0.952, MAE = 0.18 dB). However, its performance decreased at the finer temporal resolution, and the marked difference between the two timescales indicates higher noise variability at the shorter timescale. The LSTM algorithm performed poorly, particularly at the 15-minute resolution. The findings highlight the effectiveness of ensemble tree-based methods for predicting urban noise levels, as well as the importance of considering time resolution when designing measures to reduce urban noise pollution.
Urban noise pollution prediction using traffic patterns and AI models in Sahloul Road, Sousse City, Tunisia. Najah Kechiche, Jawher Bouaziz, Nader Boumrifeg, Walid Hassen, Tarek Salem Abdennaji, Chemseddine Maatki, Badr M. Alshammari, Lioua Kolsi. Applied Acoustics, vol. 245, Article 111218, published 2025-12-29. DOI: 10.1016/j.apacoust.2025.111218
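Since this study reports LAeq at both 15-minute and hourly resolution and scores its models with MAE and R², a small Python sketch may help relate the quantities. It shows the standard energy average that combines equal-duration LAeq values into a longer-period LAeq, plus the usual definitions of the two error metrics. The function names and the sample levels are hypothetical and are not taken from the study's data.

```python
import numpy as np

def combine_laeq(levels_db):
    """Energy-average equal-duration LAeq values (dB) into one LAeq for the whole period."""
    levels = np.asarray(levels_db, dtype=float)
    return 10.0 * np.log10(np.mean(10.0 ** (levels / 10.0)))

def mean_absolute_error(observed, predicted):
    """Mean absolute difference between observed and predicted levels (dB)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(observed - predicted)))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Four made-up 15-minute LAeq values combined into one hourly LAeq.
quarter_hour_levels = [68.0, 70.0, 72.0, 70.0]
hourly_laeq = combine_laeq(quarter_hour_levels)
```

The energy average sits above the arithmetic mean of the decibel values whenever the levels differ, because louder intervals dominate the sum.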
Pub Date: 2025-12-29 DOI: 10.1016/j.apacoust.2025.111210
Yoshinari Yamada
The spectral structures of sound fields in rooms differ between steady-state conditions and transient processes. This study presents a methodology to enable analysis of the spectral characteristics of the transient processes of room sound fields. The energy spectra appearing during the decay and growth processes for sinusoidal and white noise excitations are formulated by redefining energy-time functions in reference to previous findings. The derived formulae present a unified approach for the interrupted sine and noise methods. The energy spectrum is used to examine the spectral shift and broadening induced by out-of-band modes during transient processes. Frequency statistics, spectral bandwidth, and band energy ratios are introduced as physical indices. The proposed method was applied to analyze the decay process of actual room sound fields following preliminary examinations using room impulse responses generated by numerical simulations. The results indicate that the effect of out-of-band modes was insignificant when the product of the excitation bandwidth and reverberation time exceeded 18 and 36, respectively, in cases with and without acoustic modal overlap. Band limiting of the received signal may not always be necessary under these conditions. In concert halls, this condition was satisfied even when the excitation bandwidth was a 1/12 octave. Thus, the physical indices introduced in this study can characterize sound fields in different rooms from an unconventional perspective.
Quantification of spectral shift and broadening induced by out-of-band modes in the transient process of room sound field. Yoshinari Yamada. Applied Acoustics, vol. 245, Article 111210, published 2025-12-29. DOI: 10.1016/j.apacoust.2025.111210
The difference in fundamental frequency (F0) between vowels is a crucial cue for identifying concurrent vowels. For normal hearing (NH) listeners, percent identification scores for both vowels increase with the F0 difference, reaching an asymptote at ∼3 Hz. In complex listening environments, aged hearing (AH) listeners exhibit a reduction in overall concurrent vowel scores across F0 differences, which may be related to age-related loss of neural synchrony in the auditory system. To understand these age effects, the current modeling study predicts concurrent vowel scores across F0 differences for both NH and AH subjects. The NH model used the neural responses of an auditory-nerve (AN) model with the neurogram similarity index measure (NSIM) metric, instead of the previous F0-guided segregation, to predict concurrent vowel scores. Previous behavioral studies have shown that temporal jitter in the acoustic domain can mimic an age-related decrease in neural synchrony, resulting in reduced identification scores. Thus, a temporally jittered concurrent vowel was used in the AH model to obtain neurograms from the AN model. The NSIM metric was applied to these neurograms to predict the concurrent vowel scores of AH subjects. Both models qualitatively predicted the pattern of concurrent vowel scores across F0 differences observed in previous behavioral data. A chi-square analysis showed that the predicted scores agreed well with the concurrent vowel data. The temporal-jitter AH model predictions showed neural asynchrony, which reduced the concurrent vowel scores for AH listeners. These model predictions suggest that the neural asynchrony in the temporal-jitter AH model might contribute to the reduced concurrent vowel scores across F0 differences.
Predicting the age effects on concurrent vowel scores using a temporal jitter computational model. Harshavardhan Settibhaktini, Rithik Rathi, Ananthakrishna Chintanpalli. Applied Acoustics, vol. 245, Article 111215, published 2025-12-27. DOI: 10.1016/j.apacoust.2025.111215
Pub Date: 2025-12-27 DOI: 10.1016/j.apacoust.2025.111214
Camila Sanches Schimidt, Leopoldo Pisanelli Rodrigues de Oliveira, Virgilio Junior Caetano, Carlos De Marqui Junior
This work presents a reconfigurable piezoelectric metamaterial beam designed to manipulate acoustic radiation and control sound directivity through tailored vibrational modes. The proposed architecture enables the induction of specific modes at arbitrary frequencies via an electromechanical coupling mechanism. The beam incorporates periodically distributed piezoelectric elements, each connected to digitally programmable synthetic impedance shunts that allow adaptive tuning. The metamaterial is divided into three regions: a central section that amplifies a selected target mode shape and two side sections that operate within a programmed band gap to suppress bending waves. A modal-analysis-based modeling framework captures the combined effects of band gap tuning and mode induction, guiding the selection of shunt parameters to achieve the desired vibration patterns. Experimental measurements show excellent agreement with numerical predictions, confirming the model’s accuracy and the effectiveness of the reconfiguration strategy in reproducing similar vibration shapes at different frequencies. Coupled structural–acoustic simulations further demonstrate the steering of radiated sound energy by modulating the beam’s vibration profile. The results highlight the metamaterial’s ability for on-demand wave manipulation and adaptive sound field shaping, exhibiting similar directivity patterns at distinct frequencies. Overall, the findings establish a versatile platform for adaptive wave manipulation and sound directivity control.
Tailoring vibration modes in piezoelectric metamaterial beams for sound directivity control. Camila Sanches Schimidt, Leopoldo Pisanelli Rodrigues de Oliveira, Virgilio Junior Caetano, Carlos De Marqui Junior. Applied Acoustics, vol. 245, Article 111214, published 2025-12-27. DOI: 10.1016/j.apacoust.2025.111214
Pub Date: 2025-12-24 DOI: 10.1016/j.apacoust.2025.111216
Ziheng Xia, Yansong He, Zhifei Zhang, Hao Chen, Xiaoyu Fu
Active road noise control (ARNC) systems use multiple acceleration signals from the chassis as references to attenuate vehicle interior road noise. However, because of coupling among the chassis structures, predetermined reference signals cannot always be mutually independent and coherent with the interior noise. In addition, the computational complexity of conventional centralized ARNC systems increases rapidly as the number of channels grows. To address these issues, this study proposes a multi-channel ARNC system using virtual references and a local-clustered control strategy. First, a virtual reference method based on incremental singular value decomposition is developed to generate an orthogonal virtual reference set, which improves the system's noise reduction performance. The conventional centralized system is then decomposed into two independent active headrest subsystems based on the local-clustered control strategy to reduce the computational complexity. Numerical simulations and on-board experiments are conducted to validate the effectiveness of the proposed ARNC system. The results demonstrate that the proposed system attenuates interior road noise effectively with lower computational effort than the conventional centralized ARNC system: the overall noise reduction improved by 1.5 dB(A) at a driving speed of 60 km/h and by 1.0 dB(A) at 80 km/h.
A multi-channel active road noise control system using incremental SVD based virtual references and local-clustered control strategy. Ziheng Xia, Yansong He, Zhifei Zhang, Hao Chen, Xiaoyu Fu. Applied Acoustics, vol. 245, Article 111216, published 2025-12-24. DOI: 10.1016/j.apacoust.2025.111216
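To make the idea of an orthogonal virtual reference set concrete, the Python sketch below decorrelates a block of coupled reference signals with a plain batch SVD. It is only a sketch under stated assumptions: the abstract describes an incremental SVD, which is not reproduced here, and the synthetic signals, the mixing matrix, and the function name `virtual_references` are hypothetical.

```python
import numpy as np

def virtual_references(references):
    """Project reference signals onto their right singular vectors.

    references: array of shape (n_samples, n_channels).
    Returns the mutually orthogonal virtual references and the singular values.
    Batch SVD is used here as a stand-in for an incremental update.
    """
    _, singular_values, vt = np.linalg.svd(references, full_matrices=False)
    return references @ vt.T, singular_values

# Synthetic example: 4 coupled sensor channels driven by 3 independent sources.
rng = np.random.default_rng(0)
sources = rng.standard_normal((2000, 3))
mixing = np.array([
    [1.0, 0.6, 0.2],
    [0.5, 1.0, 0.4],
    [0.3, 0.7, 1.0],
    [0.8, 0.2, 0.5],
])
references = sources @ mixing.T  # shape (2000, 4), rank 3 by construction

virtual, singular_values = virtual_references(references)
gram = virtual.T @ virtual  # diagonal when the virtual references are orthogonal
```

In this toy case the last singular value is numerically zero, which reflects that four coupled channels carry only three independent components. A practical system might drop such directions before adaptive filtering, although the abstract does not say how the authors handle this.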
Pub Date: 2025-12-22 DOI: 10.1016/j.apacoust.2025.111170
Jonathan Terroir
Many occupational sectors are exposed to noise from very high audible frequencies (VHF) or low-frequency ultrasound (LFUS). Despite numerous studies on VHF/LFUS exposure, the question of the impact of several parameters in the on-site measurement protocol on the representativeness of the measurement remains open. Laboratory measurements were carried out to assess the variability of exposure measurements under controlled conditions as a function of the presence/absence of the microphone’s protection grid, the location of the microphone (shoulder, ear entrance, etc.), the side of the dummy where the microphone is located, and the frequency and angle of the source. Particular attention has been paid to the possibility of positioning the microphone at temple level, thanks to a specific mounting system adapted to on-site use.
Measuring occupational exposure to very high frequencies noise: Assessing measurement variability under controlled conditions. Jonathan Terroir. Applied Acoustics, vol. 245, Article 111170, published 2025-12-22. DOI: 10.1016/j.apacoust.2025.111170
Pub Date: 2025-12-22 DOI: 10.1016/j.apacoust.2025.111211
Qian Ma, Mingyang Li, Chao Sun
Most prevalent source bearing estimation methods are based on the plane-wave assumption, which can bias the estimates because of multimode propagation in waveguide channels, especially when the source is near the endfire direction of the sensor array. While environment-dependent methods, such as matched field processing (MFP), can provide unbiased estimates, they require either a computationally expensive 3-D search over range, depth, and bearing or prior knowledge of the source range and depth. The recently developed subspace intersection method (SIM) circumvents these limitations by exploiting the alignment between the signal vector and the modal subspace, enabling accurate bearing estimation via a 1-D search. However, it requires prior knowledge of the number of sources. This work reformulates source bearing estimation as a matrix similarity measurement problem. It is demonstrated that the maximum likelihood estimate (MLE) of the source bearing can be derived by minimizing a tailored Euclidean distance between the sample covariance matrix and the modal subspace projection matrix. Furthermore, a novel minimum determinant estimate (MDE) is proposed based on the Jensen-Bregman LogDet divergence, which minimizes the determinant of the sum of the sample covariance matrix and the modal subspace projection matrix. Numerical simulations in a shallow-water waveguide demonstrate that the MDE achieves accurate bearing estimation of multiple sources without requiring information on the number of sources, and produces ambiguity surfaces with a clean background. The proposed method is also validated using experimental data from the SWellEx-96 trial.
Minimum determinant estimate for source bearing estimation in shallow-water waveguides. Qian Ma, Mingyang Li, Chao Sun. Applied Acoustics, vol. 245, Article 111211, published 2025-12-22. DOI: 10.1016/j.apacoust.2025.111211
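For readers unfamiliar with the Jensen-Bregman LogDet divergence named in this abstract, the Python sketch below evaluates its standard definition for two symmetric positive-definite matrices, J(X, Y) = log det((X + Y)/2) - (1/2) log det(XY). The function name `jbld` and the example matrices are hypothetical. How the authors apply the divergence to a rank-deficient modal subspace projection matrix is not reproduced here.

```python
import numpy as np

def jbld(x, y):
    """Jensen-Bregman LogDet divergence between two symmetric positive-definite matrices."""
    # slogdet returns (sign, log|det|); the sign is +1 for positive-definite inputs.
    _, logdet_mean = np.linalg.slogdet((x + y) / 2.0)
    _, logdet_x = np.linalg.slogdet(x)
    _, logdet_y = np.linalg.slogdet(y)
    return logdet_mean - 0.5 * (logdet_x + logdet_y)

# Two made-up positive-definite matrices standing in for covariance-like quantities.
a = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([[1.0, 0.2], [0.2, 3.0]])
divergence = jbld(a, b)
```

The divergence is symmetric in its arguments, vanishes when the two matrices coincide, and is otherwise positive. With one matrix held fixed, only the log-determinant of the sum and the log-determinant of the other matrix vary, which gives some intuition for why a determinant of a sum of matrices appears in the estimator.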
Pub Date : 2025-12-21 DOI: 10.1016/j.apacoust.2025.111205
Shaoji Zhang , Bo Song , Lei Zhang , Aiguo Zhao , Cheng Shen , Xiangyan Meng , Jiajie Luo , Yucheng Yuan , Hao Li , Liang Gao , Yusheng Shi
Ventilated acoustic metamaterials combine ventilation with noise suppression, which meets the needs of many scenarios. This work proposes a unit cell with high parametric tunability that allows flexible control of the number of transmission loss peaks, facilitating customized ventilated metamaterials through cascaded unit cell configurations. To achieve the desired acoustic performance within specific frequency ranges, a partitioned optimization strategy was employed, targeting individual unit cells for distinct frequency sub-bands. By integrating machine learning and genetic algorithms, the geometrical parameters of the unit cell can be rapidly optimized to meet the acoustic performance target. In this work, we designed dual-band and broadband ventilated metamaterials to demonstrate the framework’s effectiveness. Both ventilated metamaterials were investigated via the finite element method and experiments. The experimental transmission loss results are in perfect agreement with the simulations. The machine learning surrogate model approach bypasses the repetitive process of modeling and finite element analysis, addressing the time-consuming and labor-intensive limitations of traditional trial-and-error and exhaustive methods, thereby establishing an accelerated pathway for ventilated metamaterial design.
{"title":"Highly efficient design of sound attenuation and ventilated metamaterial via machine learning and genetic algorithm","authors":"Shaoji Zhang , Bo Song , Lei Zhang , Aiguo Zhao , Cheng Shen , Xiangyan Meng , Jiajie Luo , Yucheng Yuan , Hao Li , Liang Gao , Yusheng Shi","doi":"10.1016/j.apacoust.2025.111205","DOIUrl":"10.1016/j.apacoust.2025.111205","url":null,"abstract":"<div><div>Ventilated acoustic metamaterials combine ventilation with noise suppression, which meets the needs of many scenarios. This work proposes a unit cell with high parametric tunability that allows flexible control of the number of transmission loss peaks, facilitating customized ventilated metamaterials through cascaded unit cell configurations. To achieve the desired acoustic performance within specific frequency ranges, a partitioned optimization strategy was employed, targeting individual unit cells for distinct frequency sub-bands. By integrating machine learning and genetic algorithms, the geometrical parameters of the unit cell can be rapidly optimized to meet the acoustic performance target. In this work, we designed dual-band and broadband ventilated metamaterials to demonstrate the framework’s effectiveness. Both ventilated metamaterials were investigated via the finite element method and experiments. The experimental transmission loss results are in perfect agreement with the simulations. The machine learning surrogate model approach bypasses the repetitive process of modeling and finite element analysis, addressing the time-consuming and labor-intensive limitations of traditional trial-and-error and exhaustive methods, thereby establishing an accelerated pathway for ventilated metamaterial design.</div></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":"245 ","pages":"Article 111205"},"PeriodicalIF":3.4,"publicationDate":"2025-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145840709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astrophysics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
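The surrogate-plus-genetic-algorithm idea in the abstract above can be illustrated with a minimal sketch: a genetic algorithm queries a cheap surrogate function instead of running a finite element analysis for every candidate geometry. Everything concrete here is an assumption made for illustration: `surrogate_error` is a toy analytic stand-in for a trained model, the two parameters `neck_mm` and `cavity_mm` and their ranges are hypothetical, and the selection, crossover and mutation choices are generic rather than the paper's.

```python
import random

def surrogate_error(params):
    """Toy stand-in for a trained surrogate model (NOT the paper's model).

    Returns the squared gap between a made-up 'predicted peak frequency'
    and a 1000 Hz target, so that smaller is better.
    """
    neck_mm, cavity_mm = params
    predicted_peak_hz = 2000.0 * neck_mm / (neck_mm + cavity_mm)  # illustrative only
    return (predicted_peak_hz - 1000.0) ** 2

def genetic_search(fitness, bounds, pop_size=30, generations=40,
                   mutation_scale=0.1, seed=0):
    """Minimise `fitness` over box `bounds` with a small elitist GA."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # truncation selection, elites survive
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = []
            for (lo, hi), g1, g2 in zip(bounds, p1, p2):
                gene = g1 if rng.random() < 0.5 else g2             # uniform crossover
                gene += rng.gauss(0.0, mutation_scale * (hi - lo))  # Gaussian mutation
                child.append(min(hi, max(lo, gene)))                # clip to bounds
            children.append(child)
        population = parents + children
    return min(population, key=fitness)

bounds = [(1.0, 10.0), (1.0, 10.0)]  # hypothetical parameter ranges in mm
best = genetic_search(surrogate_error, bounds)
print("best parameters:", best, "surrogate error:", surrogate_error(best))
```

Because each fitness call is a plain function evaluation, thousands of candidates can be scored quickly; in practice the toy function would be replaced by a model trained on simulation data.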
Pub Date : 2025-12-19 DOI: 10.1016/j.apacoust.2025.111209
Xin Liu, Xiaodong Jing
Closed-section wind tunnels represent a typical reverberant environment where microphone arrays are frequently used to identify noise sources. However, wall reflections can be a serious problem for source localization, especially at lower frequencies or when a source is located close to reflecting walls. To address this problem, a novel inverse method is developed for source localization in a hard-walled rectangular duct, which incorporates wall reflections into the proposed algorithm by using an appropriate rectangular waveguide Green’s function. No flow is considered in this study, which focuses solely on wall reflection effects. Numerical and experimental results obtained with conventional beamforming (CBF), the image source model (ISM) and the present inverse method (IM) are presented and analyzed. Pronounced spurious sidelobes appear on the CBF maps, causing reduced resolution or erroneous source location at low frequencies. The ISM can remove most of the spurious sidelobes, but it still suffers from low resolution at low frequencies and its spatial resolution is direction-dependent. By comparison, the IM shows considerably improved performance in terms of mainlobe width, localization accuracy and sidelobe level, with nearly omnidirectional resolution similar to that obtained under anechoic conditions. It is effective for both coherent and incoherent sound sources. At the low frequency of 500 Hz, it achieves subwavelength resolution by exploiting evanescent modes. Furthermore, the robustness of the IM to noise is examined through both simulations and experiments, showing that it outperforms the other two methods at an SNR of 10 dB.
{"title":"An inverse method for source identification in rectangular waveguides with reverberation","authors":"Xin Liu, Xiaodong Jing","doi":"10.1016/j.apacoust.2025.111209","DOIUrl":"10.1016/j.apacoust.2025.111209","url":null,"abstract":"<div><div>Closed-section wind tunnels represent a typical reverberant environment where microphone arrays are frequently used to identify noise sources. However, wall reflections can be a serious problem for source localization, especially at lower frequencies or when a source is located close to reflecting walls. To address this problem, a novel inverse method is developed for source localization in a hard-walled rectangular duct, which incorporates wall reflections into the proposed algorithm by using an appropriate rectangular waveguide Green’s function. No flow is considered in this study, which focuses solely on wall reflection effects. Numerical and experimental results obtained with conventional beamforming (CBF), the image source model (ISM) and the present inverse method (IM) are presented and analyzed. Pronounced spurious sidelobes appear on the CBF maps, causing reduced resolution or erroneous source location at low frequencies. The ISM can remove most of the spurious sidelobes, but it still suffers from low resolution at low frequencies and its spatial resolution is direction-dependent. By comparison, the IM shows considerably improved performance in terms of mainlobe width, localization accuracy and sidelobe level, with nearly omnidirectional resolution similar to that obtained under anechoic conditions. It is effective for both coherent and incoherent sound sources. At the low frequency of 500 Hz, it achieves subwavelength resolution by exploiting evanescent modes. Furthermore, the robustness of the IM to noise is examined through both simulations and experiments, showing that it outperforms the other two methods at an SNR of 10 dB.</div></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":"245 ","pages":"Article 111209"},"PeriodicalIF":3.4,"publicationDate":"2025-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145798064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Physics and Astrophysics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
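The role of evanescent modes mentioned in the abstract above can be made concrete with a small sketch. In a hard-walled rectangular duct without flow, mode (m, n) has the textbook cut-on frequency f_mn = (c/2) * sqrt((m/a)^2 + (n/b)^2); below that frequency the mode decays along the duct axis instead of propagating. The cross-section of 0.4 m by 0.3 m and the sound speed of 343 m/s are assumptions chosen for illustration, since the source does not state the duct dimensions, and the function names are hypothetical.

```python
import math

def cut_on_frequency(m, n, a, b, c=343.0):
    """Cut-on frequency in Hz of mode (m, n) in a hard-walled a x b duct, no flow."""
    return 0.5 * c * math.hypot(m / a, n / b)

def classify_modes(freq, a, b, c=343.0, max_order=3):
    """Split modes with 0 <= m, n <= max_order into propagating and evanescent."""
    propagating, evanescent = [], []
    for m in range(max_order + 1):
        for n in range(max_order + 1):
            if cut_on_frequency(m, n, a, b, c) < freq:
                propagating.append((m, n))
            else:
                evanescent.append((m, n))
    return propagating, evanescent

# Assumed 0.4 m x 0.3 m cross-section, evaluated at the 500 Hz case from the abstract.
propagating, evanescent = classify_modes(500.0, a=0.4, b=0.3)
print("propagating modes:", propagating)
print("number of evanescent modes considered:", len(evanescent))
```

With these assumed dimensions only the plane wave and the first transverse mode propagate at 500 Hz, so most of the fine spatial detail near a source is carried by evanescent modes, which is the information a waveguide Green's function retains.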