
Journal of the Acoustical Society of America: Latest Articles

Orthogonal time-frequency space modulation for underwater mobile acoustic communications.
IF 2.1; Zone 2, Physics and Astrophysics; Q2, Acoustics; Pub Date: 2025-02-01; DOI: 10.1121/10.0035938
Yukang Xue, Xiyuan Zhu, Y Rosa Zheng

This paper presents a new turbo decision feedback equalizer and decoder (TDFED) for the orthogonal time-frequency space (OTFS) system of underwater mobile acoustic communications where the communication channel suffers from severe multipath and Doppler effects simultaneously. The proposed TDFED employs a set of feedforward and feedback filters in the time domain instead of the common approach that employs a normalized least mean square equalizer in the delay-Doppler domain. The receiver also utilizes low-complexity improved proportionate normalized least mean square channel estimation in the delay-Doppler domain. Practical OTFS modulation schemes are designed for acoustic transmission at a center frequency of 115 kHz and a symbol rate of 11.5 ksps (kilo-symbols-per-second). Several lake experiments in mobile communication scenarios are conducted to evaluate the proposed OTFS in comparison to the single-carrier coherent modulation (SCCM) and the orthogonal frequency division modulation (OFDM) schemes. The experimental results demonstrate that the proposed OTFS receiver effectively reduces the accuracy requirements of the Doppler compensation algorithm compared to the SCCM and OFDM schemes. The proposed TDFED algorithm achieves a much better bit error rate against long-multipath fading and severe Doppler shift than the existing delay-Doppler domain equalizers.

Journal of the Acoustical Society of America, vol. 157, no. 2, pp. 1378-1390 (Journal Article)
Citations: 0
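As a rough illustration of the OTFS modulation named in the abstract above, the following minimal sketch maps a delay-Doppler symbol grid to time-domain samples and back. It assumes rectangular pulses, no cyclic prefix, an ideal channel, and a hypothetical 64 x 16 QPSK grid; the function names are made up for this example, and none of this is taken from the authors' design. Only the 11.5 ksps symbol rate comes from the abstract.

```python
import numpy as np

def otfs_modulate(x_dd):
    """Map an (M, N) delay-Doppler symbol grid to M*N time-domain samples."""
    # ISFFT: FFT along the delay axis (0), inverse FFT along the Doppler axis (1).
    x_tf = np.fft.fft(np.fft.ifft(x_dd, axis=1, norm="ortho"), axis=0, norm="ortho")
    # Heisenberg transform with a rectangular pulse: inverse FFT over frequency
    # for each of the N time slots.
    s = np.fft.ifft(x_tf, axis=0, norm="ortho")
    # Send the N blocks of M samples one after another.
    return s.T.reshape(-1)

def otfs_demodulate(r, M, N):
    """Invert otfs_modulate for an ideal (distortion-free) channel."""
    r_blocks = np.asarray(r).reshape(N, M).T
    # Wigner transform with a rectangular pulse: FFT of each time slot.
    y_tf = np.fft.fft(r_blocks, axis=0, norm="ortho")
    # SFFT: inverse FFT along the frequency axis, FFT along the time axis.
    return np.fft.fft(np.fft.ifft(y_tf, axis=0, norm="ortho"), axis=1, norm="ortho")

# Hypothetical frame: 64 delay bins x 16 Doppler bins of unit-energy QPSK symbols.
M, N = 64, 16
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(2, M, N))
x_dd = ((1 - 2 * bits[0]) + 1j * (1 - 2 * bits[1])) / np.sqrt(2)

s = otfs_modulate(x_dd)
x_hat = otfs_demodulate(s, M, N)
frame_seconds = s.size / 11.5e3  # frame duration at the quoted 11.5 ksps
```

With these choices the two transforms along the delay axis cancel, so the transmit signal reduces to an inverse FFT across the Doppler axis. A real receiver would add Doppler compensation, channel estimation, and equalization between the two functions.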
Inertial effects on single-perforation plates resistivity at high flow rates: Computational fluid dynamics and experimental studies.
IF 2.1; Zone 2, Physics and Astrophysics; Q2, Acoustics; Pub Date: 2025-02-01; DOI: 10.1121/10.0035642
Maël Lopez, Alla Eddine Benchikh Le Hocine, Tenon Charly Kone, Thomas Dupont, Raymond Panneton

This article focuses on the viscous and inertial effects on the airflow resistivity of periodic arrays of single-perforation plates spaced by thin air cavities. Analyzing these effects would provide better insight into losses within the material, including additional losses due to increasing sound excitation levels. To this end, the material pressure drop is predicted by computational fluid dynamics (CFD) as a function of the flow rate for corresponding pore Reynolds numbers between 0.3 and 1500. The static airflow resistivity coefficient is determined from the linear part of the pressure drop (viscous effect) and the Forchheimer coefficient from the nonlinear part of the pressure drop (inertial effect). Both coefficients are determined on the entirety of the material (globally) and at the plate levels (locally). Good agreement is observed between CFD predictions and experimental measurements over the whole range of studied Reynolds numbers. By locally investigating the pressure drops, the observations show that the viscous effects are constant through the material. With increasing pore Reynolds number, inertial effects of the first plate dominate over those of the other plates. The consideration of the local inertial effect will be a key component in the acoustic modeling of this type of material under high sound excitation levels.

Journal of the Acoustical Society of America, vol. 157, no. 2, pp. 1512-1522 (Journal Article)
Citations: 0
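The split of the pressure drop into a linear (viscous) part and a nonlinear (inertial) part described in the abstract above can be sketched as a straight-line fit. This example assumes a Forchheimer-type law of the form dP / L = sigma0 * v + b * v^2, so that the apparent resistivity dP / (L * v) is linear in the velocity v. The function name, the synthetic data, and the normalization of the quadratic coefficient are assumptions of this sketch; the paper may define its Forchheimer coefficient differently.

```python
import numpy as np

def fit_resistivity(velocity, pressure_drop, thickness):
    """Return (sigma0, b) from dP / (L * v) = sigma0 + b * v.

    sigma0: static airflow resistivity (intercept, viscous part)
    b:      coefficient of the quadratic term (slope, inertial part)
    """
    v = np.asarray(velocity, dtype=float)
    dp = np.asarray(pressure_drop, dtype=float)
    apparent = dp / (thickness * v)  # apparent resistivity at each flow rate
    slope, intercept = np.polyfit(v, apparent, 1)
    return intercept, slope

# Synthetic check with made-up values (not data from the paper).
L_sample = 0.05                       # sample thickness in m
v = np.linspace(0.01, 2.0, 25)        # flow velocities in m/s
dp = L_sample * (1000.0 * v + 500.0 * v**2)
sigma0, b = fit_resistivity(v, dp, L_sample)
```

Applying the same fit to the pressure drop across a single plate, rather than across the whole sample, would give local coefficients in the sense used in the abstract.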
Comparing accounts of formant normalization against US English listeners' vowel perception.
IF 2.1; Zone 2, Physics and Astrophysics; Q2, Acoustics; Pub Date: 2025-02-01; DOI: 10.1121/10.0035476
Anna Persson, Santiago Barreda, T Florian Jaeger

Human speech recognition tends to be robust, despite substantial cross-talker variability. Believed to be critical to this ability are auditory normalization mechanisms whereby listeners adapt to individual differences in vocal tract physiology. This study investigates the computations involved in such normalization. Two 8-way alternative forced-choice experiments assessed L1 listeners' categorizations across the entire US English vowel space, both for unaltered and synthesized stimuli. Listeners' responses in these experiments were compared against the predictions of 20 influential normalization accounts that differ starkly in the inference and memory capacities they imply for speech perception. This includes variants of estimation-free transformations into psycho-acoustic spaces, intrinsic normalizations relative to concurrent acoustic properties, and extrinsic normalizations relative to talker-specific statistics. Listeners' responses were best explained by extrinsic normalization, suggesting that listeners learn and store distributional properties of talkers' speech. Specifically, computationally simple (single-parameter) extrinsic normalization best fit listeners' responses. This simple extrinsic normalization also clearly outperformed Lobanov normalization, a computationally more complex account that remains popular in research on phonetics and phonology, sociolinguistics, typology, and language acquisition.

Journal of the Acoustical Society of America, vol. 157, no. 2, pp. 1458-1482 (Journal Article)
Citations: 0
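To make the contrast in the abstract above concrete, the sketch below implements Lobanov normalization (a per-formant z-score within each talker) next to one example of a single-parameter extrinsic account. The abstract does not say which single-parameter account performed best, so the log-mean scheme shown here, in the style of uniform scaling, is an assumption chosen for illustration. The function names and formant values are made up.

```python
import numpy as np

def lobanov(formants):
    """Z-score each formant within one talker.

    formants: (n_tokens, n_formants) array in Hz for a single talker.
    Needs a mean and a standard deviation per formant.
    """
    f = np.asarray(formants, dtype=float)
    return (f - f.mean(axis=0)) / f.std(axis=0, ddof=1)

def uniform_log_scaling(formants):
    """Subtract one talker-level constant, the mean of all log formants.

    Needs a single parameter per talker.
    """
    log_f = np.log(np.asarray(formants, dtype=float))
    return log_f - log_f.mean()

# Made-up F1/F2 values (Hz) for four vowel tokens from one talker.
talker = np.array([[300.0, 2300.0],
                   [700.0, 1200.0],
                   [350.0,  900.0],
                   [550.0, 1800.0]])
z = lobanov(talker)
u = uniform_log_scaling(talker)
# A talker whose formants are all 20% higher maps to the same normalized values.
u_scaled = uniform_log_scaling(1.2 * talker)
```

The last line shows why the single-parameter scheme can absorb an overall vocal-tract scaling difference: a constant multiplicative factor becomes an additive offset in log space and is removed by the talker mean.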
Magnified interaural level differences enhance binaural unmasking in bilateral cochlear implant users.
IF 2.1; Zone 2, Physics and Astrophysics; Q2, Acoustics; Pub Date: 2025-02-01; DOI: 10.1121/10.0034869
Benjamin N Richardson, Jana M Kainerstorfer, Barbara G Shinn-Cunningham, Christopher A Brown

Bilateral cochlear implant (BiCI) usage makes binaural benefits a possibility for implant users. Yet for BiCI users, limited access to interaural time difference (ITD) cues and reduced saliency of interaural level difference (ILD) cues restrict the perceptual benefits of spatially separating a target from masker sounds. The present study explored whether magnifying ILD cues improves intelligibility of masked speech for BiCI listeners in a "symmetrical-masker" configuration, which ensures that neither ear benefits from a long-term positive target-to-masker ratio (TMR) due to naturally occurring ILD cues. ILD magnification estimates moment-to-moment ITDs in octave-wide frequency bands, and applies corresponding ILDs to the target-masker mixtures reaching the two ears at each specific time and frequency band. ILD magnification significantly improved intelligibility in two experiments: one with normal hearing (NH) listeners using vocoded stimuli and one with BiCI users. BiCI listeners showed no benefit of spatial separation between target and maskers with natural ILDs, even for the largest target-masker separation. Because ILD magnification relies on and manipulates only the mixed signals at each ear, the strategy never alters the monaural TMR in either ear at any time. Thus, the observed improvements to masked speech intelligibility come from binaural effects, likely from increased perceptual separation of the competing sources.

Journal of the Acoustical Society of America, vol. 157, no. 2, pp. 1045-1056 (Journal Article)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11817532/pdf/
Citations: 0
Understanding and mitigating the impact of passing ships on underwater environmental estimation from ambient sound.
IF 2.1; Zone 2, Physics and Astrophysics; Q2, Acoustics; Pub Date: 2025-02-01; DOI: 10.1121/10.0035643
John Lipor, John Gebbie, Martin Siderius

We investigate the impact of low-rank interference on the problem of distinguishing between two seabed types using ambient sound as an acoustic source. The resulting frequency-domain snapshots follow a zero-mean, circularly-symmetric Gaussian distribution, where each seabed type has a unique covariance matrix. Detecting changes in the seabed type across distinct spatial locations can be formulated as a two-sample hypothesis test for equality of covariance, for which Box's M-test is the classical solution. Interference sources such as passing ships result in additive noise with a low-rank covariance that can reduce the performance of hypothesis testing. We first present a method to construct a worst-case interference field, making hypothesis testing as difficult as possible. We then provide an alternating optimization procedure to recover the interference-free covariance matrix. Experiments on synthetic data show that the optimized interferer can greatly reduce hypothesis testing performance, while our recovery method perfectly eliminates this interference for a sufficiently small interference rank. On real data from the New England Shelf Break Acoustics experiment, we show that our approach successfully mitigates interference, allowing for accurate hypothesis testing and improving bottom loss estimation.

Journal of the Acoustical Society of America, vol. 157, no. 2, pp. 811-823 (Journal Article)
Citations: 0
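The two-sample test for equality of covariance mentioned in the abstract above can be sketched with the textbook form of Box's M statistic. This version assumes real-valued samples and uses the usual chi-square correction factor; the paper works with complex, circularly-symmetric Gaussian snapshots, so this is an illustration of the classical test only, not the authors' implementation. The function name and synthetic data are made up.

```python
import numpy as np

def box_m_two_sample(x, y):
    """Box's M-test statistic for equal covariance of two real-valued samples.

    x, y: arrays of shape (n_samples, p).
    Returns (statistic, dof); under the null hypothesis the statistic is
    approximately chi-square distributed with dof degrees of freedom.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, p = x.shape
    n2 = y.shape[0]
    s1 = np.cov(x, rowvar=False)  # unbiased sample covariances
    s2 = np.cov(y, rowvar=False)
    dof_pooled = n1 + n2 - 2
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / dof_pooled

    def logdet(a):
        return np.linalg.slogdet(a)[1]

    m = dof_pooled * logdet(pooled) - (n1 - 1) * logdet(s1) - (n2 - 1) * logdet(s2)
    # Box's small-sample correction for k = 2 groups.
    c = (2 * p**2 + 3 * p - 1) / (6 * (p + 1)) * (
        1 / (n1 - 1) + 1 / (n2 - 1) - 1 / dof_pooled
    )
    return (1 - c) * m, p * (p + 1) // 2

rng = np.random.default_rng(1)
a = rng.standard_normal((500, 3))
b = 3.0 * rng.standard_normal((500, 3))   # same mean, larger covariance
stat_same, dof = box_m_two_sample(a, a.copy())
stat_diff, _ = box_m_two_sample(a, b)
```

A low-rank interferer added to one sample would change its covariance and inflate the statistic even when the underlying seabed covariance is unchanged, which is the failure mode the abstract describes.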
Development of a rapid automated binaural detection task.
IF 2.1; Zone 2, Physics and Astrophysics; Q2, Acoustics; Pub Date: 2025-02-01; DOI: 10.1121/10.0035644
Daniel E Shub, Ken W Grant, Douglas S Brungart

Three different automated procedures for efficiently measuring the 500 Hz N0Sπ binaural tone detection threshold of individuals with normal audiometric thresholds were tested with 6803 subjects, and the results were compared with an automated version of an existing clinical procedure. Two of these procedures resulted in substantially reduced binaural detection performance and caused a notable decrease in the reliability of the behavior of the subjects. The remaining procedure was an 18-trial yes/no procedure in which the difficulty of the trials varied in a non-monotonic manner. This procedure not only has fewer trials than the existing clinical procedure but also improves the reliability of the threshold estimate. The average difference in threshold between this procedure and the clinical procedure was only -0.67 dB, which is likely not clinically significant. Further, only 0.35% of the 6208 subjects tested with the non-monotonic-order procedure were unable to complete the fully automated test, which was better than the failure rate for the automated version of the clinical procedure. With such a low failure rate, the modified procedure appears suitable for use as a rapid tool for helping to detect functional hearing deficits that are not captured by the audiogram.

Journal of the Acoustical Society of America, vol. 157, no. 2, pp. 1276-1289 (Journal Article)
Citations: 0
The multimode coupled vibration transducer with the higher emission performances at resonant frequencies.
IF 2.1; Zone 2, Physics and Astrophysics; Q2, Acoustics; Pub Date: 2025-02-01; DOI: 10.1121/10.0035568
Yihao Chen, Shuyu Lin

The majority of existing piezoelectric transducers work at a single resonant frequency, and their applications in scenarios with multi-frequency or frequency variation are not fully considered. Moreover, emitting high-energy ultrasound at different frequencies is also crucial. Here, we propose a three-frequency coupled vibration piezoelectric transducer, which exhibits higher emission performance. The proposed transducer comprises two rectangular piezoelectric ceramics, which are cut from a single piece of rectangular piezoelectric ceramic. We derive the three-dimensional coupled vibration electromechanical equivalent circuit of the proposed transducer. Then, the characteristics of the transducer are numerically simulated. Comparison experiments between the proposed transducer and a transducer made from a single piece of rectangular piezoelectric ceramic were conducted. An ultrasonic water tank measurement system was used to measure their sound field, axial sound pressure, and transmitting voltage response. The measured electromechanical and sound field characteristics of the transducers are in good agreement with the simulated results and theoretical predictions. The proposed transducer can generate stable and stronger ultrasonic waves at three resonant frequencies. This study can provide theoretical and experimental references for multi-frequency conversion and high-energy ultrasonic radiation of the transducer.

Journal of the Acoustical Society of America, vol. 157, no. 2, pp. 1307-1321 (Journal Article)
Citations: 0
Spatial-temporal activity-informed diarization and separation.
IF 2.1; Zone 2, Physics and Astrophysics; Q2, Acoustics; Pub Date: 2025-02-01; DOI: 10.1121/10.0035830
Yicheng Hsu, Ssuhan Chen, Yuhsin Lai, Chingyen Wang, Mingsian R Bai

A robust multichannel speaker diarization and separation system is proposed by exploiting the spatiotemporal activity of the speakers. The system is realized in a hybrid architecture that combines array signal processing units and deep learning units. For speaker diarization, a spatial coherence matrix across time frames is computed based on the whitened Relative Transfer Functions of the microphone array. This serves as a robust feature for subsequent machine learning without the need for prior knowledge of the array configuration. A computationally efficient modified End-to-End Neural Diarization system in the Encoder-Decoder-based Attractor network is constructed to estimate the speaker activity from the spatial coherence matrix. For speaker separation, we propose the Global and Local Activity-driven Speaker Extraction network to separate speaker signals via speaker-specific global and local spatial activity functions. The local spatial activity functions depend on the coherence between the whitened Relative Transfer Functions of each time-frequency bin and the target speaker-dominant bins. The global spatial activity functions are computed from the global spatial coherence functions based on frequency-averaged local spatial activity functions. Experimental results have demonstrated superior speaker diarization, counting, and separation performance achieved by the proposed system with low computational complexity compared to the pre-selected baselines.

Journal of the Acoustical Society of America, vol. 157, no. 2, pp. 1162-1175 (Journal Article)
Citations: 0
Characterization of fibrous media transport parameters from multi-compression-ratio measurements of normal incidence sound absorption.
IF 2.1; Zone 2, Physics and Astrophysics; Q2, Acoustics; Pub Date: 2025-02-01; DOI: 10.1121/10.0035847
Andrea Santoni, Francesco Pompoli, Cristina Marescotti, Patrizio Fausti

This study presents a novel approach for estimating the transport parameters that characterize the acoustic behavior of fibrous materials using the Johnson-Champoux-Allard equivalent fluid model. We propose an inversion technique, based on an optimization algorithm, to fit the Johnson-Champoux-Allard model's predictions of normal incidence sound absorption coefficient to multi-compression-ratio experimental data. Experimental measurements using the two-microphone technique within an impedance tube are conducted on fibrous material samples tested at various compression ratios. Optimization is performed using both a non-linear programming solver and a genetic algorithm. Validation of the proposed method shows good agreement with well-established techniques and demonstrates its effectiveness across a range of fibrous materials. A sensitivity analysis emphasizes the importance of selecting appropriate boundaries for the search space in the optimization process. To enhance the robustness of optimization, a two-step iterative procedure is proposed. This straightforward methodology offers a robust and reliable framework for characterizing the transport properties of fibrous materials. Its ease of implementation and accuracy make it a valuable tool for enhancing material design and optimization in acoustic engineering.

Citations: 0
Acoustic vector sensor based multi-sources localization in reverberant environment using acoustic polarization state analysis.
IF 2.1 Zone 2 Physics and Astrophysics Q2 ACOUSTICS Pub Date: 2025-02-01 DOI: 10.1121/10.0035816
Yuan Sun, Hao Ge, Bei Wang, Kai Wang, Xiang-Yuan Xu, Ming-Hui Lu, Yan-Feng Chen

Estimating the direction of arrival (DOA) under real-world conditions poses a significant challenge, as reverberations can lead to erroneous information. We note that the direct-path component and the reverberant components of sound exhibit distinct polarization states within the acoustic particle velocity field. Based on this observation, we propose and experimentally verify a method for localizing multiple sources in reverberant environments using a single acoustic vector sensor (AVS). The measurement of polarization states via AVS enables the identification of time-frequency bins primarily influenced by the direct-path component within the time-frequency domain, which are subsequently utilized for DOA estimation. Our study offers a novel perspective on sound field detection and may catalyze future applications including de-reverberation and the determination of environmental geometric parameters.
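
A toy sketch of the bin-selection idea follows. The abstract does not say how the polarization state is quantified, so this example assumes a Stokes-type circularity measure on the two in-plane velocity components, an arbitrary threshold of 0.1, ideal plane waves with ρc normalised to 1, and an intensity-based DOA per bin. All names are hypothetical, and this is not the authors' algorithm. A direct-path bin has velocity components in phase (linear polarization), while a bin that also contains a phase-shifted reflection from another direction becomes elliptical.

```python
import cmath
import math


def plane_wave_bin(azimuth, pressure):
    """(p, vx, vy) of one time-frequency bin for a plane wave from `azimuth`."""
    # The wave travels away from the source, hence the minus signs (rho*c = 1).
    return pressure, -math.cos(azimuth) * pressure, -math.sin(azimuth) * pressure


def circularity(vx, vy):
    """Assumed measure: 0 for linear polarization, +/-1 for circular."""
    power = abs(vx) ** 2 + abs(vy) ** 2
    return 2.0 * (vx.conjugate() * vy).imag / power


def bin_doa(p, vx, vy):
    """Azimuth pointing back along the active intensity vector Re(p* v)."""
    ix = (p.conjugate() * vx).real
    iy = (p.conjugate() * vy).real
    return math.atan2(-iy, -ix)


def add_bins(a, b):
    """Superpose two wave contributions in the same bin."""
    return tuple(x + y for x, y in zip(a, b))


SOURCE = math.radians(40.0)    # true source azimuth
WALL = math.radians(130.0)     # apparent direction of a reflection

# Three bins dominated by the direct path, with arbitrary phases.
direct_bins = [plane_wave_bin(SOURCE, cmath.exp(1j * ph)) for ph in (0.0, 1.0, 2.5)]
# One bin where a reflection arrives 90 degrees out of phase with the direct sound.
reverberant_bins = [
    add_bins(plane_wave_bin(SOURCE, 1.0 + 0j), plane_wave_bin(WALL, 0.8j))
]
bins = direct_bins + reverberant_bins

# Keep only bins whose velocity field is (nearly) linearly polarized.
selected = [b for b in bins if abs(circularity(b[1], b[2])) < 0.1]

# Circular mean of the per-bin DOA estimates from the selected bins.
estimate = math.atan2(
    sum(math.sin(bin_doa(*b)) for b in selected),
    sum(math.cos(bin_doa(*b)) for b in selected),
)
print(round(math.degrees(estimate), 3))
```

In this setup the reverberant bin, if used on its own, would point well away from the source, while the bins kept by the polarization test recover the 40° azimuth.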

Citations: 0