Xianpeng Li, Yupeng Tai, Haibin Wang, Jun Wang, Shuo Jia, Yonglin Zhang, Weiming Gan
Underwater acoustic communication signals suffer from time dispersion due to time-varying multipath propagation in the ocean. This leads to intersymbol interference, which in turn degrades the performance of the communication system. Typically, channel correlation functions are employed to describe these characteristics. In this paper, a metric called the channel average correlation coefficient (CACC), derived from the correlation function, is proposed to quantify the time-varying characteristics; theoretically, it is negatively related to communication performance. Comparative analysis involving simulations and experimental data processing highlights the superior effectiveness of CACC over the traditional metric, the channel coherence time.
"A metric to quantify the time-varying characteristics of underwater acoustic communication channels," JASA Express Letters 4(7), July 2024. https://doi.org/10.1121/10.0026601
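The abstract does not give the paper's exact definition of CACC; a minimal sketch, assuming CACC averages the normalized correlation between successive channel impulse-response estimates, is:

```python
import numpy as np

def cacc(snapshots):
    """Channel average correlation coefficient (hypothetical sketch).

    `snapshots` is a 2-D array whose rows are successive channel
    impulse-response estimates. The result averages the normalized
    correlation between each pair of consecutive rows, so a static
    channel scores 1.0 and a rapidly varying one scores near 0.
    """
    H = np.atleast_2d(np.asarray(snapshots, dtype=float))
    num = np.abs(np.sum(H[:-1] * H[1:], axis=1))
    den = np.linalg.norm(H[:-1], axis=1) * np.linalg.norm(H[1:], axis=1)
    return float(np.mean(num / den))
```

Under this reading, a time-invariant channel gives cacc = 1.0 and orthogonal successive responses give 0.0, consistent with the claimed negative relationship between time variability and communication performance.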
Researchers often normalize the radiation force on spheres in standing waves in inviscid fluids using an acoustic contrast factor (typically denoted by Φ) that is independent of kR, where k is the wavenumber and R is the sphere radius. An alternative normalization uses a function Ys that depends on kR. Here, standard results for Φ are extended as a power series in kR using prior Ys results. New terms are also found for fluid spheres and applied to the kR dependence of Φ for strongly responsive and weakly responsive examples. Partial-wave phase shifts are used in the derivation.
Philip L Marston, "Contrast factor for standing-wave radiation forces on spheres: Series expansion in powers of sphere radius," JASA Express Letters 4(7), July 2024. https://doi.org/10.1121/10.0027928
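For reference, the kR-independent contrast factor commonly used in the small-sphere limit (here in the monopole/dipole convention of the standard acoustofluidics literature; the kR series corrections derived in the paper are not reproduced) is:

```python
def contrast_factor(rho_ratio, kappa_ratio):
    """Small-sphere (kR -> 0) acoustic contrast factor for a standing wave.

    rho_ratio   = particle density / fluid density
    kappa_ratio = particle compressibility / fluid compressibility

    Phi = f1/3 + f2/2, with monopole coefficient f1 = 1 - kappa_ratio
    and dipole coefficient f2 = 2*(rho_ratio - 1)/(2*rho_ratio + 1).
    """
    f1 = 1.0 - kappa_ratio
    f2 = 2.0 * (rho_ratio - 1.0) / (2.0 * rho_ratio + 1.0)
    return f1 / 3.0 + f2 / 2.0
```

A particle acoustically identical to the fluid (both ratios 1) has Φ = 0 and feels no standing-wave force; stiff, dense particles have Φ > 0 and migrate toward pressure nodes.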
Xianhui Wang, Jonathan Ge, Leo Meller, Ye Yang, Fan-Gang Zeng
Although the telephone band (0.3-3 kHz) provides sufficient information for speech recognition, the contribution of the non-telephone band (<0.3 and >3 kHz) is unclear. To investigate its contribution, speech intelligibility and talker identification were evaluated using consonants, vowels, and sentences. The non-telephone band produced relatively good intelligibility for consonants (76.0%) and sentences (77.4%), but not vowels (11.5%). The non-telephone band supported good talker identification only with sentences (74.5%), but not vowels (45.8%) or consonants (10.8%). Furthermore, the non-telephone band cannot produce satisfactory speech intelligibility in noise at the sentence level, suggesting the importance of full-band access in realistic listening.
"Speech intelligibility and talker identification with non-telephone frequencies," JASA Express Letters 4(7), July 2024. https://doi.org/10.1121/10.0027938
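The abstract does not describe how the band-limited stimuli were constructed; a minimal sketch that isolates the non-telephone band by zeroing the 0.3-3 kHz components in the frequency domain (an assumed, idealized brick-wall mask, not necessarily the filters the study used) is:

```python
import numpy as np

def non_telephone_band(signal, fs, lo=300.0, hi=3000.0):
    """Remove the telephone band (lo..hi Hz), keeping everything else.

    Idealized brick-wall FFT mask for illustration only: zero the
    magnitude of every frequency bin inside the telephone band, then
    transform back to the time domain.
    """
    x = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs >= lo) & (freqs <= hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))
```

Applied to speech, the output retains only the <0.3 and >3 kHz content whose contribution the study measures.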
The peaked cochlear tonotopic response does not show the typical phenomenology of a resonant system. Simulations of a 2-D viscous model show that the position of the peak is determined by the competition between a sharp pressure boost due to the increase in the real part of the wavenumber as the forward wave enters the short-wave region, and a sudden increase in the viscous losses, partly counteracted by the input power provided by the outer hair cells. This viewpoint also explains the peculiar experimental behavior of the cochlear admittance (broadly tuned and almost level-independent) in the peak region.
Renata Sisto and Arturo Moleti, "The tonotopic cochlea puzzle: A resonant transmission line with a \"non-resonant\" response peak," JASA Express Letters 4(7), July 2024. https://doi.org/10.1121/10.0028020
Jay C Spendlove, Tracianne B Neilsen, Mark K Transtrum
The model manifold, an information geometry tool, is a geometric representation of a model that can quantify the expected information content of modeling parameters. For a normal-mode sound propagation model in a shallow ocean environment, transmission loss (TL) is calculated for a vertical line array, and model manifolds are constructed for both absolute and relative TL. For the example presented in this paper, relative TL yields more compact model manifolds, in which seabed environments are less statistically distinguishable than in the manifolds of absolute TL. This example illustrates how model manifolds can be used to improve experimental design for inverse problems.
"Information geometry analysis example for absolute and relative transmission loss in a shallow ocean," JASA Express Letters 4(7), July 2024. https://doi.org/10.1121/10.0026449
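A toy sketch makes the absolute/relative TL distinction concrete (the paper uses a normal-mode model; the reference convention below, removing the array-mean level, is an assumption, since the abstract does not state the paper's choice):

```python
import numpy as np

def tl_vector(pressures):
    """Absolute transmission loss (dB) at each array element,
    computed from the complex acoustic pressures."""
    return -20.0 * np.log10(np.abs(np.asarray(pressures)))

def relative_tl(tl):
    """Relative TL: absolute TL with the array-mean level removed,
    discarding overall-level information (one assumed convention)."""
    tl = np.asarray(tl, dtype=float)
    return tl - tl.mean(axis=-1, keepdims=True)
```

Sampling seabed parameters, computing one TL vector per sample, and treating the resulting point cloud as the model manifold mirrors the abstract's comparison: distances between points measure how statistically distinguishable two environments are, and removing the mean level shrinks those distances, compacting the manifold.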
Elastic periodic lattices act as mechanical filters of incident vibrations. By and large, they forbid wave propagation within bandgaps and resonate outside them. However, they often encounter "truncation resonances" (TRs) inside bandgaps when certain conditions are met. In this study, we show that the extent of unit cell asymmetry, its mass and stiffness contrasts, and the boundary conditions all play a role in the TR location and wave profile. The work is experimentally supported via two examples that validate the methodology, and a set of design charts is provided as a blueprint for selective TR placement in diatomic lattices.
Hasan B Al Ba'ba'a, Hosam Yousef, and Mostafa Nouh, "A blueprint for truncation resonance placement in elastic diatomic lattices with unit cell asymmetry," JASA Express Letters 4(7), July 2024. https://doi.org/10.1121/10.0027939
Age-related changes in auditory processing may reduce physiological coding of acoustic cues, contributing to older adults' difficulty perceiving speech in background noise. This study investigated whether older adults differed from young adults in patterns of acoustic cue weighting for categorizing vowels in quiet and in noise. All participants relied primarily on spectral quality to categorize /ɛ/ and /æ/ sounds under both listening conditions. However, relative to young adults, older adults exhibited greater reliance on duration and less reliance on spectral quality. These results suggest that aging alters patterns of perceptual cue weights that may influence speech recognition abilities.
Mishaela DiNino, "Age and masking effects on acoustic cues for vowel categorization," JASA Express Letters 4(6), June 2024. https://doi.org/10.1121/10.0026371
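Cue weights in studies of this kind are commonly estimated as logistic-regression coefficients predicting a listener's category responses from the standardized cues; a minimal gradient-descent sketch, with hypothetical duration and spectral predictors (the study's actual analysis may differ), is:

```python
import numpy as np

def cue_weights(duration, spectral, responses, steps=5000, lr=0.1):
    """Fit logistic regression P(category) ~ duration + spectral and
    return the two slope coefficients; standardizing the inputs makes
    the coefficient magnitudes comparable as cue weights."""
    X = np.column_stack([duration, spectral]).astype(float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    X = np.column_stack([np.ones(len(X)), X])  # add intercept column
    y = np.asarray(responses, dtype=float)
    w = np.zeros(3)
    for _ in range(steps):
        z = np.clip(X @ w, -30.0, 30.0)        # avoid exp overflow
        p = 1.0 / (1.0 + np.exp(-z))
        w -= lr * X.T @ (p - y) / len(y)       # gradient step
    return w[1], w[2]  # (duration weight, spectral weight)
```

A listener who relies mainly on spectral quality yields a spectral coefficient that dominates the duration coefficient; the study's finding amounts to that balance shifting toward duration in older adults.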
Angela Cooper, Matthew Eitel, Natalie Fecher, Elizabeth Johnson, Laura K Cirelli
Singing is socially important but constrains voice acoustics, potentially masking certain aspects of vocal identity. Little is known about how well listeners extract talker details from sung speech or identify talkers across the sung and spoken modalities. Here, listeners (n = 149) were trained to recognize sung or spoken voices and then tested on their identification of these voices in both modalities. Learning vocal identities was initially easier through speech than song. At test, cross-modality voice recognition was above chance, but weaker than within-modality recognition. We conclude that talker information is accessible in sung speech, despite acoustic constraints in song.
"Who is singing? Voice recognition from spoken versus sung speech," JASA Express Letters 4(6), June 2024. https://doi.org/10.1121/10.0026385
This study investigated the acoustic cue weighting of the Korean stop contrast in the perception and production of speakers who moved from a nonstandard dialect region to the standard dialect region, Seoul. Through comparing these mobile speakers with data from nonmobile speakers in Seoul and their home region, it was found that the speakers shifted their cue weighting in perception and production to some degree, but also retained some subphonemic features of their home dialect in production. The implications of these results for the role of dialect prestige and awareness in second dialect acquisition are discussed.
Hyunjung Lee, Eun Jong Kong, and Jeffrey J Holliday, "The perception and production of Korean stops in second dialect acquisition," JASA Express Letters 4(6), June 2024. https://doi.org/10.1121/10.0026374
The automatic classification of phonation types in singing voice is essential for tasks such as identification of singing style. In this study, wavelet scattering network (WSN)-based features are proposed for the classification of phonation types in singing voice. The WSN, which closely parallels auditory physiological models, generates acoustic features that characterize information related to pitch, formants, and timbre. Hence, WSN-based features can effectively capture the discriminative information across phonation types in singing voice. The experimental results show that the proposed WSN-based features improved phonation classification accuracy by at least 9% compared to state-of-the-art features.
Kiran Reddy Mittapalle and Paavo Alku, "Classification of phonation types in singing voice using wavelet scattering network-based features," JASA Express Letters 4(6), June 2024. https://doi.org/10.1121/10.0026241
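Full wavelet scattering implementations exist (e.g. Kymatio's Scattering1D); a dependency-free first-order sketch of the same idea, using a simple Gaussian band-pass filter bank with hypothetical center frequencies rather than the paper's exact front end, is:

```python
import numpy as np

def scattering_order1(signal, fs, centers=(250, 500, 1000, 2000, 4000), q=4.0):
    """First-order scattering-style features: analytic band-pass
    filtering (FFT-domain Gaussian), complex modulus, then time
    averaging. Returns one nonnegative coefficient per center
    frequency, capturing the band-wise envelope energy."""
    x = np.asarray(signal, dtype=float)
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)
    feats = []
    for fc in centers:
        bw = fc / q
        # analytic band-pass: Gaussian around fc, positive freqs only
        h = np.exp(-0.5 * ((freqs - fc) / bw) ** 2) * (freqs > 0)
        band = np.fft.ifft(spectrum * h)
        feats.append(np.abs(band).mean())  # modulus + low-pass (global mean)
    return np.array(feats)
```

The modulus-then-average step is what makes scattering features stable to small time shifts and pitch jitter, which is why they suit phonation-type cues that live in envelope and timbre structure.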