A data-driven assessment of harmony in Quebec French [e] and [ε]
Josiane Riverin-Coutlée, Michele Gubian
DOI: 10.1121/10.0025831

This study is concerned with the aperture of the mid vowel /E/ in nonfinal syllables in Quebec French. The hypothesis tested is that in underived disyllabic words, the aperture of /E/ is determined via harmony with the following vowel. Based on predictions from a classifier trained on acoustic properties of word-final vowels, nonfinal vowels were labeled as mid-close or mid-open. Although distant coarticulatory effects were observed, the harmony hypothesis was not supported. The results instead revealed a bias toward a mid-open quality and a reduced acoustic distinction, both of which warrant further investigation.
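The labeling step can be illustrated with a minimal sketch: fit a one-dimensional Gaussian per aperture class to word-final tokens, then label nonfinal tokens by likelihood. This is an illustration only, not the authors' classifier; the F1 values and the single-feature design are invented for the example.

```python
import math

def train_gaussian(values):
    """Fit a 1-D Gaussian (mean, sample variance) to a list of F1 values in Hz."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var

def log_likelihood(x, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

# Hypothetical word-final F1 values (Hz): mid-close [e] is lower, mid-open [ε] higher.
final_e = [390, 410, 400, 420, 395]
final_eps = [560, 580, 600, 570, 590]

params = {"mid-close": train_gaussian(final_e), "mid-open": train_gaussian(final_eps)}

def label(f1):
    """Label a nonfinal vowel token with the class of higher likelihood."""
    return max(params, key=lambda c: log_likelihood(f1, *params[c]))

print(label(405))  # mid-close
print(label(575))  # mid-open
```

In the real study the classifier would be trained on richer acoustic features than a single F1 value, but the train-on-final, predict-on-nonfinal logic is the same.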
Temporal pitch matching with bilateral cochlear implants
Justin M Aronoff, Simin Soleimanifar, Prajna Bk
DOI: 10.1121/10.0025507

Interaural pitch matching is a common task used with bilateral cochlear implant (CI) users, although studies measuring it have largely focused on place-based pitch matches. Temporal-based pitch also plays an important role in CI users' perception, but interaural temporal-based pitch matching has not been well characterized. To investigate this, bilateral CI users were asked to match the amplitude modulation frequencies of stimulation across ears. Comparisons were made to previous place-based pitch matching data collected using similar procedures. The results indicate that temporal-based pitch matching is particularly sensitive to the choice of reference ear.
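Stimuli for such a matching task can be approximated by sinusoidal amplitude modulation of a carrier. A hedged sketch follows; the carrier frequency, modulation depth, and rates are invented, and actual CI stimulation modulates electrical pulse trains rather than acoustic tones.

```python
import math

def am_signal(fm, fc=1000.0, depth=1.0, dur=0.5, fs=16000):
    """Sinusoidally amplitude-modulated tone:
    (1 + depth*sin(2*pi*fm*t)) * sin(2*pi*fc*t), sampled at fs Hz."""
    n = int(dur * fs)
    return [(1.0 + depth * math.sin(2 * math.pi * fm * i / fs))
            * math.sin(2 * math.pi * fc * i / fs) for i in range(n)]

# The reference ear gets a fixed modulation rate; the comparison ear's rate
# is adjusted until the listener judges the two temporal pitches as equal.
reference = am_signal(fm=100.0)
comparison = am_signal(fm=120.0)
print(len(reference))  # 8000 samples
```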
Geometric stochastic ray propagation using the special Euclidean group
Tyler Paine, E. Bhatt
DOI: 10.1121/10.0025522

This paper describes a stochastic model of ray trajectory propagation through a medium, such as the ocean, whose sound speed profile is uncertain. We frame ray propagation as a geometric fractal Brownian motion process on the special Euclidean group of dimension two, SE(2). The framing includes diffusion parameters that describe how the stochastic rays deviate from the expected rays, and these diffusion parameters are a function of the uncertainty in the sound speed profile. We demonstrate the framing for the classical Munk profile and a double-ducted profile in the Beaufort Sea.
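A minimal sketch of the underlying idea, not the paper's SE(2) formulation: Euler-integrate the standard 2-D ray equations through the textbook Munk profile and perturb the heading with Brownian noise at each arc-length step. The diffusion coefficient `sigma` and the step sizes are invented for illustration.

```python
import math
import random

def munk_c(z, z1=1300.0, B=1300.0, eps=0.00737):
    """Textbook Munk sound speed profile (m/s); z in metres, positive down."""
    zt = 2.0 * (z - z1) / B
    return 1500.0 * (1.0 + eps * (zt - 1.0 + math.exp(-zt)))

def dcdz(z, h=0.1):
    """Central-difference sound speed gradient."""
    return (munk_c(z + h) - munk_c(z - h)) / (2.0 * h)

def propagate(z0=1000.0, theta0=0.05, ds=10.0, n_steps=2000, sigma=0.0, seed=0):
    """Euler integration of the 2-D ray equations
    dx/ds = cos(th), dz/ds = sin(th), dth/ds = -(c'(z)/c) cos(th),
    with optional Brownian heading noise of strength sigma (hypothetical)."""
    rng = random.Random(seed)
    x, z, th = 0.0, z0, theta0          # a pose (x, z, th) is an element of SE(2)
    path = [(x, z)]
    for _ in range(n_steps):
        x += ds * math.cos(th)
        z += ds * math.sin(th)
        th += -ds * math.cos(th) * dcdz(z) / munk_c(z)
        th += sigma * math.sqrt(ds) * rng.gauss(0.0, 1.0)  # stochastic perturbation
        path.append((x, z))
    return path

deterministic = propagate()             # expected ray
noisy = propagate(sigma=1e-4)           # one stochastic realization
print(len(deterministic), noisy[-1])
```

With these launch parameters the ray refracts toward the sound channel axis near 1300 m and oscillates about it; increasing `sigma` spreads the stochastic realizations around that expected ray.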
Sex differences in vocal behavior in virtual rooms compared to real rooms
Georgios Papadimitriou, Jonas Brunskog, Franz M Heuchel, Viveka Lyberg Åhlander, Greta Öhlund Wistbacka
DOI: 10.1121/10.0025523

This study investigates speech production under various room acoustic conditions in virtual environments by comparing vocal behavior and the subjective experience of speaking in four real rooms and their audio-visual virtual replicas. Sex differences were explored. Males and females (N = 13) adjusted their voice levels similarly to room acoustic changes in the real rooms, but only males did so in the virtual rooms. Females, however, rated the visual virtual environment as more realistic than males did. This suggests a discrepancy between the sexes regarding the experience of realism in a virtual environment and changes in objective behavioral measures such as voice level.
Lion city soundscapes: Modified partitioning around medoids for a perceptually diverse dataset of Singaporean soundscapes
Kenneth Ooi, Jessie Goh, Hao-Weng Lin, Zhen-Ting Ong, Trevor Wong, Karn N. Watcharasupat, Bhan Lam, Woon-Seng Gan
DOI: 10.1121/10.0025830

This study presents a dataset of audio-visual soundscape recordings at 62 locations in Singapore, initially made as full-length recordings spanning 9-38 min. For consistency, and to reduce listener fatigue in future subjective studies, one-minute excerpts were cropped from the full-length recordings. An automated method using pre-trained models for Pleasantness and Eventfulness (according to ISO 12913) in a modified partitioning around medoids algorithm was employed to generate the set of excerpts, balancing the need to encompass the perceptual space against uniformity of distribution. A validation study confirmed that the method adheres to the intended design.
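A naive partitioning around medoids, shown here without the authors' modifications, illustrates how medoid selection spreads representatives over a perceptual space; the (Pleasantness, Eventfulness) coordinates below are invented.

```python
import random

def pam(points, k, seed=0):
    """Naive partitioning around medoids: greedily swap a medoid for a
    non-medoid while the total distance to the nearest medoid decreases."""
    rng = random.Random(seed)
    d = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    def cost(meds):
        return sum(min(d(p, points[m]) for m in meds) for p in points)
    medoids = rng.sample(range(len(points)), k)
    best = cost(medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for j in range(len(points)):
                if j in medoids:
                    continue
                trial = medoids[:i] + [j] + medoids[i + 1:]
                c = cost(trial)
                if c < best:
                    best, medoids, improved = c, trial, True
    return sorted(medoids), best

# Hypothetical (Pleasantness, Eventfulness) scores in [-1, 1] for six excerpts:
# two calm-pleasant, two lively-pleasant, two neutral.
scores = [(-0.8, -0.7), (-0.75, -0.6), (0.7, 0.8), (0.75, 0.7), (0.0, 0.1), (0.05, 0.0)]
medoids, total = pam(scores, k=3)
print(medoids)
```

With k = 3 the swap loop settles on one excerpt per cluster, which is the diversity property the excerpt-selection method relies on; the paper's modification additionally pushes the selection toward uniform coverage of the circumplex.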
Identifying barriers to engage with soundscape standards: Insights from national standards bodies and experts
Francesco Aletta, Jieling Xiao, Jian Kang
DOI: 10.1121/10.0025454

This study explores the engagement of national standards bodies and practitioners with the ISO 12913 series on soundscape. It reveals critical challenges in stakeholder engagement, communication, competence, and practical application. A strategic roadmap aligned with normalization process theory is proposed, comprising meaningful stakeholder engagement, building workability and integration, and community building with reflective monitoring. The results underscore the influence of national priorities, communication gaps, and limited resources, and the need for practical guidance. Future efforts should focus on promoting cross-disciplinary collaboration and on developing tools to quantify the societal and economic impact of soundscape interventions, addressing the multifaceted barriers identified.
African American English speakers' pitch variation and rate adjustments for imagined technological and human addressees
Michelle Cohn, Zion Mengesha, Michal Lahav, Courtney Heldreth
DOI: 10.1121/10.0025484

This paper examines the adaptations African American English speakers make when imagining talking to a voice assistant, compared to a close friend or family member and to a stranger. The results show that speakers slowed their speaking rate and produced less pitch variation in voice-assistant-directed speech (DS) relative to human-DS. These adjustments were not mediated by how often participants reported experiencing errors with automatic speech recognition. Overall, this paper addresses a limitation in the types of language varieties explored when examining technology-DS registers and contributes to our understanding of the dynamics of human-computer interaction.
Characteristics of wild moose (Alces alces) vocalizations
Alex Zager, Sonja Ahlberg, Olivia Boyan, Jocelyn Brierley, Valerie Eddington, Remington J Moll, Laura N Kloepper
DOI: 10.1121/10.0025465

Moose are popular with recreationists but acoustically understudied. We used publicly available videos to characterize and quantify the vocalizations of moose in New Hampshire, separated by age/sex class. We found significant differences in peak frequency, center frequency, bandwidth, and duration across the groups. Our results quantify wild moose vocalizations across age/sex classes, a key step toward passive acoustic detection of this species, and highlight public videos as a potential resource for bioacoustics research on hard-to-capture and understudied species.
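Peak frequency, one of the measures compared across age/sex classes, can be estimated from a recording's magnitude spectrum. A self-contained sketch on a synthetic tone follows; the 200 Hz value is illustrative, not a measured moose call, and a real analysis would use an FFT rather than this O(n²) DFT.

```python
import math

def peak_frequency(signal, fs):
    """Return the frequency (Hz) of the largest-magnitude bin of a naive DFT,
    scanning positive frequencies only. Quadratic cost; fine for short frames."""
    n = len(signal)
    best_k, best_mag = 0, -1.0
    for k in range(n // 2):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * fs / n

# Synthetic 200 Hz "call" sampled at 1600 Hz; 256 samples is an exact number
# of periods, so the peak lands exactly on a DFT bin.
fs = 1600
tone = [math.sin(2 * math.pi * 200 * t / fs) for t in range(256)]
print(peak_frequency(tone, fs))  # 200.0
```

Center frequency and bandwidth would be derived from the same spectrum (spectral centroid and the width around the peak above a threshold, e.g. -10 dB).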
The episodic encoding of spoken words in Hindi
William Clapp, Meghan Sumner
DOI: 10.1121/10.0025134

The discovery that listeners identify words repeated in the same voice more accurately than words repeated in a different voice has had an enormous influence on models of representation and speech perception. Although widely replicated in English, little is known about whether and how this effect generalizes across languages. In a continuous recognition memory study with Hindi speakers and listeners (N = 178), we replicated the talker-specificity effect for accuracy-based measures (hit rate and d') and found the latency advantage to be marginal (p = 0.06). These data help us better understand talker-specificity effects cross-linguistically and highlight the importance of extending this work to less studied languages.
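The accuracy measures reported (hit rate and d') are linked by the standard sensitivity index d' = z(hit rate) - z(false-alarm rate). A sketch with invented rates, assuming a common false-alarm rate for both voice conditions:

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    """Signal-detection sensitivity: d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical same-voice vs different-voice hit rates, common FA rate of 0.20.
same = d_prime(0.85, 0.20)
diff = d_prime(0.75, 0.20)
print(round(same, 2), round(diff, 2))  # 1.88 1.52
```

The higher d' for same-voice repetitions is the talker-specificity effect the study replicates; in practice, rates of exactly 0 or 1 must be corrected before taking the inverse normal CDF.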
Tonal language experience facilitates the use of spatial cues for segregating competing speech in bimodal cochlear implant listeners
Biao Chen, Xinyi Zhang, Jingyuan Chen, Ying Shi, Xinyue Zou, Ping Liu, Yongxin Li, John J Galvin, Qian-Jie Fu
DOI: 10.1121/10.0025058

English-speaking bimodal and bilateral cochlear implant (CI) users can segregate competing speech using talker sex cues but not spatial cues. While tonal language experience allows listeners with normal hearing to make greater use of talker sex cues, its benefits remain unclear for CI users. The present study assessed the ability of Mandarin-speaking bilateral and bimodal CI users to recognize target sentences amidst speech maskers that varied in spatial cues and/or talker sex cues relative to the target. Unlike English-speaking CI users, Mandarin-speaking CI users exhibited greater utilization of spatial cues, particularly in bimodal listening.