Comparing accounts of formant normalization against US English listeners' vowel perception.

Journal of the Acoustical Society of America, 157(2), pp. 1458-1482. Pub Date: 2025-02-01. DOI: 10.1121/10.0035476
Impact factor: 2.3. JCR: Q2 (Acoustics). Zone 2 (Physics and Astrophysics).
Anna Persson, Santiago Barreda, T Florian Jaeger

Abstract

Human speech recognition tends to be robust despite substantial cross-talker variability. Auditory normalization mechanisms, whereby listeners adapt to individual differences in vocal tract physiology, are believed to be critical to this ability. This study investigates the computations involved in such normalization. Two 8-way alternative forced-choice experiments assessed L1 listeners' categorizations across the entire US English vowel space, for both unaltered and synthesized stimuli. Listeners' responses in these experiments were compared against the predictions of 20 influential normalization accounts that differ starkly in the inference and memory capacities they imply for speech perception. These accounts include variants of estimation-free transformations into psycho-acoustic spaces, intrinsic normalizations relative to concurrent acoustic properties, and extrinsic normalizations relative to talker-specific statistics. Listeners' responses were best explained by extrinsic normalization, suggesting that listeners learn and store distributional properties of talkers' speech. Specifically, computationally simple (single-parameter) extrinsic normalization best fit listeners' responses. This simple extrinsic normalization also clearly outperformed Lobanov normalization, a computationally more complex account that remains popular in research on phonetics and phonology, sociolinguistics, typology, and language acquisition.
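
To make the contrast between the two extrinsic accounts concrete, the following is a minimal sketch, not the authors' implementation. Lobanov normalization is shown in its standard form (z-scoring each formant within a talker). The abstract does not name the single-parameter account, so the sketch assumes a uniform-scaling variant that subtracts one talker-level mean of log formant values. The function names and the formant values are hypothetical.

```python
import math
from statistics import mean, stdev

def lobanov(formants):
    """Z-score each formant separately within one talker.

    formants: list of (F1, F2) tuples in Hz from a single talker.
    Requires two talker-specific parameters (mean, SD) per formant.
    The sample SD (n - 1) is used here; that choice is an assumption.
    """
    columns = list(zip(*formants))
    stats = [(mean(c), stdev(c)) for c in columns]
    return [tuple((f - m) / s for f, (m, s) in zip(token, stats))
            for token in formants]

def uniform_scaling(formants):
    """Subtract one talker-level mean of log Hz from every log formant.

    Assumed stand-in for a single-parameter extrinsic account: the only
    talker-specific statistic is the grand mean of all log formant values.
    """
    logs = [[math.log(f) for f in token] for token in formants]
    grand_mean = mean(v for token in logs for v in token)
    return [tuple(v - grand_mean for v in token) for token in logs]

# Hypothetical data: talker B is talker A with every formant scaled by 1.2,
# a crude stand-in for a difference in vocal tract length.
talker_a = [(300.0, 2300.0), (700.0, 1200.0), (350.0, 900.0)]
talker_b = [(f1 * 1.2, f2 * 1.2) for f1, f2 in talker_a]

for name, normalize in [("lobanov", lobanov), ("uniform", uniform_scaling)]:
    same = all(
        math.isclose(x, y, abs_tol=1e-9)
        for ta, tb in zip(normalize(talker_a), normalize(talker_b))
        for x, y in zip(ta, tb)
    )
    print(name, "removes the talker difference:", same)
```

In this toy case both procedures map the two talkers onto the same normalized values, so the example illustrates only the difference in how many talker-specific statistics each one must estimate, not the difference in fit to listeners' responses reported in the abstract.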

Source journal: Journal of the Acoustical Society of America
CiteScore: 4.60
Self-citation rate: 16.70%
Articles published: 1433
Review time: 4.7 months
About the journal: Since 1929, The Journal of the Acoustical Society of America has been the leading source of theoretical and experimental research results in the broad interdisciplinary study of sound. Subject coverage includes: linear and nonlinear acoustics; aeroacoustics, underwater sound and acoustical oceanography; ultrasonics and quantum acoustics; architectural and structural acoustics and vibration; speech, music and noise; psychology and physiology of hearing; engineering acoustics, transduction; bioacoustics, animal bioacoustics.