Language and Speech最新文献_第2页

Word Learning Through Eye-Gaze Cues at Ages 12 and 18 Months. 12个月和18个月儿童通过眼睛注视线索学习单词。

IF 1.1 2区文学 Q3 AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Language and Speech

Pub Date : 2026-01-02 DOI: 10.1177/00238309251395278

Osnat Segal, Zipora Yegudayev

This study examined how infants exploit an interlocutor's eye gaze for word learning, using a novel eye-tracking paradigm. The final sample included 25 Hebrew-speaking infants aged 12 and 18 months. Infants completed three experimental phases: (a) a 2-part validation phase: (1) recognition of a familiar object (ball) among two items (ball, bottle) upon hearing its label (e.g., "Where is the ball?"), and (2) exposure to an interlocutor gazing at and talking to an unfamiliar object (rattle) without labeling it (e.g., "Look, it is here"); (b) a learning phase, in which two unfamiliar animal dolls of similar visual salience were presented, and the interlocutor labeled one doll (e.g., "Look, here is bícket"); and (c) a test phase, in which the four objects (ball, rattle, and the two animal dolls) were shown together, and infants were tested to see if they look at the target object upon hearing the learned label (e.g., "Where is bícket?") but not upon hearing a novel label. Eighteen-month-olds followed the interlocutor's gaze more often and attended longer to the labeled object during learning compared with 12-month-olds. In the test phase, both age groups showed word recognition, looking longer at the target object after hearing its label than at familiar or unlabeled distractors, although differences with the visually similar distractor were nonsignificant. When hearing the non-learned word, infants looked longer at the similar distractor. Infants demonstrated word-object learning based on the interlocutor's gaze, with gaze -following abilities strengthening with age.

本研究考察了婴儿如何利用对话者的目光进行单词学习，使用了一种新颖的眼球追踪范式。最后的样本包括25名年龄在12到18个月之间讲希伯来语的婴儿。婴儿完成了三个实验阶段：(a)一个由两部分组成的验证阶段：(1)听到标签（例如，“球在哪里？”）后，在两个项目（球，瓶子）中识别出熟悉的物体（球）。(2)对话者凝视着一个不熟悉的物体（响浪声）并与之交谈，而不给它贴上标签（例如，“看，它在这里”）；(b)学习阶段，展示两个视觉上相似的不熟悉的动物玩偶，对话者给其中一个娃娃贴上标签（例如，“看，这是bícket”）；(c)一个测试阶段，四个物体（球、拨浪鼓和两个动物玩偶）一起展示，测试婴儿在听到学习过的标签（例如，“bícket在哪里？”）时是否会看目标物体。但当你听到一个新颖的标签时，你就不会这么做了。与12个月大的婴儿相比，18个月大的婴儿在学习过程中更经常地跟随对话者的目光，并且更长时间地关注被标记的物体。在测试阶段，两个年龄组的人都表现出单词识别能力，在听到目标物体的标签后，他们看目标物体的时间比看熟悉的或未标记的分心物的时间要长，尽管与视觉上相似的分心物的差异并不显著。当听到非学过的单词时，婴儿看同样的干扰物的时间更长。婴儿表现出基于对话者注视的词物学习，注视跟随能力随着年龄的增长而增强。

{"title":"Word Learning Through Eye-Gaze Cues at Ages 12 and 18 Months.","authors":"Osnat Segal, Zipora Yegudayev","doi":"10.1177/00238309251395278","DOIUrl":"https://doi.org/10.1177/00238309251395278","url":null,"abstract":"This study examined how infants exploit an interlocutor's eye gaze for word learning, using a novel eye-tracking paradigm. The final sample included 25 Hebrew-speaking infants aged 12 and 18 months. Infants completed three experimental phases: (a) a 2-part validation phase: (1) recognition of a familiar object (ball) among two items (ball, bottle) upon hearing its label (e.g., \"Where is the ball?\"), and (2) exposure to an interlocutor gazing at and talking to an unfamiliar object (rattle) without labeling it (e.g., \"Look, it is here\"); (b) a learning phase, in which two unfamiliar animal dolls of similar visual salience were presented, and the interlocutor labeled one doll (e.g., \"Look, here is bícket\"); and (c) a test phase, in which the four objects (ball, rattle, and the two animal dolls) were shown together, and infants were tested to see if they look at the target object upon hearing the learned label (e.g., \"Where is bícket?\") but not upon hearing a novel label. Eighteen-month-olds followed the interlocutor's gaze more often and attended longer to the labeled object during learning compared with 12-month-olds. In the test phase, both age groups showed word recognition, looking longer at the target object after hearing its label than at familiar or unlabeled distractors, although differences with the visually similar distractor were nonsignificant. When hearing the non-learned word, infants looked longer at the similar distractor. Infants demonstrated word-object learning based on the interlocutor's gaze, with gaze -following abilities strengthening with age.","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251395278"},"PeriodicalIF":1.1,"publicationDate":"2026-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145893475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

How Perceptual Learning Extends Across Vowels. 感知学习是如何跨元音扩展的。

IF 1.1 2区文学 Q3 AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Language and Speech

Pub Date : 2026-01-02 DOI: 10.1177/00238309251395663

Chelsea Sanker

Speakers' perception of phonemes can be shifted based on hearing tokens of them with altered acoustic characteristics, and those shifts are extended to phonemes not heard during exposure. The patterns of extension from one vowel to others can help clarify the phonological representation of vowels and the processes that underlie extension of acoustic shifts. Three perceptual learning tasks tested how exposure to shifted F1 or F2 in a single vowel quality in American English influences other vowels with a range of characteristics, and how differences between dialects interact with those patterns of extension. In Experiment 1, shifted F1 in /ɪ/ exposure items produced perceptual shifts in the boundary between several high and mid vowels, as well as the /ε-æ/ boundary. In Experiment 2, shifted F2 in /u/ exposure items produced perceptual shifts in the boundary between front and back vowels. In Experiment 3, shifted F2 in /ε/ or /ei/ produced different patterns; shifted /ei/ only impacted the /ou-ei/ boundary, while shifted /ε/ impacted /ʌ-ε/ and /ʊ-ɪ/. The results can be explained by shifts in perception extending to vowels that share phonological features which are linked to the manipulated acoustic characteristic. However, the results are also largely consistent with extension based on acoustic similarity. There was little evidence for the listener's dialect affecting patterns of extension.

说话者对音素的感知可以根据他们听到的具有改变的声学特征的符号而转移，这些转移扩展到在暴露期间听不到的音素。从一个元音到另一个元音的延伸模式可以帮助阐明元音的音系表征和声学转移延伸的过程。三个感知学习任务测试了美国英语中单个元音质量中移位的F1或F2如何影响具有一系列特征的其他元音，以及方言之间的差异如何与这些扩展模式相互作用。在实验1中，在/ / /暴露项中移位的F1会在几个高元音和中元音之间的边界以及/ε-æ/边界产生感知移位。在实验2中，/u/暴露项中F2移位会产生前后元音边界的感知移位。在实验3中，/ε/或/ei/中F2移位产生不同的模式；移位的/ei/只影响了/ou-ei/的边界，而移位的/ε/影响了/ u- ε/和/ u- æ /。这一结果可以用感知的变化来解释，这些变化延伸到元音，这些元音具有与被操纵的声学特性相关的语音特征。然而，结果也与基于声学相似性的扩展基本一致。几乎没有证据表明听者的方言会影响延伸模式。

{"title":"How Perceptual Learning Extends Across Vowels.","authors":"Chelsea Sanker","doi":"10.1177/00238309251395663","DOIUrl":"https://doi.org/10.1177/00238309251395663","url":null,"abstract":"Speakers' perception of phonemes can be shifted based on hearing tokens of them with altered acoustic characteristics, and those shifts are extended to phonemes not heard during exposure. The patterns of extension from one vowel to others can help clarify the phonological representation of vowels and the processes that underlie extension of acoustic shifts. Three perceptual learning tasks tested how exposure to shifted F1 or F2 in a single vowel quality in American English influences other vowels with a range of characteristics, and how differences between dialects interact with those patterns of extension. In Experiment 1, shifted F1 in /ɪ/ exposure items produced perceptual shifts in the boundary between several high and mid vowels, as well as the /ε-æ/ boundary. In Experiment 2, shifted F2 in /u/ exposure items produced perceptual shifts in the boundary between front and back vowels. In Experiment 3, shifted F2 in /ε/ or /ei/ produced different patterns; shifted /ei/ only impacted the /ou-ei/ boundary, while shifted /ε/ impacted /ʌ-ε/ and /ʊ-ɪ/. The results can be explained by shifts in perception extending to vowels that share phonological features which are linked to the manipulated acoustic characteristic. However, the results are also largely consistent with extension based on acoustic similarity. There was little evidence for the listener's dialect affecting patterns of extension.","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251395663"},"PeriodicalIF":1.1,"publicationDate":"2026-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145890583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A Multi-CEFR-Level Learner Corpus Study to Quantify Fluency and Accuracy in Speech. 多cefr水平学习者语料库研究量化语言流畅性和准确性。

IF 1.1 2区文学 Q3 AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Language and Speech

Pub Date : 2025-12-26 DOI: 10.1177/00238309251393170

Lan-Fen Huang, Tomáš Gráf

This study strengthens the validation of learner speech assessment in the Common European Framework of Reference (CEFR) by analyzing the quantitative variables related to fluency and accuracy across four CEFR levels (A2, B1, B2, and C1). Drawing on a learner corpus approach, we examine 500,000 tokens from the Louvain International Database of Spoken English Interlanguage (LINDSEI) and its extensions, supplemented by post hoc rater evaluations. Three task types-a semi-monologic topic discussion, a dialogic interaction, and a monologic picture description-are used to elicit variation in speech production. The analysis focuses on speech rates, the frequency of filled and unfilled pauses, and error rates to unveil developmental trends in learner speech. The results reveal strong correlations between these fluency and accuracy metrics and CEFR levels, with speech rate emerging as the most reliable indicator of proficiency. The frequency of unfilled pauses decreases as proficiency increases, while filled pauses, although less critical to fluency assessment, offer insights into speech planning mechanisms. Error rates similarly decline with higher proficiency, reflecting greater accuracy in speech production. Exemplary instances for each CEFR level are presented, offering practical metrics for teaching, assessment, and rater training. While the study's limitations include an overrepresentation of Mandarin Chinese learners and the exclusion of pronunciation errors, these gaps highlight avenues for future research. This study provides empirical, task-sensitive evidence to enrich CEFR can-do descriptors, enhance rater training, and refine speaking assessments, contributing to more effective language teaching, learning, and assessment practices.

本研究通过分析欧洲共同参考框架（CEFR）四个水平（A2、B1、B2和C1）与流利性和准确性相关的定量变量，加强了学习者语言评估在CEFR中的有效性。利用学习者语料库方法，我们检查了来自鲁汶国际英语口语中介语言数据库（LINDSEI）及其扩展的500,000个令牌，并辅以事后评估。三种任务类型——半单一主题讨论、对话互动和单一图片描述——用于引发语音产生的变化。分析的重点是语速，填充和未填充停顿的频率，错误率，以揭示学习者语言的发展趋势。结果显示，这些流利度和准确性指标与CEFR水平之间存在很强的相关性，其中语速是衡量熟练程度的最可靠指标。随着熟练程度的提高，未填充停顿的频率会降低，而填充停顿虽然对流利度评估不那么重要，但却能让我们深入了解语音规划机制。错误率也随着熟练程度的提高而下降，这反映了语音生成的准确性。为每个CEFR级别提供了示例实例，为教学、评估和评分员培训提供了实用的度量。虽然这项研究的局限性包括对普通话学习者的过度代表和排除发音错误，但这些差距突出了未来研究的途径。本研究为丰富CEFR can-do描述符、加强语言培训、完善口语评估提供了实证、任务敏感的证据，有助于更有效的语言教学、学习和评估实践。

{"title":"A Multi-CEFR-Level Learner Corpus Study to Quantify Fluency and Accuracy in Speech.","authors":"Lan-Fen Huang, Tomáš Gráf","doi":"10.1177/00238309251393170","DOIUrl":"https://doi.org/10.1177/00238309251393170","url":null,"abstract":"This study strengthens the validation of learner speech assessment in the Common European Framework of Reference (CEFR) by analyzing the quantitative variables related to fluency and accuracy across four CEFR levels (A2, B1, B2, and C1). Drawing on a learner corpus approach, we examine 500,000 tokens from the Louvain International Database of Spoken English Interlanguage (LINDSEI) and its extensions, supplemented by post hoc rater evaluations. Three task types-a semi-monologic topic discussion, a dialogic interaction, and a monologic picture description-are used to elicit variation in speech production. The analysis focuses on speech rates, the frequency of filled and unfilled pauses, and error rates to unveil developmental trends in learner speech. The results reveal strong correlations between these fluency and accuracy metrics and CEFR levels, with speech rate emerging as the most reliable indicator of proficiency. The frequency of unfilled pauses decreases as proficiency increases, while filled pauses, although less critical to fluency assessment, offer insights into speech planning mechanisms. Error rates similarly decline with higher proficiency, reflecting greater accuracy in speech production. Exemplary instances for each CEFR level are presented, offering practical metrics for teaching, assessment, and rater training. While the study's limitations include an overrepresentation of Mandarin Chinese learners and the exclusion of pronunciation errors, these gaps highlight avenues for future research. This study provides empirical, task-sensitive evidence to enrich CEFR can-do descriptors, enhance rater training, and refine speaking assessments, contributing to more effective language teaching, learning, and assessment practices.","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251393170"},"PeriodicalIF":1.1,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145835380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

What Determines Personality Impressions of Synthetic and Natural Voices? The Effects of Voice Quality and Intonation. 是什么决定了合成声音和自然声音的个性印象？语音质量和语调的影响。

IF 1.1 2区文学 Q3 AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Language and Speech

Pub Date : 2025-12-26 DOI: 10.1177/00238309251389567

Minjeong Kim, Jaehan Park, Minhong Jeong, Jieun Song

The present study investigated how acoustic and phonetic characteristics of synthetic and natural voices affect personality impressions of the voices. To this end, we conducted a personality rating experiment in which 30 native Korean speakers judged the perceived personality of natural Korean utterances and their synthetic counterparts (voice clones) using the Big-Five personality model. Various acoustic parameters, including measures of voice quality, F0, and articulation rate, were then extracted from the speech, and Intonational Phrase boundary tones were annotated. The ratings of the Big-Five personality traits were reduced to two dimensions (P1: agreeableness, conscientiousness, and emotional stability; P2: extraversion and openness) using a principal component analysis. The results suggest that the acoustic differences between state-of-the-art synthetic speech and its original counterpart can produce varying effects on personality perception. For example, speech produced with a narrower F0 range received lower scores on P1 and P2, but for male speakers, this effect was only observed in synthetic voices, likely due to the less-natural intonational patterns used. The intonation analysis further demonstrates that across speech type, using context-appropriate tones or those conveying positive attitudes improves the overall impression of the voice (both P1 and P2). The results also suggest that a less-modal voice enhances the personality scores overall, but specific voice qualities (i.e., breathiness and creakiness) and voice pitch seem to affect P1 and P2 differently. The present study demonstrates a range of acoustic and phonetic characteristics that should be considered when designing personas for AI voices or developing more likable synthetic voices.

本研究探讨了合成声音和自然声音的声学和语音特征如何影响声音的个性印象。为此，我们进行了一项人格评定实验，让30名母语为韩语的人用大五人格模型来判断自然的韩语话语和合成的韩语话语（语音克隆）的感知人格。然后从语音中提取各种声学参数，包括语音质量、F0和发音率，并对语调短语边界音调进行注释。采用主成分分析法，将大五人格特征的评分降至两个维度（P1：宜人性、尽责性和情绪稳定性；P2：外向性和开放性）。结果表明，最先进的合成语音与原始语音之间的声学差异会对人格感知产生不同的影响。例如，F0范围较窄的语音在P1和P2上得分较低，但对于男性说话者来说，这种影响只在合成声音中观察到，可能是由于使用的语调模式不太自然。语调分析进一步表明，在不同的语音类型中，使用适合上下文的语调或传达积极态度的语调可以改善声音的整体印象（P1和P2）。结果还表明，少模态的声音总体上提高了个性得分，但特定的声音质量（即呼吸和嘎吱声）和音高似乎对P1和P2的影响不同。目前的研究表明，在为人工智能语音设计人物角色或开发更讨人喜欢的合成语音时，应该考虑一系列声学和语音特征。

{"title":"What Determines Personality Impressions of Synthetic and Natural Voices? The Effects of Voice Quality and Intonation.","authors":"Minjeong Kim, Jaehan Park, Minhong Jeong, Jieun Song","doi":"10.1177/00238309251389567","DOIUrl":"https://doi.org/10.1177/00238309251389567","url":null,"abstract":"The present study investigated how acoustic and phonetic characteristics of synthetic and natural voices affect personality impressions of the voices. To this end, we conducted a personality rating experiment in which 30 native Korean speakers judged the perceived personality of natural Korean utterances and their synthetic counterparts (voice clones) using the Big-Five personality model. Various acoustic parameters, including measures of voice quality, F0, and articulation rate, were then extracted from the speech, and Intonational Phrase boundary tones were annotated. The ratings of the Big-Five personality traits were reduced to two dimensions (P1: agreeableness, conscientiousness, and emotional stability; P2: extraversion and openness) using a principal component analysis. The results suggest that the acoustic differences between state-of-the-art synthetic speech and its original counterpart can produce varying effects on personality perception. For example, speech produced with a narrower F0 range received lower scores on P1 and P2, but for male speakers, this effect was only observed in synthetic voices, likely due to the less-natural intonational patterns used. The intonation analysis further demonstrates that across speech type, using context-appropriate tones or those conveying positive attitudes improves the overall impression of the voice (both P1 and P2). The results also suggest that a less-modal voice enhances the personality scores overall, but specific voice qualities (i.e., breathiness and creakiness) and voice pitch seem to affect P1 and P2 differently. The present study demonstrates a range of acoustic and phonetic characteristics that should be considered when designing personas for AI voices or developing more likable synthetic voices.","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251389567"},"PeriodicalIF":1.1,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145835357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Unveiling Denasalization as an Ongoing Sound Change: The Role of Prosody and Gender in Seoul Korean. 揭示去中性化是一种持续的声音变化：首尔韩语中韵律和性别的作用。

IF 1.1 2区文学 Q3 AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Language and Speech

Pub Date : 2025-12-17 DOI: 10.1177/00238309251390431

Jiyoung Jang, Jungah Lee, Jiyoung Lee, Sahyang Kim, Taehong Cho

This study examines variation in coarticulatory vowel nasalization in Seoul Korean as a function of prosodic boundaries and gender, exploring its role in an emerging denasalization sound change. Coarticulatory vowel nasality, measured by A1-P0, was analyzed in the word-initial vowels of /ma.mi/ across three prosodic boundary conditions (IP-initial, AP-initial, and Wd-initial) in 35 speakers in their 20s. Results show that phrase-initial vowels exhibit reduced nasality as part of domain-initial articulatory strengthening, suggesting that denasalization of word-initial nasal consonants extends to the following vowel, reducing its coarticulatory nasalization and thus signaling the progression of a position-driven sound change. Significant gender differences were found: male speakers consistently adhere to this change throughout the vowel, exhibiting greater reductions in coarticulatory vowel nasalization in phrase-initial contexts. In contrast, female speakers retain higher nasality levels in both phrase-initial and phrase-medial positions by regulating the coarticulatory process. These gender-related differences may reflect socially grounded perceptions of nasality and/or female speakers' tendency to preserve phonological features, influencing speech production choices. These findings highlight the interplay between prosodically driven phonetic variation and gender: speakers actively control the degree of vowel nasalization, and this phonetic variation, in turn, is further shaped by gender, potentially evolving into a systematic sound change.

本研究考察了首尔朝鲜语元音协同发音的鼻音化随韵律边界和性别的变化，探讨了其在新兴的去鼻音化声音变化中的作用。用A1-P0测量/ma的单词开头元音，分析辅音元音的鼻音性。35名20多岁的说话者在三个韵律边界条件（ip首、ap首和wd首）上的发音差异。结果表明，短语初始元音作为域初始发音强化的一部分表现出鼻音减弱，这表明单词初始鼻辅音的去鼻音化延伸到下一个元音，减少了它的协同发音，从而标志着位置驱动的声音变化的进展。研究发现了显著的性别差异：男性说话者始终坚持元音的这种变化，在短语开头的语境中，元音的协同发音明显减少。相比之下，女性说话者通过调节协同发音过程，在短语起始和短语中间位置都保持了较高的鼻音水平。这些与性别相关的差异可能反映了社会对鼻音的认知和/或女性说话者倾向于保留语音特征，从而影响语音产生的选择。这些发现强调了韵律驱动的语音变化与性别之间的相互作用：说话者主动控制元音鼻音化的程度，而这种语音变化反过来又受到性别的进一步影响，有可能演变成一种系统的语音变化。

{"title":"Unveiling Denasalization as an Ongoing Sound Change: The Role of Prosody and Gender in Seoul Korean.","authors":"Jiyoung Jang, Jungah Lee, Jiyoung Lee, Sahyang Kim, Taehong Cho","doi":"10.1177/00238309251390431","DOIUrl":"https://doi.org/10.1177/00238309251390431","url":null,"abstract":"This study examines variation in coarticulatory vowel nasalization in Seoul Korean as a function of prosodic boundaries and gender, exploring its role in an emerging denasalization sound change. Coarticulatory vowel nasality, measured by A1-P0, was analyzed in the word-initial vowels of /ma.mi/ across three prosodic boundary conditions (IP-initial, AP-initial, and Wd-initial) in 35 speakers in their 20s. Results show that phrase-initial vowels exhibit reduced nasality as part of domain-initial articulatory strengthening, suggesting that denasalization of word-initial nasal consonants extends to the following vowel, reducing its coarticulatory nasalization and thus signaling the progression of a position-driven sound change. Significant gender differences were found: male speakers consistently adhere to this change throughout the vowel, exhibiting greater reductions in coarticulatory vowel nasalization in phrase-initial contexts. In contrast, female speakers retain higher nasality levels in both phrase-initial and phrase-medial positions by regulating the coarticulatory process. These gender-related differences may reflect socially grounded perceptions of nasality and/or female speakers' tendency to preserve phonological features, influencing speech production choices. These findings highlight the interplay between prosodically driven phonetic variation and gender: speakers actively control the degree of vowel nasalization, and this phonetic variation, in turn, is further shaped by gender, potentially evolving into a systematic sound change.","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251390431"},"PeriodicalIF":1.1,"publicationDate":"2025-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145769981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Listening Effort Across Non-Native and Regional Accents: A Pupillometry Study. 非母语和地方口音的听力努力：一项瞳孔测量研究。

IF 1.1 2区文学 Q3 AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Language and Speech

Pub Date : 2025-12-17 DOI: 10.1177/00238309251389573

Marc Barnard, Scott Kunkel, Rémi Lamarque, Adam J Chong

Previous work has shown that L2-accented speech incurs a processing cost even when accurately understood. It remains unknown, however, whether an online processing cost is found when listeners process speech produced in L1 accents that are not their own. In this study, we examine this question by using comparative pupil dilation as a measure of cognitive load. Participants from the South of England heard sentences produced in four different accents: Southern British English (the listeners' own familiar accent), American English (a standard L1 accent widely used in media), Glaswegian English (a less-familiar regional L1 accent), and Mandarin Chinese-accented English (an L2 English accent). Results show that Chinese-accented speech elicited significantly larger pupil dilation responses compared with Southern British English. Speech from less-familiar L1 accents elicited pupil dilation responses of different shapes and trajectories, suggesting differences in processing of these accents. Furthermore, participants showed larger mean pupil dilation when they heard relatively less-familiar L1 American-accented speech than when hearing Glaswegian English. Interestingly, this effect was found despite participants self-reporting that they were less familiar with the Glaswegian accent and found it more effortful to comprehend compared with American English. These results suggest that accurately perceived and highly intelligible L1 accents such as American English also incur a cognitive cost in processing, but to a smaller extent compared with L2-accented speech. We discuss the implications of our findings for the relationship between exposure, subjective effortfulness measures, and pupil dilation responses.

先前的研究表明，l2口音的语音即使被准确理解也会产生处理成本。然而，当听者处理非母语口音的语音时，是否会产生在线处理成本，目前尚不清楚。在这项研究中，我们通过使用比较瞳孔扩张作为认知负荷的测量来检验这个问题。来自英格兰南部的参与者听到了四种不同口音的句子：英国南部英语（听众自己熟悉的口音），美国英语（媒体中广泛使用的标准L1口音），格拉斯哥英语（不太熟悉的地区L1口音）和普通话中国口音英语（第二语言英语口音）。结果表明，与英国南部英语相比，中国口音语音引起的瞳孔扩张反应明显更大。来自不太熟悉的母语口音的语音引起不同形状和轨迹的瞳孔扩张反应，表明这些口音的处理存在差异。此外，当参与者听到相对不太熟悉的L1美国口音讲话时，他们的平均瞳孔扩张幅度比听到格拉斯哥英语时要大。有趣的是，尽管参与者自我报告说他们对格拉斯哥口音不太熟悉，并且发现与美式英语相比，格拉斯哥口音更难以理解，但还是发现了这种影响。这些结果表明，准确感知和高度可理解的L1口音（如美式英语）在处理过程中也会产生认知成本，但与l2口音相比，其程度较小。我们讨论了我们的发现对暴露、主观努力测量和瞳孔扩张反应之间关系的影响。

{"title":"Listening Effort Across Non-Native and Regional Accents: A Pupillometry Study.","authors":"Marc Barnard, Scott Kunkel, Rémi Lamarque, Adam J Chong","doi":"10.1177/00238309251389573","DOIUrl":"https://doi.org/10.1177/00238309251389573","url":null,"abstract":"Previous work has shown that L2-accented speech incurs a processing cost even when accurately understood. It remains unknown, however, whether an online processing cost is found when listeners process speech produced in L1 accents that are not their own. In this study, we examine this question by using comparative pupil dilation as a measure of cognitive load. Participants from the South of England heard sentences produced in four different accents: Southern British English (the listeners' own familiar accent), American English (a standard L1 accent widely used in media), Glaswegian English (a less-familiar regional L1 accent), and Mandarin Chinese-accented English (an L2 English accent). Results show that Chinese-accented speech elicited significantly larger pupil dilation responses compared with Southern British English. Speech from less-familiar L1 accents elicited pupil dilation responses of different shapes and trajectories, suggesting differences in processing of these accents. Furthermore, participants showed larger mean pupil dilation when they heard relatively less-familiar L1 American-accented speech than when hearing Glaswegian English. Interestingly, this effect was found despite participants self-reporting that they were less familiar with the Glaswegian accent and found it more effortful to comprehend compared with American English. These results suggest that accurately perceived and highly intelligible L1 accents such as American English also incur a cognitive cost in processing, but to a smaller extent compared with L2-accented speech. We discuss the implications of our findings for the relationship between exposure, subjective effortfulness measures, and pupil dilation responses.","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251389573"},"PeriodicalIF":1.1,"publicationDate":"2025-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145769993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Corrigendum to "Sources of Intelligibility of Distant Languages: An Empirical Study". “遥远语言可理解性的来源：一项实证研究”的勘误表。

IF 1.1 2区文学 Q3 AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Language and Speech

Pub Date : 2025-12-16 DOI: 10.1177/00238309251410999

引用次数: 0

Apparent Talker Variability and Speaking Style Similarity Can Enhance Comprehension of Novel L2-Accented Talkers. 说话人的明显变异性和说话风格的相似性可以增强对l2口音新说话人的理解。

IF 1.1 2区文学 Q3 AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Language and Speech

Pub Date : 2025-12-15 DOI: 10.1177/00238309251390505

Nicholas B Aoki, Georgia Zellou

Certain studies report facilitatory effects of multiple-talker exposure on cross-talker generalization of L2-accented speech (often defined as greater comprehension of novel talkers). However, a confound exists in prior work: do multiple-talker exposure benefits stem from the greater number of talkers (numerosity) or greater phonological variability (heterogeneity)? This study examined how apparent talker variability and speaking style affect L2-accent adaptation, while keeping phonological variation as constant as possible across exposure conditions. L1-English participants transcribed sentences in noise for a single Mandarin-accented English talker in an exposure phase and a novel Mandarin-accented English speaker in a test phase (a control condition received no exposure). Although all exposure stimuli came from one speaker, half of the listeners who received exposure were led to believe that multiple talkers were present by shifting the F0 and formants of a subset of sentences. We find: (a) when the test talker produces casual speech, all critical conditions with exposure enhance generalization (i.e., greater comprehension of the test talker relative to control); (b) when the test talker produces hard-of-hearing-directed speech, there is no difference in transcription accuracy between the control and critical conditions; and (c) when the test talker produces casual speech, generalization is greatest when listeners are exposed to multiple apparent talkers, but only given speaking style similarity between exposure and test (i.e., when the exposure phase also presents casual speech). This work lends credence to numerosity accounts-given a minimal change in phonological variability, the illusion of multiple-talker exposure can facilitate cross-talker generalization of L2-accented speech.

某些研究报告了多重说话者暴露对相声者l2口音语音的泛化（通常定义为对新说话者的更好理解）的促进作用。然而，在先前的研究中存在一个困惑：多说话者暴露的好处是来自于更多的说话者（数量）还是更大的语音变异性（异质性）？本研究考察了说话者的明显变化和说话风格如何影响l2口音适应，同时在不同的暴露条件下尽可能保持语音变化不变。l1 -英语的参与者在噪音中为一个有普通话口音的英语说话者转录句子，在一个暴露阶段，为一个有普通话口音的英语说话者转录句子，在一个测试阶段（控制条件没有暴露）。尽管所有的暴露刺激都来自一个说话者，但通过改变句子子集的F0和共振峰，接受暴露的一半听众被引导相信有多个说话者在场。我们发现：(a)当测试说话者随意说话时，所有暴露的关键条件都增强了泛化（即，相对于对照组，对测试说话者有更好的理解）；(b)当测试说话者产生听力困难的定向语音时，对照和临界条件下的转录准确性没有差异；(c)当测试说话者产生随意言语时，当听者暴露于多个明显的说话者，但只有在暴露和测试之间的说话风格相似时（即，当暴露阶段也呈现随意言语时），泛化效果最大。这项工作为数字解释提供了可信度——考虑到语音可变性的最小变化，多说话者暴露的错觉可以促进对口说话者对l2口音的概括。

{"title":"Apparent Talker Variability and Speaking Style Similarity Can Enhance Comprehension of Novel L2-Accented Talkers.","authors":"Nicholas B Aoki, Georgia Zellou","doi":"10.1177/00238309251390505","DOIUrl":"https://doi.org/10.1177/00238309251390505","url":null,"abstract":"Certain studies report facilitatory effects of multiple-talker exposure on cross-talker generalization of L2-accented speech (often defined as greater comprehension of novel talkers). However, a confound exists in prior work: do multiple-talker exposure benefits stem from the greater number of talkers (numerosity) or greater phonological variability (heterogeneity)? This study examined how apparent talker variability and speaking style affect L2-accent adaptation, while keeping phonological variation as constant as possible across exposure conditions. L1-English participants transcribed sentences in noise for a single Mandarin-accented English talker in an exposure phase and a novel Mandarin-accented English speaker in a test phase (a control condition received no exposure). Although all exposure stimuli came from one speaker, half of the listeners who received exposure were led to believe that multiple talkers were present by shifting the F0 and formants of a subset of sentences. We find: (a) when the test talker produces casual speech, all critical conditions with exposure enhance generalization (i.e., greater comprehension of the test talker relative to control); (b) when the test talker produces hard-of-hearing-directed speech, there is no difference in transcription accuracy between the control and critical conditions; and (c) when the test talker produces casual speech, generalization is greatest when listeners are exposed to multiple apparent talkers, but only given speaking style similarity between exposure and test (i.e., when the exposure phase also presents casual speech). This work lends credence to numerosity accounts-given a minimal change in phonological variability, the illusion of multiple-talker exposure can facilitate cross-talker generalization of L2-accented speech.","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251390505"},"PeriodicalIF":1.1,"publicationDate":"2025-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145764569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Comparing Online and Face-to-Face Administration of the Polish Sentence Repetition Task in Monolingual and Multilingual Children: Higher Scores in Face-to-Face Testing. 单语和多语儿童波兰语句子重复任务在线管理和面对面管理的比较：面对面测试得分更高。

IF 1.1 2区文学 Q3 AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Language and Speech

Pub Date : 2025-12-14 DOI: 10.1177/00238309251394372

Natalia Banasik-Jemielniak, Magdalena Kochańska, Maria Obarska, Maria Zajączkowska, Joanna Świderska, Ewa Haman

This study compared online and face-to-face (f2f) testing using the short Polish version of the LITMUS Sentence Repetition Task (SRep) with multilingual and monolingual Polish-speaking children. The shift to remote testing during the COVID-19 pandemic prompted questions about whether online methods yield results comparable with in-person testing for assessing multilingual children's grammatical abilities. Reliable online testing could enhance access to underrepresented populations, enabling families from diverse backgrounds to participate from home. We tested 92 multilingual children (speaking Polish and English or German) and 55 monolingual Polish-speaking children aged 4;6-7;6. Each child completed the SRep task twice (online and f2f) in a counterbalanced order. Results showed better performance on f2f tasks for both groups. Multilingual children improved on their second attempt, regardless of format, while monolinguals consistently scored higher in the f2f condition. These findings indicate differences in performance across testing modalities and the need to adapt and norm the SRep task for both online and f2f administration separately.

这项研究比较了在线测试和面对面测试（f2f），使用短波兰语版本的LITMUS句子重复任务（SRep）对多语和单语波兰语儿童进行测试。在COVID-19大流行期间转向远程测试引发了人们的质疑，即在线方法是否能产生与现场测试相当的结果，以评估多语言儿童的语法能力。可靠的在线测试可以增加对代表性不足的人群的访问，使来自不同背景的家庭能够在家参与。我们测试了92名多语儿童（讲波兰语、英语或德语）和55名4岁、6-7岁和6岁的单语波兰语儿童。每个孩子以平衡的顺序完成两次SRep任务（在线和f2f）。结果显示，两组在f2f任务上的表现都更好。多语言儿童在第二次测试中有所提高，无论测试形式如何，而单语言儿童在第二次测试中得分始终较高。这些发现表明，不同测试方式的表现存在差异，需要分别适应和规范在线和在线管理的SRep任务。

{"title":"Comparing Online and Face-to-Face Administration of the Polish Sentence Repetition Task in Monolingual and Multilingual Children: Higher Scores in Face-to-Face Testing.","authors":"Natalia Banasik-Jemielniak, Magdalena Kochańska, Maria Obarska, Maria Zajączkowska, Joanna Świderska, Ewa Haman","doi":"10.1177/00238309251394372","DOIUrl":"https://doi.org/10.1177/00238309251394372","url":null,"abstract":"This study compared online and face-to-face (f2f) testing using the short Polish version of the LITMUS Sentence Repetition Task (SRep) with multilingual and monolingual Polish-speaking children. The shift to remote testing during the COVID-19 pandemic prompted questions about whether online methods yield results comparable with in-person testing for assessing multilingual children's grammatical abilities. Reliable online testing could enhance access to underrepresented populations, enabling families from diverse backgrounds to participate from home. We tested 92 multilingual children (speaking Polish and English or German) and 55 monolingual Polish-speaking children aged 4;6-7;6. Each child completed the SRep task twice (online and f2f) in a counterbalanced order. Results showed better performance on f2f tasks for both groups. Multilingual children improved on their second attempt, regardless of format, while monolinguals consistently scored higher in the f2f condition. These findings indicate differences in performance across testing modalities and the need to adapt and norm the SRep task for both online and f2f administration separately.","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251394372"},"PeriodicalIF":1.1,"publicationDate":"2025-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145758397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Acoustic and Perceptual Differences of Aegyo Speaking Style Across Gender in Seoul Korean. 首尔朝鲜语Aegyo说话风格的声学和感知差异

IF 1.1 2区文学 Q3 AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY

Language and Speech

Pub Date : 2025-12-10 DOI: 10.1177/00238309251383547

Ji-Eun Kim, Volker Dellwo

Aegyo is a culturally salient speaking style in Korea, often described as a baby talk-like register used by young adults to convey affection or cuteness. Yet, its acoustic profile and perception across genders remain understudied. This study investigates the acoustic and perceptual characteristics of aegyo through a production study and a perception study. In the production study, 12 native Seoul Korean speakers (six females, six males) produced sentences in both aegyo and non-aegyo styles. Acoustic analyses revealed that aegyo is characterized by significantly longer vowel durations, slower speech rate, and higher mean and maximum F0, along with greater variability in F0 and vowel duration: F0 range was significant only for male speakers. In addition, "hyper-score" analyses showed that male speakers exhibited more increases in mean and maximum F0 compared with female speakers. In the perception study, 49 Korean listeners (25 females, 24 males) judged whether the stimuli were produced in aegyo. Results showed a significant interaction between sensitivity and bias: listeners were less accurate but more prone to label the stimuli as aegyo when the speaker was male, whereas they were more accurate and more conservative when the speaker was female. These findings suggest that listeners interpret the same speaking style differently depending on speaker gender. Overall, our results support the Speaker Design model by evidencing that speakers systematically shift their vocal behavior to construct social identity, while also showing that listener interpretation of such shifts may vary by speaker gender.

Aegyo在韩国是一种文化上突出的说话方式，通常被描述为年轻人用来表达爱意或可爱的婴儿语。然而，它的声学特征和性别感知仍未得到充分研究。本研究通过生产研究和知觉研究来探讨海yo的声学和知觉特征。在制作研究中，12名母语为首尔语的人（6名女性，6名男性）分别用aegyo和non-aegyo两种风格制作了句子。声学分析表明，aegyo具有明显较长的元音持续时间、较慢的语速、较高的平均和最大F0，以及F0和元音持续时间的较大差异。F0范围仅在男性说话者中显着。此外，“超得分”分析表明，男性说话者在平均和最大F0上比女性说话者表现出更多的增长。在感知研究中，49名韩国听众（25名女性，24名男性）判断刺激是否在aegyo产生。结果显示敏感度和偏见之间存在显著的相互作用：当说话者是男性时，听众不太准确，但更倾向于将刺激标记为aegyo，而当说话者是女性时，他们更准确，更保守。这些发现表明，听者对同一种说话方式的理解会因说话者的性别而异。总体而言，我们的研究结果支持说话人设计模型，证明说话人系统地改变其发声行为以构建社会身份，同时也表明听者对这种转变的解释可能因说话人的性别而异。

{"title":"Acoustic and Perceptual Differences of Aegyo Speaking Style Across Gender in Seoul Korean.","authors":"Ji-Eun Kim, Volker Dellwo","doi":"10.1177/00238309251383547","DOIUrl":"https://doi.org/10.1177/00238309251383547","url":null,"abstract":"Aegyo is a culturally salient speaking style in Korea, often described as a baby talk-like register used by young adults to convey affection or cuteness. Yet, its acoustic profile and perception across genders remain understudied. This study investigates the acoustic and perceptual characteristics of aegyo through a production study and a perception study. In the production study, 12 native Seoul Korean speakers (six females, six males) produced sentences in both aegyo and non-aegyo styles. Acoustic analyses revealed that aegyo is characterized by significantly longer vowel durations, slower speech rate, and higher mean and maximum F0, along with greater variability in F0 and vowel duration: F0 range was significant only for male speakers. In addition, \"hyper-score\" analyses showed that male speakers exhibited more increases in mean and maximum F0 compared with female speakers. In the perception study, 49 Korean listeners (25 females, 24 males) judged whether the stimuli were produced in aegyo. Results showed a significant interaction between sensitivity and bias: listeners were less accurate but more prone to label the stimuli as aegyo when the speaker was male, whereas they were more accurate and more conservative when the speaker was female. These findings suggest that listeners interpret the same speaking style differently depending on speaker gender. Overall, our results support the Speaker Design model by evidencing that speakers systematically shift their vocal behavior to construct social identity, while also showing that listener interpretation of such shifts may vary by speaker gender.","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251383547"},"PeriodicalIF":1.1,"publicationDate":"2025-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145716542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0