Pub Date: 2026-01-02 | DOI: 10.1177/00238309251395663
Chelsea Sanker
Speakers' perception of phonemes can be shifted based on hearing tokens of them with altered acoustic characteristics, and those shifts are extended to phonemes not heard during exposure. The patterns of extension from one vowel to others can help clarify the phonological representation of vowels and the processes that underlie extension of acoustic shifts. Three perceptual learning tasks tested how exposure to shifted F1 or F2 in a single vowel quality in American English influences other vowels with a range of characteristics, and how differences between dialects interact with those patterns of extension. In Experiment 1, shifted F1 in /ɪ/ exposure items produced perceptual shifts in the boundary between several high and mid vowels, as well as the /ε-æ/ boundary. In Experiment 2, shifted F2 in /u/ exposure items produced perceptual shifts in the boundary between front and back vowels. In Experiment 3, shifted F2 in /ε/ or /ei/ produced different patterns; shifted /ei/ only impacted the /ou-ei/ boundary, while shifted /ε/ impacted /ʌ-ε/ and /ʊ-ɪ/. The results can be explained by shifts in perception extending to vowels that share phonological features which are linked to the manipulated acoustic characteristic. However, the results are also largely consistent with extension based on acoustic similarity. There was little evidence for the listener's dialect affecting patterns of extension.
{"title":"How Perceptual Learning Extends Across Vowels.","authors":"Chelsea Sanker","doi":"10.1177/00238309251395663","DOIUrl":"https://doi.org/10.1177/00238309251395663","url":null,"abstract":"<p><p>Speakers' perception of phonemes can be shifted based on hearing tokens of them with altered acoustic characteristics, and those shifts are extended to phonemes not heard during exposure. The patterns of extension from one vowel to others can help clarify the phonological representation of vowels and the processes that underlie extension of acoustic shifts. Three perceptual learning tasks tested how exposure to shifted F1 or F2 in a single vowel quality in American English influences other vowels with a range of characteristics, and how differences between dialects interact with those patterns of extension. In Experiment 1, shifted F1 in /ɪ/ exposure items produced perceptual shifts in the boundary between several high and mid vowels, as well as the /ε-æ/ boundary. In Experiment 2, shifted F2 in /u/ exposure items produced perceptual shifts in the boundary between front and back vowels. In Experiment 3, shifted F2 in /ε/ or /ei/ produced different patterns; shifted /ei/ only impacted the /ou-ei/ boundary, while shifted /ε/ impacted /ʌ-ε/ and /ʊ-ɪ/. The results can be explained by shifts in perception extending to vowels that share phonological features which are linked to the manipulated acoustic characteristic. However, the results are also largely consistent with extension based on acoustic similarity. 
There was little evidence for the listener's dialect affecting patterns of extension.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251395663"},"PeriodicalIF":1.1,"publicationDate":"2026-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145890583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-26 | DOI: 10.1177/00238309251393170
Lan-Fen Huang, Tomáš Gráf
This study strengthens the validation of learner speech assessment in the Common European Framework of Reference (CEFR) by analyzing quantitative variables related to fluency and accuracy across four CEFR levels (A2, B1, B2, and C1). Drawing on a learner corpus approach, we examine 500,000 tokens from the Louvain International Database of Spoken English Interlanguage (LINDSEI) and its extensions, supplemented by post hoc rater evaluations. Three task types (a semi-monologic topic discussion, a dialogic interaction, and a monologic picture description) are used to elicit variation in speech production. The analysis focuses on speech rates, the frequency of filled and unfilled pauses, and error rates to unveil developmental trends in learner speech. The results reveal strong correlations between these fluency and accuracy metrics and CEFR levels, with speech rate emerging as the most reliable indicator of proficiency. The frequency of unfilled pauses decreases as proficiency increases, while filled pauses, although less critical to fluency assessment, offer insights into speech planning mechanisms. Error rates similarly decline with higher proficiency, reflecting greater accuracy in speech production. Exemplary instances for each CEFR level are presented, offering practical metrics for teaching, assessment, and rater training. While the study's limitations include an overrepresentation of Mandarin Chinese learners and the exclusion of pronunciation errors, these gaps highlight avenues for future research. This study provides empirical, task-sensitive evidence to enrich CEFR can-do descriptors, enhance rater training, and refine speaking assessments, contributing to more effective language teaching, learning, and assessment practices.
{"title":"A Multi-CEFR-Level Learner Corpus Study to Quantify Fluency and Accuracy in Speech.","authors":"Lan-Fen Huang, Tomáš Gráf","doi":"10.1177/00238309251393170","DOIUrl":"https://doi.org/10.1177/00238309251393170","url":null,"abstract":"<p><p>This study strengthens the validation of learner speech assessment in the Common European Framework of Reference (CEFR) by analyzing the quantitative variables related to fluency and accuracy across four CEFR levels (A2, B1, B2, and C1). Drawing on a learner corpus approach, we examine 500,000 tokens from the Louvain International Database of Spoken English Interlanguage (LINDSEI) and its extensions, supplemented by post hoc rater evaluations. Three task types-a semi-monologic topic discussion, a dialogic interaction, and a monologic picture description-are used to elicit variation in speech production. The analysis focuses on speech rates, the frequency of filled and unfilled pauses, and error rates to unveil developmental trends in learner speech. The results reveal strong correlations between these fluency and accuracy metrics and CEFR levels, with speech rate emerging as the most reliable indicator of proficiency. The frequency of unfilled pauses decreases as proficiency increases, while filled pauses, although less critical to fluency assessment, offer insights into speech planning mechanisms. Error rates similarly decline with higher proficiency, reflecting greater accuracy in speech production. Exemplary instances for each CEFR level are presented, offering practical metrics for teaching, assessment, and rater training. While the study's limitations include an overrepresentation of Mandarin Chinese learners and the exclusion of pronunciation errors, these gaps highlight avenues for future research. 
This study provides empirical, task-sensitive evidence to enrich CEFR can-do descriptors, enhance rater training, and refine speaking assessments, contributing to more effective language teaching, learning, and assessment practices.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251393170"},"PeriodicalIF":1.1,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145835380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
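The fluency metrics discussed above (speech rate and the frequency of filled pauses per unit time) reduce to simple ratios once a time-aligned transcript is available. A minimal sketch, assuming a hypothetical token format with syllable counts and filled-pause flags; this is an illustration, not the authors' actual LINDSEI processing pipeline:

```python
def fluency_metrics(tokens, total_time_s):
    """Compute simple fluency metrics from time-aligned tokens.

    tokens: list of dicts with keys 'word', 'syllables', 'is_filled_pause'
            (hypothetical format for illustration)
    total_time_s: total speaking time including pauses, in seconds
    """
    # Speech rate counts only lexical syllables, excluding fillers like "uh"
    syllables = sum(t['syllables'] for t in tokens if not t['is_filled_pause'])
    filled = sum(1 for t in tokens if t['is_filled_pause'])
    minutes = total_time_s / 60.0
    return {
        'speech_rate_spm': syllables / minutes,       # syllables per minute
        'filled_pauses_per_min': filled / minutes,
    }
```

For example, three lexical syllables and one filled pause in 30 seconds yield a speech rate of 6 syllables per minute and 2 filled pauses per minute.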
Pub Date: 2025-12-26 | DOI: 10.1177/00238309251389567
Minjeong Kim, Jaehan Park, Minhong Jeong, Jieun Song
The present study investigated how acoustic and phonetic characteristics of synthetic and natural voices affect personality impressions of the voices. To this end, we conducted a personality rating experiment in which 30 native Korean speakers judged the perceived personality of natural Korean utterances and their synthetic counterparts (voice clones) using the Big-Five personality model. Various acoustic parameters, including measures of voice quality, F0, and articulation rate, were then extracted from the speech, and Intonational Phrase boundary tones were annotated. The ratings of the Big-Five personality traits were reduced to two dimensions (P1: agreeableness, conscientiousness, and emotional stability; P2: extraversion and openness) using a principal component analysis. The results suggest that the acoustic differences between state-of-the-art synthetic speech and its original counterpart can produce varying effects on personality perception. For example, speech produced with a narrower F0 range received lower scores on P1 and P2, but for male speakers, this effect was only observed in synthetic voices, likely due to the less-natural intonational patterns used. The intonation analysis further demonstrates that across speech type, using context-appropriate tones or those conveying positive attitudes improves the overall impression of the voice (both P1 and P2). The results also suggest that a less-modal voice enhances the personality scores overall, but specific voice qualities (i.e., breathiness and creakiness) and voice pitch seem to affect P1 and P2 differently. The present study demonstrates a range of acoustic and phonetic characteristics that should be considered when designing personas for AI voices or developing more likable synthetic voices.
{"title":"What Determines Personality Impressions of Synthetic and Natural Voices? The Effects of Voice Quality and Intonation.","authors":"Minjeong Kim, Jaehan Park, Minhong Jeong, Jieun Song","doi":"10.1177/00238309251389567","DOIUrl":"https://doi.org/10.1177/00238309251389567","url":null,"abstract":"<p><p>The present study investigated how acoustic and phonetic characteristics of synthetic and natural voices affect personality impressions of the voices. To this end, we conducted a personality rating experiment in which 30 native Korean speakers judged the perceived personality of natural Korean utterances and their synthetic counterparts (voice clones) using the Big-Five personality model. Various acoustic parameters, including measures of voice quality, F0, and articulation rate, were then extracted from the speech, and Intonational Phrase boundary tones were annotated. The ratings of the Big-Five personality traits were reduced to two dimensions (P1: agreeableness, conscientiousness, and emotional stability; P2: extraversion and openness) using a principal component analysis. The results suggest that the acoustic differences between state-of-the-art synthetic speech and its original counterpart can produce varying effects on personality perception. For example, speech produced with a narrower F0 range received lower scores on P1 and P2, but for male speakers, this effect was only observed in synthetic voices, likely due to the less-natural intonational patterns used. The intonation analysis further demonstrates that across speech type, using context-appropriate tones or those conveying positive attitudes improves the overall impression of the voice (both P1 and P2). The results also suggest that a less-modal voice enhances the personality scores overall, but specific voice qualities (i.e., breathiness and creakiness) and voice pitch seem to affect P1 and P2 differently. 
The present study demonstrates a range of acoustic and phonetic characteristics that should be considered when designing personas for AI voices or developing more likable synthetic voices.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251389567"},"PeriodicalIF":1.1,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145835357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-17 | DOI: 10.1177/00238309251390431
Jiyoung Jang, Jungah Lee, Jiyoung Lee, Sahyang Kim, Taehong Cho
This study examines variation in coarticulatory vowel nasalization in Seoul Korean as a function of prosodic boundaries and gender, exploring its role in an emerging denasalization sound change. Coarticulatory vowel nasality, measured by A1-P0, was analyzed in the word-initial vowels of /ma.mi/ across three prosodic boundary conditions (IP-initial, AP-initial, and Wd-initial) in 35 speakers in their 20s. Results show that phrase-initial vowels exhibit reduced nasality as part of domain-initial articulatory strengthening, suggesting that denasalization of word-initial nasal consonants extends to the following vowel, reducing its coarticulatory nasalization and thus signaling the progression of a position-driven sound change. Significant gender differences were found: male speakers consistently adhere to this change throughout the vowel, exhibiting greater reductions in coarticulatory vowel nasalization in phrase-initial contexts. In contrast, female speakers retain higher nasality levels in both phrase-initial and phrase-medial positions by regulating the coarticulatory process. These gender-related differences may reflect socially grounded perceptions of nasality and/or female speakers' tendency to preserve phonological features, influencing speech production choices. These findings highlight the interplay between prosodically driven phonetic variation and gender: speakers actively control the degree of vowel nasalization, and this phonetic variation, in turn, is further shaped by gender, potentially evolving into a systematic sound change.
{"title":"Unveiling Denasalization as an Ongoing Sound Change: The Role of Prosody and Gender in Seoul Korean.","authors":"Jiyoung Jang, Jungah Lee, Jiyoung Lee, Sahyang Kim, Taehong Cho","doi":"10.1177/00238309251390431","DOIUrl":"https://doi.org/10.1177/00238309251390431","url":null,"abstract":"<p><p>This study examines variation in coarticulatory vowel nasalization in Seoul Korean as a function of prosodic boundaries and gender, exploring its role in an emerging denasalization sound change. Coarticulatory vowel nasality, measured by A1-P0, was analyzed in the word-initial vowels of /ma.mi/ across three prosodic boundary conditions (IP-initial, AP-initial, and Wd-initial) in 35 speakers in their 20s. Results show that phrase-initial vowels exhibit reduced nasality as part of domain-initial articulatory strengthening, suggesting that denasalization of word-initial nasal consonants extends to the following vowel, reducing its coarticulatory nasalization and thus signaling the progression of a position-driven sound change. Significant gender differences were found: male speakers consistently adhere to this change throughout the vowel, exhibiting greater reductions in coarticulatory vowel nasalization in phrase-initial contexts. In contrast, female speakers retain higher nasality levels in both phrase-initial and phrase-medial positions by regulating the coarticulatory process. These gender-related differences may reflect socially grounded perceptions of nasality and/or female speakers' tendency to preserve phonological features, influencing speech production choices. 
These findings highlight the interplay between prosodically driven phonetic variation and gender: speakers actively control the degree of vowel nasalization, and this phonetic variation, in turn, is further shaped by gender, potentially evolving into a systematic sound change.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251390431"},"PeriodicalIF":1.1,"publicationDate":"2025-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145769981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
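The A1-P0 measure used in the study above quantifies vowel nasality as the difference (in dB) between the amplitude of the harmonic nearest the first formant (A1) and the amplitude of the low-frequency nasal peak (P0); lower A1-P0 indicates greater nasality. A minimal sketch of the final subtraction step, assuming harmonic peaks have already been extracted from a spectrum; the function name and frequency tolerance are illustrative, not the authors' pipeline:

```python
def a1_minus_p0(harmonics, f1_hz, p0_hz=250.0, tol_hz=60.0):
    """Compute A1-P0 in dB from (frequency_hz, amplitude_db) harmonic peaks.

    A1: strongest harmonic within tol_hz of the first formant frequency.
    P0: strongest harmonic within tol_hz of the nominal nasal peak (~250 Hz).
    """
    def peak_db(target_hz):
        candidates = [db for hz, db in harmonics if abs(hz - target_hz) <= tol_hz]
        if not candidates:
            raise ValueError("no harmonic near %.0f Hz" % target_hz)
        return max(candidates)

    return peak_db(f1_hz) - peak_db(p0_hz)
```

With a harmonic at 500 Hz (near F1) at 57 dB and one at 250 Hz at 42 dB, A1-P0 is 15 dB; a more nasalized token would show a boosted low-frequency peak and hence a smaller difference.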
Pub Date: 2025-12-17 | DOI: 10.1177/00238309251389573
Marc Barnard, Scott Kunkel, Rémi Lamarque, Adam J Chong
Previous work has shown that L2-accented speech incurs a processing cost even when accurately understood. It remains unknown, however, whether an online processing cost is found when listeners process speech produced in L1 accents that are not their own. In this study, we examine this question by using comparative pupil dilation as a measure of cognitive load. Participants from the South of England heard sentences produced in four different accents: Southern British English (the listeners' own familiar accent), American English (a standard L1 accent widely used in media), Glaswegian English (a less-familiar regional L1 accent), and Mandarin Chinese-accented English (an L2 English accent). Results show that Chinese-accented speech elicited significantly larger pupil dilation responses compared with Southern British English. Speech from less-familiar L1 accents elicited pupil dilation responses of different shapes and trajectories, suggesting differences in processing of these accents. Furthermore, participants showed larger mean pupil dilation when they heard relatively less-familiar L1 American-accented speech than when hearing Glaswegian English. Interestingly, this effect was found despite participants self-reporting that they were less familiar with the Glaswegian accent and found it more effortful to comprehend compared with American English. These results suggest that accurately perceived and highly intelligible L1 accents such as American English also incur a cognitive cost in processing, but to a smaller extent compared with L2-accented speech. We discuss the implications of our findings for the relationship between exposure, subjective effortfulness measures, and pupil dilation responses.
{"title":"Listening Effort Across Non-Native and Regional Accents: A Pupillometry Study.","authors":"Marc Barnard, Scott Kunkel, Rémi Lamarque, Adam J Chong","doi":"10.1177/00238309251389573","DOIUrl":"https://doi.org/10.1177/00238309251389573","url":null,"abstract":"<p><p>Previous work has shown that L2-accented speech incurs a processing cost even when accurately understood. It remains unknown, however, whether an online processing cost is found when listeners process speech produced in L1 accents that are not their own. In this study, we examine this question by using comparative pupil dilation as a measure of cognitive load. Participants from the South of England heard sentences produced in four different accents: Southern British English (the listeners' own familiar accent), American English (a standard L1 accent widely used in media), Glaswegian English (a less-familiar regional L1 accent), and Mandarin Chinese-accented English (an L2 English accent). Results show that Chinese-accented speech elicited significantly larger pupil dilation responses compared with Southern British English. Speech from less-familiar L1 accents elicited pupil dilation responses of different shapes and trajectories, suggesting differences in processing of these accents. Furthermore, participants showed larger mean pupil dilation when they heard relatively less-familiar L1 American-accented speech than when hearing Glaswegian English. Interestingly, this effect was found despite participants self-reporting that they were less familiar with the Glaswegian accent and found it more effortful to comprehend compared with American English. These results suggest that accurately perceived and highly intelligible L1 accents such as American English also incur a cognitive cost in processing, but to a smaller extent compared with L2-accented speech. 
We discuss the implications of our findings for the relationship between exposure, subjective effortfulness measures, and pupil dilation responses.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251389573"},"PeriodicalIF":1.1,"publicationDate":"2025-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145769993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-16 | DOI: 10.1177/00238309251410999
{"title":"Corrigendum to \"Sources of Intelligibility of Distant Languages: An Empirical Study\".","authors":"","doi":"10.1177/00238309251410999","DOIUrl":"https://doi.org/10.1177/00238309251410999","url":null,"abstract":"","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251410999"},"PeriodicalIF":1.1,"publicationDate":"2025-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145764562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-15 | DOI: 10.1177/00238309251390505
Nicholas B Aoki, Georgia Zellou
Certain studies report facilitatory effects of multiple-talker exposure on cross-talker generalization of L2-accented speech (often defined as greater comprehension of novel talkers). However, a confound exists in prior work: do multiple-talker exposure benefits stem from the greater number of talkers (numerosity) or greater phonological variability (heterogeneity)? This study examined how apparent talker variability and speaking style affect L2-accent adaptation, while keeping phonological variation as constant as possible across exposure conditions. L1-English participants transcribed sentences in noise for a single Mandarin-accented English talker in an exposure phase and a novel Mandarin-accented English speaker in a test phase (a control condition received no exposure). Although all exposure stimuli came from one speaker, half of the listeners who received exposure were led to believe that multiple talkers were present by shifting the F0 and formants of a subset of sentences. We find: (a) when the test talker produces casual speech, all critical conditions with exposure enhance generalization (i.e., greater comprehension of the test talker relative to control); (b) when the test talker produces hard-of-hearing-directed speech, there is no difference in transcription accuracy between the control and critical conditions; and (c) when the test talker produces casual speech, generalization is greatest when listeners are exposed to multiple apparent talkers, but only given speaking style similarity between exposure and test (i.e., when the exposure phase also presents casual speech). This work lends credence to numerosity accounts: given a minimal change in phonological variability, the illusion of multiple-talker exposure can facilitate cross-talker generalization of L2-accented speech.
{"title":"Apparent Talker Variability and Speaking Style Similarity Can Enhance Comprehension of Novel L2-Accented Talkers.","authors":"Nicholas B Aoki, Georgia Zellou","doi":"10.1177/00238309251390505","DOIUrl":"https://doi.org/10.1177/00238309251390505","url":null,"abstract":"<p><p>Certain studies report facilitatory effects of multiple-talker exposure on cross-talker generalization of L2-accented speech (often defined as greater comprehension of novel talkers). However, a confound exists in prior work: do multiple-talker exposure benefits stem from the greater number of talkers (numerosity) or greater phonological variability (heterogeneity)? This study examined how apparent talker variability and speaking style affect L2-accent adaptation, while keeping phonological variation as constant as possible across exposure conditions. L1-English participants transcribed sentences in noise for a single Mandarin-accented English talker in an exposure phase and a novel Mandarin-accented English speaker in a test phase (a control condition received no exposure). Although all exposure stimuli came from one speaker, half of the listeners who received exposure were led to believe that multiple talkers were present by shifting the F0 and formants of a subset of sentences. We find: (a) when the test talker produces casual speech, all critical conditions with exposure enhance generalization (i.e., greater comprehension of the test talker relative to control); (b) when the test talker produces hard-of-hearing-directed speech, there is no difference in transcription accuracy between the control and critical conditions; and (c) when the test talker produces casual speech, generalization is greatest when listeners are exposed to multiple apparent talkers, but only given speaking style similarity between exposure and test (i.e., when the exposure phase also presents casual speech). 
This work lends credence to numerosity accounts-given a minimal change in phonological variability, the illusion of multiple-talker exposure can facilitate cross-talker generalization of L2-accented speech.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251390505"},"PeriodicalIF":1.1,"publicationDate":"2025-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145764569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
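The apparent-talker manipulation in the study above shifts F0 and formants of a subset of sentences. The crudest way to shift both jointly is uniform resampling (a playback-rate change); the study presumably used a vocoder-style manipulation (e.g., PSOLA) that preserves duration, which this pure-Python sketch deliberately does not:

```python
def resample_shift(samples, ratio):
    """Naive linear-interpolation resampling of a mono waveform.

    Playing the result back at the original sampling rate scales F0 and all
    formants upward by `ratio`, but also shortens duration by the same
    factor, so it only crudely approximates an apparent-talker change.
    """
    n = int(len(samples) / ratio)
    out = []
    for i in range(n):
        pos = i * ratio            # fractional read position in the input
        j = int(pos)
        frac = pos - j
        nxt = samples[j + 1] if j + 1 < len(samples) else samples[j]
        out.append(samples[j] * (1.0 - frac) + nxt * frac)
    return out
```

A ratio of 1.1, for instance, raises all spectral landmarks by about 10%, roughly mimicking a smaller vocal tract and higher-pitched voice; independent control of F0 and formants requires source-filter decomposition, which is why resampling alone cannot reproduce the stimuli described.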
Pub Date: 2025-12-14 | DOI: 10.1177/00238309251394372
Natalia Banasik-Jemielniak, Magdalena Kochańska, Maria Obarska, Maria Zajączkowska, Joanna Świderska, Ewa Haman
This study compared online and face-to-face (f2f) testing using the short Polish version of the LITMUS Sentence Repetition Task (SRep) with multilingual and monolingual Polish-speaking children. The shift to remote testing during the COVID-19 pandemic prompted questions about whether online methods yield results comparable with in-person testing for assessing multilingual children's grammatical abilities. Reliable online testing could enhance access to underrepresented populations, enabling families from diverse backgrounds to participate from home. We tested 92 multilingual children (speaking Polish and English or German) and 55 monolingual Polish-speaking children aged 4;6-7;6. Each child completed the SRep task twice (online and f2f) in a counterbalanced order. Results showed better performance on f2f tasks for both groups. Multilingual children improved on their second attempt, regardless of format, while monolinguals consistently scored higher in the f2f condition. These findings indicate differences in performance across testing modalities and the need to adapt and norm the SRep task for both online and f2f administration separately.
{"title":"Comparing Online and Face-to-Face Administration of the Polish Sentence Repetition Task in Monolingual and Multilingual Children: Higher Scores in Face-to-Face Testing.","authors":"Natalia Banasik-Jemielniak, Magdalena Kochańska, Maria Obarska, Maria Zajączkowska, Joanna Świderska, Ewa Haman","doi":"10.1177/00238309251394372","DOIUrl":"https://doi.org/10.1177/00238309251394372","url":null,"abstract":"<p><p>This study compared online and face-to-face (f2f) testing using the short Polish version of the LITMUS Sentence Repetition Task (SRep) with multilingual and monolingual Polish-speaking children. The shift to remote testing during the COVID-19 pandemic prompted questions about whether online methods yield results comparable with in-person testing for assessing multilingual children's grammatical abilities. Reliable online testing could enhance access to underrepresented populations, enabling families from diverse backgrounds to participate from home. We tested 92 multilingual children (speaking Polish and English or German) and 55 monolingual Polish-speaking children aged 4;6-7;6. Each child completed the SRep task twice (online and f2f) in a counterbalanced order. Results showed better performance on f2f tasks for both groups. Multilingual children improved on their second attempt, regardless of format, while monolinguals consistently scored higher in the f2f condition. 
These findings indicate differences in performance across testing modalities and the need to adapt and norm the SRep task for both online and f2f administration separately.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251394372"},"PeriodicalIF":1.1,"publicationDate":"2025-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145758397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-10 | DOI: 10.1177/00238309251383547
Ji-Eun Kim, Volker Dellwo
Aegyo is a culturally salient speaking style in Korea, often described as a baby talk-like register used by young adults to convey affection or cuteness. Yet, its acoustic profile and perception across genders remain understudied. This study investigates the acoustic and perceptual characteristics of aegyo through a production study and a perception study. In the production study, 12 native Seoul Korean speakers (six females, six males) produced sentences in both aegyo and non-aegyo styles. Acoustic analyses revealed that aegyo is characterized by significantly longer vowel durations, slower speech rate, and higher mean and maximum F0, along with greater variability in F0 and vowel duration; the F0 range effect was significant only for male speakers. In addition, "hyper-score" analyses showed that male speakers exhibited greater increases in mean and maximum F0 compared with female speakers. In the perception study, 49 Korean listeners (25 females, 24 males) judged whether the stimuli were produced in aegyo. Results showed a significant interaction between sensitivity and bias: listeners were less accurate but more prone to label the stimuli as aegyo when the speaker was male, whereas they were more accurate and more conservative when the speaker was female. These findings suggest that listeners interpret the same speaking style differently depending on speaker gender. Overall, our results support the Speaker Design model by evidencing that speakers systematically shift their vocal behavior to construct social identity, while also showing that listener interpretation of such shifts may vary by speaker gender.
{"title":"Acoustic and Perceptual Differences of <i>Aegyo</i> Speaking Style Across Gender in Seoul Korean.","authors":"Ji-Eun Kim, Volker Dellwo","doi":"10.1177/00238309251383547","DOIUrl":"https://doi.org/10.1177/00238309251383547","url":null,"abstract":"<p><p><i>Aegyo</i> is a culturally salient speaking style in Korea, often described as a baby talk-like register used by young adults to convey affection or cuteness. Yet, its acoustic profile and perception across genders remain understudied. This study investigates the acoustic and perceptual characteristics of <i>aegyo</i> through a production study and a perception study. In the production study, 12 native Seoul Korean speakers (six females, six males) produced sentences in both <i>aegyo</i> and non-<i>aegyo</i> styles. Acoustic analyses revealed that <i>aegyo</i> is characterized by significantly longer vowel durations, slower speech rate, and higher mean and maximum F0, along with greater variability in F0 and vowel duration: F0 range was significant only for male speakers. In addition, \"hyper-score\" analyses showed that male speakers exhibited more increases in mean and maximum F0 compared with female speakers. In the perception study, 49 Korean listeners (25 females, 24 males) judged whether the stimuli were produced in <i>aegyo</i>. Results showed a significant interaction between sensitivity and bias: listeners were less accurate but more prone to label the stimuli as <i>aegyo</i> when the speaker was male, whereas they were more accurate and more conservative when the speaker was female. These findings suggest that listeners interpret the same speaking style differently depending on speaker gender. 
Overall, our results support the Speaker Design model by demonstrating that speakers systematically shift their vocal behavior to construct social identity, while also showing that listener interpretation of such shifts may vary by speaker gender.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251383547"},"PeriodicalIF":1.1,"publicationDate":"2025-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145716542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
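The "sensitivity and bias" measures in the perception study are standard signal-detection quantities: d′ (how well listeners discriminate aegyo from non-aegyo stimuli) and criterion c (how liberally they respond "aegyo"). The abstract does not give the computation, so the following is only a minimal sketch of how these measures are conventionally derived from yes/no response counts; the function name and all counts are hypothetical illustrations, not the study's data.

```python
from statistics import NormalDist

def dprime_criterion(hits, misses, false_alarms, correct_rejections):
    """Compute signal-detection sensitivity (d') and response bias
    (criterion c) from raw yes/no response counts.

    A log-linear correction (add 0.5 to every cell) keeps hit and
    false-alarm rates strictly between 0 and 1, so the inverse-normal
    transform never returns infinity.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    d_prime = z(hit_rate) - z(fa_rate)          # higher = more sensitive
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # < 0 = liberal, > 0 = conservative
    return d_prime, criterion

# Hypothetical counts mimicking the reported pattern: lower accuracy and a
# liberal "aegyo" bias for male speakers, higher accuracy and a conservative
# bias for female speakers.
d_male, c_male = dprime_criterion(70, 30, 40, 60)
d_female, c_female = dprime_criterion(80, 20, 10, 90)
```

With these illustrative counts, `d_male < d_female` (poorer discrimination of male aegyo) and `c_male < 0 < c_female` (a liberal bias toward labeling male speech as aegyo, a conservative one for female speech), matching the interaction the abstract describes.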
Pub Date : 2025-12-08DOI: 10.1177/00238309251388973
Jeffrey J Holliday, Eun Jong Kong
Short tongue pronunciation (STP) refers to an enregistered set of pronunciation variants in Korean that is popularly thought to result from underdeveloped speech articulation, a physically shorter tongue, or intentional imitation of such a pronunciation. Because these explanations can be tied to children's speech, gender may play a role in mediating the social evaluation of STP. To investigate this, we carried out a between-subjects (n = 474) survey of beliefs and attitudes toward STP produced by adult men and women. The results confirmed that STP is a familiar concept to Korean speakers, referring primarily to the stopping, affrication, fronting, and tensification of obstruents, which resemble certain child-like speech patterns. While STP was generally perceived negatively, the gender of both the listener and the imagined talker influenced its evaluation. Stopping and affrication were more frequently associated with female talkers, and male listeners perceived female STP, but not male STP, as cute in certain sociopragmatic contexts. In contrast, fronting, which was more frequently associated with male talkers, was regarded as an innate speech deficit and consistently evaluated negatively, regardless of talker gender. These findings highlight the complex interplay of sociophonetic perception and gendered expectations in Korean.
{"title":"The Role of Gender in the Social Evaluation of Korean Short Tongue Pronunciation.","authors":"Jeffrey J Holliday, Eun Jong Kong","doi":"10.1177/00238309251388973","DOIUrl":"https://doi.org/10.1177/00238309251388973","url":null,"abstract":"<p><p>Short tongue pronunciation (STP) refers to an enregistered set of pronunciation variants in Korean that is popularly thought to result from underdeveloped speech articulation, a physically shorter tongue, or intentional imitation of such a pronunciation. Because these explanations can be tied to children's speech, gender may play a role in mediating the social evaluation of STP. To investigate this, we carried out a between-subjects (<i>n</i> = 474) survey of beliefs and attitudes toward STP produced by adult men and women. The results confirmed that STP is a familiar concept to Korean speakers, referring primarily to the stopping, affrication, fronting, and tensification of obstruents, which resemble certain child-like speech patterns. While STP was generally perceived negatively, the gender of both the listener and the imagined talker influenced its evaluation. Stopping and affrication were more frequently associated with female talkers, and male listeners perceived female STP, but not male STP, as cute in certain sociopragmatic contexts. In contrast, fronting, which was more frequently associated with male talkers, was regarded as an innate speech deficit and consistently evaluated negatively, regardless of talker gender. 
These findings highlight the complex interplay of sociophonetic perception and gendered expectations in Korean.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251388973"},"PeriodicalIF":1.1,"publicationDate":"2025-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145702909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}