Production of the English /ɹ/ by Mandarin-English Bilingual Speakers.
Pub Date: 2025-12-01; Epub Date: 2024-03-10; DOI: 10.1177/00238309241230895; Language and Speech, pp. 794-831
Shuwen Chen, D H Whalen, Peggy Pik Ki Mok
Rhotic sounds are some of the most challenging sounds for L2 learners to acquire. This study investigates the production of English rhotic sounds by Mandarin-English bilinguals with two English proficiency levels. The production of the English /ɹ/ by 17 Mandarin-English bilinguals was examined with ultrasound imaging and compared with the production of native English speakers. The ultrasound data show that bilinguals can produce native-like bunched and retroflex gestures, but the distributional pattern of tongue shapes in various contexts differs from that of native speakers. Acoustically, the English /ɹ/ produced by bilinguals had a higher F3 and F3-F2, as well as some frication noise in prevocalic /ɹ/, features similar to the Mandarin /ɹ/. Mandarin-English bilinguals did produce language-specific phonetic realizations for the English and Mandarin /ɹ/s. There was a positive correlation between language proficiency and English-specific characteristics of /ɹ/ by Mandarin-English bilinguals in both articulation and acoustics. Phonetic similarities facilitated rather than hindered L2 speech learning in production: Mandarin-English bilinguals showed better performance in producing the English /ɹ/ allophones that were more similar to the Mandarin /ɹ/ (syllabic and postvocalic /ɹ/s) than producing the English /ɹ/ allophone that was less similar to the Mandarin /ɹ/ (prevocalic /ɹ/). This study contributes to our understanding of the mechanism of speech production in late bilinguals.
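As a rough illustration of the acoustic measures named above (F3 and the F3-F2 difference), the following sketch pulls formant values at a single measurement point with the praat-parselmouth library. The file name, the measurement time, and the tracker settings are illustrative assumptions, not the study's analysis settings.

```python
# Minimal sketch: extracting F3 and the F3-F2 difference at one time point
# with praat-parselmouth. File name, measurement time, and tracker settings
# are illustrative assumptions, not the settings used in the study.
import parselmouth

snd = parselmouth.Sound("r_token.wav")           # hypothetical recording of one /r/ token
formants = snd.to_formant_burg(
    time_step=0.01,                              # analysis frame every 10 ms
    max_number_of_formants=5,
    maximum_formant=5000.0,                      # assumed ceiling for adult male speech
)

t = 0.12                                         # assumed measurement point (s), e.g., /r/ midpoint
f2 = formants.get_value_at_time(2, t)            # F2 in Hz
f3 = formants.get_value_at_time(3, t)            # F3 in Hz
print(f"F2 = {f2:.0f} Hz, F3 = {f3:.0f} Hz, F3-F2 = {f3 - f2:.0f} Hz")
```

Under this reading, a lower F3 and a smaller F3-F2 at the /ɹ/ constriction would pattern with the native English productions described above, while higher values would pattern with the Mandarin-like realizations.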
{"title":"Production of the English /ɹ/ by Mandarin-English Bilingual Speakers.","authors":"Shuwen Chen, D H Whalen, Peggy Pik Ki Mok","doi":"10.1177/00238309241230895","DOIUrl":"10.1177/00238309241230895","url":null,"abstract":"<p><p>Rhotic sounds are some of the most challenging sounds for L2 learners to acquire. This study investigates the production of English rhotic sounds by Mandarin-English bilinguals with two English proficiency levels. The production of the English /ɹ/ by 17 Mandarin-English bilinguals was examined with ultrasound imaging and compared with the production of native English speakers. The ultrasound data show that bilinguals can produce native-like bunched and retroflex gestures, but the distributional pattern of tongue shapes in various contexts differs from that of native speakers. Acoustically, the English /ɹ/ produced by bilinguals had a higher F3 and F3-F2, as well as some frication noise in prevocalic /ɹ/, features similar to the Mandarin /ɹ/. Mandarin-English bilinguals did produce language-specific phonetic realizations for the English and Mandarin /ɹ/s. There was a positive correlation between language proficiency and English-specific characteristics of /ɹ/ by Mandarin-English bilinguals in both articulation and acoustics. Phonetic similarities facilitated rather than hindered L2 speech learning in production: Mandarin-English bilinguals showed better performance in producing the English /ɹ/ allophones that were more similar to the Mandarin /ɹ/ (syllabic and postvocalic /ɹ/s) than producing the English /ɹ/ allophone that was less similar to the Mandarin /ɹ/ (prevocalic /ɹ/). This study contributes to our understanding of the mechanism of speech production in late bilinguals.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"794-831"},"PeriodicalIF":1.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140095028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gestural Timing Patterns of Nasality in Highly Proficient Spanish Learners of English: Aerodynamic Evidence.
Pub Date: 2025-12-01; Epub Date: 2023-12-29; DOI: 10.1177/00238309231215355; Language and Speech, pp. 766-793
Ander Beristain
Segment-to-segment timing overlap between Vowel-Nasal gestures in /VN/ sequences varies cross-linguistically. However, how bilinguals may adjust those timing gestures is still unanswered. Regarding timing strategies in a second language (L2), research finds that native (L1) strategies can be partially transferred to the L2, and that higher L2 proficiency promotes more successful phonetic performance. My goal is to answer whether bilingual speakers can adjust their L1 coarticulatory settings in their L2 and to observe whether their L2 accentedness plays a role in ultimate attainment. Ten native speakers of Spanish (L1Sp) who were highly proficient L2 English speakers participated in Spanish and English read-aloud tasks. A control group of 16 L1 English speakers undertook the English experiment. Aerodynamic data were collected using pressure transducers. Each participant produced tokens with nasalized vowels in CVN# words and oral vowels in CV(CV) words. Four linguistically trained judges (two per target language) evaluated a set of pseudo-randomized sentences produced by the participants containing words with nasalized vowels and rated the speech on a 1 (heavily accented) to 9 (native-like) Likert-type scale. Measurements of nasality onset and overall nasality degree were obtained. Results indicate that the L1Sp group can accommodate gestural timing strategies cross-linguistically, exhibiting an earlier nasality onset and an increased nasality proportion in L2 English in a native-like manner. In addition, a positive correlation was found between greater vowel nasality degree and native-like accentedness in the L2, suggesting that L2 timing settings might be specified at higher spoken proficiency levels.
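The nasality measures mentioned above could be operationalized in several ways; the sketch below is only one hedged possibility, assuming time-aligned oral and nasal airflow arrays for a single vowel, an onset criterion of 10% of peak nasal flow, and a nasality proportion defined as nasal flow over total flow. None of these choices are taken from the paper.

```python
# Rough sketch (not the paper's procedure): locating a nasality onset and an
# overall nasality proportion from synchronized oral/nasal airflow arrays
# covering one vowel interval. Threshold and formula are assumptions.
import numpy as np

fs = 1000                                   # assumed sampling rate of the aerodynamic channels (Hz)
oral = np.abs(np.random.randn(300))         # placeholder oral-flow samples for one vowel
nasal = np.abs(np.random.randn(300)) * 0.3  # placeholder nasal-flow samples

threshold = 0.1 * nasal.max()               # assumed onset criterion: 10% of peak nasal flow
onset_idx = int(np.argmax(nasal > threshold))
nasality_onset_ms = 1000 * onset_idx / fs   # onset of nasality relative to vowel start

# One plausible "degree of nasality": nasal flow as a share of total flow.
nasality_proportion = nasal.sum() / (nasal.sum() + oral.sum())
print(f"onset = {nasality_onset_ms:.0f} ms, proportion = {nasality_proportion:.2f}")
```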
{"title":"Gestural Timing Patterns of Nasality in Highly Proficient Spanish Learners of English: Aerodynamic Evidence.","authors":"Ander Beristain","doi":"10.1177/00238309231215355","DOIUrl":"10.1177/00238309231215355","url":null,"abstract":"<p><p>Segment-to-segment timing overlap between Vowel-Nasal gestures in /VN/ sequences varies cross-linguistically. However, how bilinguals may adjust those timing gestures is still unanswered. Regarding timing strategies in a second language (L2), research finds that native (L1) strategies can be partially transferred to the L2, and that higher L2 proficiency promotes a more successful phonetic performance. My goal is to answer whether bilingual speakers can adjust their L1 coarticulatory settings in their L2 and to observe whether their L2 accentedness plays a role in ultimate attainment. Ten native speakers of Spanish (L1Sp) who were highly proficient L2 English speakers participated in Spanish and English read-aloud tasks. A control group of 16 L1 English speakers undertook the English experiment. Aerodynamic data were collected using pressure transducers. Each participant produced tokens with nasalized vowels in CVN# words and oral vowels in CV(CV) words. Four linguistically trained judges (two per target language) evaluated a set of pseudo-randomized sentences produced by the participants containing words with nasalized vowels and rated the speech on a 1 (heavily accented) to 9 (native-like) Likert-type scale. Measurements for onset and degree of overall nasality were obtained. Results indicate the L1Sp group can accommodate gestural timing strategies cross-linguistically as they exhibit an earlier nasality onset and increment nasality proportion in L2 English in a native-like manner. In addition, a positive correlation between greater vowel nasality degree and native-like accentedness in the L2 was found, suggesting L2 timing settings might be specified in higher spoken proficiency levels.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"766-793"},"PeriodicalIF":1.1,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139059054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Just How Contrastive Is Word-Initial Consonant Length? Exploring the Itunyoso Triqui Spontaneous Speech Corpus.
Pub Date: 2025-11-30; DOI: 10.1177/00238309251370980
Christian DiCanio, Jared Sharp
Itunyoso Triqui (Otomanguean: Mexico) has a typologically uncommon contrast between singleton and geminate consonants which occurs only in word-initial position of monosyllabic words. In this paper, we examine how functional factors contribute to the observed phonetic variation in production for this marked contrast. Geminate and singleton onsets are not equally distributed in the language: singleton onsets greatly outnumber geminate onsets. Moreover, the distribution of geminate and singleton onsets varies by manner of articulation and consonant onset. Functional factors also vary across the contrast space. In our first study, we focus on durational data from a smaller, 2-hr corpus and test the degree to which functional factors (Shannon entropy, functional status, lexical competitor size, and segment frequency) influence the production of the contrast. With the exception of entropy, we find that several of these factors play a role in predicting the robustness/hyperarticulation of the contrast realization. Content words with onset singleton obstruents are more likely to be lengthened than content words with onset singleton sonorants. Segments with a larger token frequency from a larger, 90K-word corpus are more likely to be hyperarticulated. In our second study, we examine how the observed durational factors lead to differential patterns of consonant undershoot by examining patterns of lenition. Combined, our findings demonstrate how functional factors influence language variation and may lead toward particular diachronic trajectories in the evolution of rare sound contrasts like these in human language.
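For readers unfamiliar with the first functional factor listed above, Shannon entropy over a distribution of onset types is H = -sum_i p_i * log2(p_i). The sketch below computes it for a toy onset distribution; the categories and counts are invented for illustration, not Triqui corpus figures.

```python
# Minimal sketch: Shannon entropy H = -sum(p_i * log2 p_i) over a distribution
# of onset types. The counts below are toy values, not Triqui corpus figures.
from math import log2

onset_counts = {"singleton_obstruent": 900, "singleton_sonorant": 600, "geminate": 150}  # toy counts
total = sum(onset_counts.values())

entropy = -sum((n / total) * log2(n / total) for n in onset_counts.values())
print(f"H = {entropy:.3f} bits")   # higher H = onset types more evenly distributed
```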
{"title":"Just How Contrastive Is Word-Initial Consonant Length? Exploring the Itunyoso Triqui Spontaneous Speech Corpus.","authors":"Christian DiCanio, Jared Sharp","doi":"10.1177/00238309251370980","DOIUrl":"https://doi.org/10.1177/00238309251370980","url":null,"abstract":"<p><p>Itunyoso Triqui (Otomanguean: Mexico) has a typologically uncommon contrast between singleton and geminate consonants which occurs <i>only</i> in word-initial position of monosyllabic words. In this paper, we examine how functional factors contribute to the observed phonetic variation in production for this marked contrast. Geminate and singleton onsets are not equally distributed in the language-singleton onsets greatly outnumber geminate onsets. Moreover, the distribution of geminate and singleton onsets varies by manner of articulation and consonant onset. Functional factors also vary across the contrast space. In our first study, we focus on durational data from a smaller, 2-hr corpus and test the degree to which functional factors (Shannon entropy, functional status, lexical competitor size, and segment frequency) influence the production of the contrast. With the exception of entropy, we find that several of these factors play a role in predicting the robustness/hyperarticulation of the contrast realization. Content words with onset singleton obstruents are more likely to be lengthened than content words with onset singleton sonorants. Segments with a larger token frequency from a larger, 90K word corpus are more likely to be hyperarticulated. In our second study, we examine how the observed durational factors lead to differential patterns of consonant undershoot by examining patterns of lenition. Combined, our findings demonstrate how functional factors influence language variation and may lead toward particular diachronic trajectories in the evolution of rare sound contrasts like these in human language.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251370980"},"PeriodicalIF":1.1,"publicationDate":"2025-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145642451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biases for Vowel Harmony Over Disharmony in Phoneme Monitoring.
Pub Date: 2025-11-27; DOI: 10.1177/00238309251378475
Sara Finley
Previous research exploring the learnability of vowel harmony (a phonetically natural pattern) and vowel disharmony (a phonetically unnatural pattern) has shown mixed evidence for a naturalness bias. This study aims to clarify these mixed results by introducing a more sensitive and indirect measure of learning: a modified phoneme monitoring task. Participants listened to CV-me/CV-mo words and pressed a button to indicate the final vowel (either [e] or [o]). In the first set of trials, participants responded to words that either always obeyed harmony (HarmonyFirst) or always obeyed disharmony (DisharmonyFirst). In the second set of trials, the rule switched. Results from two studies support a learning bias for vowel harmony; participants generally showed greater decreases in response times for harmonic blocks, and greater increases in response time when the rule switched from vowel harmony to disharmony.
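The response-time measures described above can be made concrete with a small sketch: the speed-up within a block and the slow-down at the rule switch. The numbers below are placeholders, not data from the studies.

```python
# Sketch of the response-time measures described above: within-block RT change
# and the RT increase at the rule switch. Values and labels are placeholders.
from statistics import mean

# Mean RT (ms) per trial bin, invented for illustration.
harmony_block = [720, 700, 680, 660]       # RTs falling across a harmonic block
post_switch_block = [730, 725, 715, 710]   # RTs after the rule switches to disharmony

within_block_speedup = harmony_block[0] - harmony_block[-1]       # 60 ms faster by block end
switch_cost = mean(post_switch_block) - mean(harmony_block[-2:])  # slowdown after the switch
print(within_block_speedup, round(switch_cost, 1))
```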
{"title":"Biases for Vowel Harmony Over Disharmony in Phoneme Monitoring.","authors":"Sara Finley","doi":"10.1177/00238309251378475","DOIUrl":"https://doi.org/10.1177/00238309251378475","url":null,"abstract":"<p><p>Previous research exploring the learnability of vowel harmony (a phonetically natural pattern) and vowel disharmony (a phonetically unnatural pattern) has shown mixed evidence for a naturalness bias. This study aims to clarify these mixed results by introducing a more sensitive and indirect measure of learning-a modified phoneme monitoring task. Participants listened to CV-me/CV-mo words and pressed a button to indicate the final vowel (either [e] or [o]). In the first set of trials, participants responded to words that either always obeyed harmony (HarmonyFirst) or always obeyed disharmony (DisharmonyFirst). In the second set of trials, the rule switched. Results from two studies support a learning bias for vowel harmony; participants generally showed greater decreases in response times for harmonic blocks, and greater increases in response time when the rule switched from vowel harmony to disharmony.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251378475"},"PeriodicalIF":1.1,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145642431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Disfluencies in Public and Private Speech.
Pub Date: 2025-11-10; DOI: 10.1177/00238309251380518
Darinka Verdonik, Peter Rupnik, Nikola Ljubešić
This study investigates how speakers adapt their use of disfluencies in public versus private speech settings. Existing studies suggest systematic differences in disfluency rates, depending on who we are communicating with, how interactive the communication is, how difficult the topic is, whether the interaction is broadcast or not, and whether the speech is pre-scripted or not. We aim to improve this understanding through an analysis of Slovenian, using data from the Training Corpus of Spoken Slovenian ROG-Artur. We investigate whether quantitative differences in the use of disfluencies exist between private and public speech, and aim to explain these differences by examining the relationship between disfluency functions and the physical, social, cognitive, and other factors influencing communication behavior. Our results revealed significant differences in disfluency patterns: disfluencies in general are more frequent in private speech, whereas filled pauses, unrepaired pronunciations, and blocks are more common in public speech. We group disfluency functions into two general categories. In the contextual analysis, we interpret speakers as reducing disfluencies in public speech because of its high relevance, formal expectations, partial pre-scripting, time constraints, and advanced speaker skills, while the higher frequency of filled pauses, unrepaired pronunciations, and blocks in public speech reflects the impact of longer dialog turns, time constraints, and emotional stress. The findings of this study should be interpreted with caution, given the interpretative nature of qualitative analysis and the potential confounding effect of different speakers being involved in the public and private speech samples.
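As a minimal illustration of the kind of quantitative comparison involved, the sketch below computes a disfluency rate per 1,000 words for each setting. The counts are placeholders, not figures from the ROG-Artur corpus.

```python
# Minimal sketch: disfluency rate per 1,000 words in each speech setting.
# Counts are placeholders, not figures from the ROG-Artur corpus.
def rate_per_1000(disfluency_count, word_count):
    return 1000 * disfluency_count / word_count

private = {"disfluencies": 4200, "words": 60000}   # toy counts
public = {"disfluencies": 2600, "words": 55000}

print(f"private: {rate_per_1000(private['disfluencies'], private['words']):.1f} per 1,000 words")
print(f"public:  {rate_per_1000(public['disfluencies'], public['words']):.1f} per 1,000 words")
```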
{"title":"Disfluencies in Public and Private Speech.","authors":"Darinka Verdonik, Peter Rupnik, Nikola Ljubešić","doi":"10.1177/00238309251380518","DOIUrl":"10.1177/00238309251380518","url":null,"abstract":"<p><p>This study investigates how speakers adapt their use of disfluencies in public versus private speech settings. Existing studies suggest systematic differences in disfluency rates, depending on who we are communicating with, how interactive the communication is, how difficult the topic is, whether the interaction is broadcast or not, and whether the speech is pre-scripted or not. We aim to improve this understanding through analysis in the Slovenian language, using data from the Training Corpus of Spoken Slovenian ROG-Artur. We investigate whether quantitative differences in the use of disfluency exist between private and public speech, and aim to explain these differences by investigating the relationship between disfluency functions and the physical, social, cognitive, and other factors influencing communication behavior. Our results revealed significant differences in disfluency patterns: disfluencies, in general, are more frequent in private speech, whereas filled pauses, unrepaired pronunciations and blocks, are more common in public speech. We group disfluency functions into two general categories. In contextual analysis, we interpret that speakers reduce disfluencies in public speech due to its high relevance, formal expectations, partial pre-scripting, time constraints and advanced speaker skills, while the higher frequency of filled pauses, unrepaired pronunciations and blocks in public speech reflect the impact of longer dialog turns, time constraints and emotional stress. The findings of this study should be interpreted with caution, given the interpretative nature of qualitative analysis and the potential confounding effect of the involvement of different speakers in the public and private speech samples.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251380518"},"PeriodicalIF":1.1,"publicationDate":"2025-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145490873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Role of Prosody in Expressing Subjective and Objective Causality in English.
Pub Date: 2025-11-02; DOI: 10.1177/00238309251369482
Na Hu, Aoju Chen, Hugo Quené, Ted J M Sanders
Numerous studies have established that prosody plays an important role in expressing meanings and functions. However, it remains unknown whether prosody is employed to convey the distinction between subjective causality (CLAIM-ARGUMENT) and objective causality (CONSEQUENCE-CAUSE). This study aimed to address this issue in English, where both types of causality are typically expressed using the same connective. Two production experiments were conducted, focusing on causality in backward order (Q "because" P) and in forward order (P "so" Q), respectively. The results show that subjective causality exhibited a larger F0 range, less integrated prosody, and a distinctive F0 contour shape compared with objective causality. These findings highlight the role of prosody in expressing subjective and objective causality in the absence of explicit lexical markers in English.
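One common way to express an F0 range of the kind reported above is in semitones, 12 * log2(f0_max / f0_min). The sketch below applies that conversion to invented Hz values; the semitone convention is an assumption here, not necessarily the measure used in the study.

```python
# Sketch: expressing an utterance's F0 range in semitones, 12 * log2(max/min).
# The Hz values are invented; the semitone convention is one common choice,
# not necessarily the measure used in the study.
from math import log2

f0_min, f0_max = 110.0, 205.0                 # toy F0 extrema (Hz) for one clause
f0_range_st = 12 * log2(f0_max / f0_min)
print(f"F0 range = {f0_range_st:.1f} semitones")
```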
{"title":"The Role of Prosody in Expressing Subjective and Objective Causality in English.","authors":"Na Hu, Aoju Chen, Hugo Quené, Ted J M Sanders","doi":"10.1177/00238309251369482","DOIUrl":"https://doi.org/10.1177/00238309251369482","url":null,"abstract":"<p><p>Numerous studies have established that prosody plays an important role in expressing meanings and functions. However, it remains unknown whether prosody is employed to convey the distinction between subjective causality (CLAIM-ARGUMENT) and objective causality (CONSEQUENCE-CAUSE). This study aimed to address this issue in English, where both types of causality are typically expressed using the same connective. Two production experiments were conducted, focusing on causality in backward order (<i>Q</i> \"because\" <i>P</i>) and in forward order (<i>P</i> \"so\" <i>Q</i>), respectively. The results show that subjective causality exhibited a larger F0 range, less integrated prosody, and a distinctive F0 contour shape compared with objective causality. These findings highlight the role of prosody in expressing subjective and objective causality in the absence of explicit lexical markers in English.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251369482"},"PeriodicalIF":1.1,"publicationDate":"2025-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145433005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning Accurate Onset Clusters: Perception Lags Behind Production.
Pub Date: 2025-10-18; DOI: 10.1177/00238309251362881
Claire Moore-Cantwell, Anne-Michelle Tessier, Ashley Farris-Trimble
This study investigates young school-aged children's knowledge (at 4-7 years) of accurate English word-initial onset clusters. By this age, we expect children to be mostly accurate in producing #CC clusters (rather than repairing them with deletion or epenthesis). We ask how well they can recognize and reject cluster repair errors, in both real and nonce word tasks. The results suggest that these learners' cluster judgment skills lag behind their cluster production abilities, and that asymmetries in error types do not, overall, align between the two domains. Perceptual errors are made most often when comparing clusters with epenthesis repairs, not deletion, and the cluster's sonority profile does not directly influence error rates. After comparing these findings with similar results from adult L2 English speakers, we discuss the ways in which issues like recoverability, salience, and contiguity can account for our findings. We also suggest that more work on phonological knowledge and judgments in older children will provide a broader understanding of sound pattern acquisition across development.
{"title":"Learning Accurate Onset Clusters: Perception Lags Behind Production.","authors":"Claire Moore-Cantwell, Anne-Michelle Tessier, Ashley Farris-Trimble","doi":"10.1177/00238309251362881","DOIUrl":"https://doi.org/10.1177/00238309251362881","url":null,"abstract":"<p><p>This study investigates young school-aged children's knowledge (at 4-7 years) of accurate English word-initial onset clusters. By this age, we expect children to be mostly accurate in producing #CC clusters (rather than repairing them with deletion or epenthesis). We ask how well can they recognize and reject cluster repair errors, in both real and nonce word tasks. The results suggest that these learners' cluster judgment skills lag behind their cluster production abilities, and that asymmetries in error types do not overall align between the two domains. Perceptual errors are made most often when comparing clusters with epenthesis repairs, not deletion, and the cluster's sonority profile does not directly influence error rates. After comparing these findings with similar results from adult L2 English speakers as well, we discuss the ways in which issues like recoverability, salience, and contiguity can account for our findings. We also suggest that more work on phonological knowledge and judgments in older children will provide a broader understanding of sound pattern acquisition across development.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251362881"},"PeriodicalIF":1.1,"publicationDate":"2025-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145318725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
French Speakers Prefer Prosody Over Statistics to Segment Speech.
Pub Date: 2025-10-09; DOI: 10.1177/00238309251374295
Joan Birulés, Mireia Marimon, Alexandre Duroyal, Anne Vilain, Gérard Bailly, Mathilde Fort
To segment words in unfamiliar speech, listeners are known to exploit both native prosodic cues and statistical cues available in the speech signal. However, how and when these cues are combined remains a matter of debate. Here, we studied how transitional probabilities (TPs) and prosodic phrasal boundaries are combined by French speakers to segment words. Since French does not have lexical stress, prosodic phrasal boundaries unambiguously signal word boundaries, providing a unique possibility to test whether prosodic cues can overcome statistical ones and constrain further statistically based segmentation. We tested French adults in an artificial speech segmentation task, manipulating the consistency between prosodic and TP cues, signaling either the same or different word boundaries. Results showed that participants favored prosodic phrasal boundaries over TPs, regardless of exposure time to the speech stream (Experiment 1: 3.5 min; Experiment 2: 7 min), supporting a prosodically driven statistical segmentation of the speech stream.
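Transitional probability is conventionally defined as TP(x -> y) = count(xy) / count(x). The sketch below computes forward TPs over a toy syllable stream standing in for the artificial language; high-TP transitions suggest word-internal positions, while low-TP transitions suggest candidate word boundaries.

```python
# Minimal sketch: forward transitional probabilities TP(x -> y) = count(xy) / count(x)
# over a syllable stream. The toy stream below stands in for the artificial language.
from collections import Counter

stream = ["tu", "pi", "ro", "go", "la", "bu", "tu", "pi", "ro", "pa", "do", "ti"]  # toy syllables
bigrams = Counter(zip(stream, stream[1:]))
unigrams = Counter(stream[:-1])

tp = {(x, y): c / unigrams[x] for (x, y), c in bigrams.items()}
print(tp[("tu", "pi")])   # 1.0: "pi" always follows "tu" -> likely word-internal
print(tp[("ro", "go")])   # 0.5: weaker transition -> candidate word boundary
```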
{"title":"French Speakers Prefer Prosody Over Statistics to Segment Speech.","authors":"Joan Birulés, Mireia Marimon, Alexandre Duroyal, Anne Vilain, Gérard Bailly, Mathilde Fort","doi":"10.1177/00238309251374295","DOIUrl":"https://doi.org/10.1177/00238309251374295","url":null,"abstract":"<p><p>To segment words in unfamiliar speech, listeners are known to exploit both native prosodic cues and statistical cues available in the speech signal. However, how and when these cues are combined remains a matter of debate. Here, we studied how transitional probabilities (TPs) and prosodic phrasal boundaries are combined by French speakers to segment words. Since French does not have lexical stress, prosodic phrasal boundaries unambiguously signal word boundaries, providing a unique possibility to test whether prosodic cues can overcome statistical ones, and constrain further statistically based segmentation. We tested French adults in an artificial speech segmentation task, manipulating the consistency between prosodic and TP cues, signaling either the same or different word boundaries. Results showed that participants favored prosodic phrasal boundaries over TPs, regardless of exposure time to the speech stream (Experiment 1: 3.5 minutes; Experiment 2: 7 min), supporting a prosodically driven statistical segmentation of the speech stream.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251374295"},"PeriodicalIF":1.1,"publicationDate":"2025-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145259807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Effects of Perceived Ethnicity and Prosodic Accuracy on Intelligibility, Comprehensibility, and Accentedness in L2 Mandarin Chinese.
Pub Date: 2025-09-16; DOI: 10.1177/00238309251361010
Robert Squizzero
Separate traditions of research have examined the impact of linguistic factors and social factors on the intelligibility, comprehensibility, and accentedness of second language (L2) speech, but studies that simultaneously investigate social and linguistic factors are rarely conducted on L2 languages other than English and outside of Western social and cultural environments. This study explores the effects of utterance-level prosody and speaker ethnicity on perception of L2 Mandarin Chinese speech. First language (L1) Mandarin listeners (n = 292) were asked to select the correct transcriptions of each of six sentences spoken by two male L2 Mandarin speakers who differed in their prosodic accuracy. While listening to each set of sentences, a picture of an Asian face or a White face was displayed on the listener's screen. Results indicate that participants were significantly more likely to select the correct transcription of each sentence both when they heard the speaker with high prosodic accuracy and when they believed that the speaker was ethnically Chinese. Listeners also rated speakers' comprehensibility, accentedness, and perceived personal characteristics; listeners rated a speaker with higher prosodic accuracy or believed to be ethnically Chinese as more comprehensible, less accented, and higher on perceived personal characteristics. This study demonstrates that a link between linguistic and social factors exists in processing L2 speech, even outside of the social, cultural, and linguistic environments typically used as a setting for investigation of L2 speech perception, and it explores implications for L2 Mandarin pronunciation teaching.
{"title":"The Effects of Perceived Ethnicity and Prosodic Accuracy on Intelligibility, Comprehensibility, and Accentedness in L2 Mandarin Chinese.","authors":"Robert Squizzero","doi":"10.1177/00238309251361010","DOIUrl":"https://doi.org/10.1177/00238309251361010","url":null,"abstract":"<p><p>Separate traditions of research have examined the impact of linguistic factors and social factors on the intelligibility, comprehensibility, and accentedness of second language (L2) speech, but studies that simultaneously investigate social and linguistic factors are rarely conducted on L2 languages other than English and outside of Western social and cultural environments. This study explores the effects of utterance-level prosody and speaker ethnicity on perception of L2 Mandarin Chinese speech. First language (L1) Mandarin listeners (<i>n</i> = 292) were asked to select the correct transcriptions of each of six sentences spoken by two male L2 Mandarin speakers who differed in their prosodic accuracy. While listening to each set of sentences, a picture of an Asian face or a White face was displayed on the listener's screen. Results indicate that participants were significantly more likely to select the correct transcription of each sentence both when they heard the speaker with high prosodic accuracy and when they believed that the speaker was ethnically Chinese. Listeners also rated speakers' comprehensibility, accentedness, and perceived personal characteristics; listeners rated a speaker with higher prosodic accuracy or believed to be ethnically Chinese as more comprehensible, less accented, and higher on perceived personal characteristics. This study demonstrates that a link between linguistic and social factors exists in processing L2 speech, even outside of the social, cultural, and linguistic environments typically used as a setting for investigation of L2 speech perception, and it explores implications for L2 Mandarin pronunciation teaching.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251361010"},"PeriodicalIF":1.1,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145070813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Perception of Lexical Pitch Accent in South Kyungsang Korean: The Relevance of Accent Shape.
Pub Date: 2025-09-11; DOI: 10.1177/00238309251368294
Hyunjung Joo, Mariapaola D'Imperio
In this study, we tested whether the perception of pitch contours within a lexical pitch accent is better understood through tonal targets, as in the Autosegmental-Metrical (AM) model, or as identification of an entire tonal configuration. Specifically, a categorization experiment was conducted to see how South Kyungsang Korean (SKK) listeners perceive their high (H) and rising (LH) lexical pitch accents. Auditory stimuli were manipulated in H peak alignment (earlier vs. later), rise shape (domed or "convex" vs. scooped or "concave"), or segmental duration (shorter vs. longer). Results showed that F0 rise shape and segmental duration influenced SKK listeners' categorization, while no effect of peak alignment was observed. Specifically, listeners categorized more scooped shapes as LH, while more domed shapes were mainly assigned H responses. Moreover, shorter duration induced an H categorization, while longer duration was associated with an LH. The results suggest that SKK listeners use both F0 shape and segmental duration as important cues for the tonal contrast, though F0 shape shows a stronger categorical effect than duration. Thus, F0 shape information is important for determining the phonological representation of lexical pitch accents, as opposed to the strict tonal alignment defined in Autosegmental-Metrical theory.
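To visualize the rise-shape manipulation described above, the sketch below generates a "domed" (convex, front-loaded) and a "scooped" (concave, back-loaded) F0 rise between the same L and H targets using an assumed power-law shaping; the exponents and Hz values are illustrative only, not the stimulus-construction method.

```python
# Toy illustration (not the stimulus-construction method): two F0 rises between
# the same L and H targets, one "domed" (convex, front-loaded) and one "scooped"
# (concave, back-loaded), shaped with an assumed power-law exponent.
import numpy as np

low_hz, high_hz = 180.0, 260.0                    # assumed L and H targets
t = np.linspace(0.0, 1.0, 11)                     # normalized time across the rise

domed = low_hz + (high_hz - low_hz) * t ** 0.5    # rises quickly, then flattens toward the peak
scooped = low_hz + (high_hz - low_hz) * t ** 2.0  # stays low, then rises late

print(np.round(domed, 1))
print(np.round(scooped, 1))
```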
{"title":"The Perception of Lexical Pitch Accent in South Kyungsang Korean: The Relevance of Accent Shape.","authors":"Hyunjung Joo, Mariapaola D'Imperio","doi":"10.1177/00238309251368294","DOIUrl":"https://doi.org/10.1177/00238309251368294","url":null,"abstract":"<p><p>In this study, we tested whether the perception of pitch contours within a lexical pitch accent can be better understood through tonal targets in the Autosegmental-Metrical (AM) model or as an entire tonal configuration identification. Specifically, a categorization experiment was conducted to see how South Kyungsang Korean (SKK) listeners perceive their high (H) and rising (LH) lexical pitch accents. Auditory stimuli were manipulated depending on H peak alignment (earlier vs. later), rise shape (domed or \"convex\" vs. scooped or \"concave\"), or segmental duration (shorter vs. longer). Results showed that F0 rise shape and segmental duration influenced SKK listeners' categorization, while no effect of peak alignment was observed. Specifically, they responded to more scooped shapes as an LH, while more domed shapes were mainly assigned to H responses. Moreover, shorter duration induced a H categorization, while longer duration was related to an LH. Results suggest that SKK listeners use both F0 shape and segmental duration as important cues for tonal contrast, though F0 shape shows stronger categorical effect than duration. Thus, F0 shape information is important to determine phonological representation of lexical pitch accents, as opposed to strict tonal alignment defined in Autosegmental-Metrical theory.</p>","PeriodicalId":51255,"journal":{"name":"Language and Speech","volume":" ","pages":"238309251368294"},"PeriodicalIF":1.1,"publicationDate":"2025-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145034606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}