Pub Date: 2025-03-01. Epub Date: 2024-06-14. DOI: 10.1177/00238309241258162
Ronny Bujok, Antje S Meyer, Hans Rutger Bosker
Human communication is inherently multimodal: not only auditory speech but also visual cues can be used to understand another talker. Most studies of audiovisual speech perception have focused on the perception of speech segments (i.e., speech sounds), but less is known about the influence of visual information on the perception of suprasegmental aspects of speech such as lexical stress. In two experiments, we investigated the influence of different visual cues (e.g., facial articulatory cues and beat gestures) on the audiovisual perception of lexical stress. We presented auditory lexical stress continua of disyllabic Dutch stress pairs together with videos of a speaker producing stress on the first or second syllable (e.g., articulating VOORnaam or voorNAAM). Moreover, we combined and fully crossed the face of the speaker producing lexical stress on either syllable with a gesturing body producing a beat gesture on either the first or second syllable. Results showed that people successfully used visual articulatory cues to stress in muted videos. However, in audiovisual conditions, we found no effect of visual articulatory cues. In contrast, the temporal alignment of beat gestures with speech robustly influenced participants' perception of lexical stress. These results highlight the importance of considering suprasegmental aspects of language in multimodal contexts.
"Audiovisual Perception of Lexical Stress: Beat Gestures and Articulatory Cues." Language and Speech, pp. 181-203. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11831865/pdf/
Pub Date: 2025-03-01. Epub Date: 2024-07-27. DOI: 10.1177/00238309241261702
Ricky K W Chan, Bruce Xiao Wang
Fundamental frequency (F0) has been widely studied and used in the context of speaker discrimination and forensic voice comparison casework, but most previous studies have focused on long-term F0 statistics. Lexical tone, the linguistically structured and dynamic aspect of F0, has received much less research attention. A central methodological issue lies in how tonal F0 should be parameterized for the best speaker discrimination performance. This paper compares the speaker discriminatory performance of three approaches to lexical tone modeling: discrete cosine transform (DCT), polynomial curve fitting, and quantitative target approximation (qTA). Results show that parameters based on DCT and polynomials led to similarly promising performance, whereas those based on qTA generally yielded relatively poor performance. Implications of modeling surface tonal F0 and the underlying articulatory processes for speaker discrimination are discussed.
"Modeling Lexical Tones for Speaker Discrimination." Language and Speech, pp. 229-243.
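Not part of the study's code, but as a rough illustration of the first two parameterizations the abstract names, one can extract DCT and polynomial coefficients from a sampled F0 contour. The contour values below are invented for illustration:

```python
import numpy as np
from scipy.fft import dct

# Hypothetical F0 contour (Hz), 10 equally spaced samples across one tone.
f0 = np.array([210.0, 215.0, 222.0, 230.0, 236.0, 238.0, 235.0, 228.0, 218.0, 205.0])

# DCT parameterization: the first few DCT-II coefficients summarize the
# contour's mean level, slope, and curvature.
dct_coefs = dct(f0, type=2, norm="ortho")[:4]

# Polynomial parameterization: a cubic fit over normalized time plays the
# same role, with the fitted coefficients serving as features.
t = np.linspace(0.0, 1.0, len(f0))
poly_coefs = np.polyfit(t, f0, deg=3)
```

With `norm="ortho"`, the zeroth DCT coefficient is proportional to the contour's mean F0, which gives the low-order coefficients a direct phonetic interpretation.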
Pub Date: 2025-03-01. Epub Date: 2024-06-10. DOI: 10.1177/00238309241254350
Stephanie Kaucke, Marcel Schlechtweg
Previous research has shown that it is difficult for English speakers to distinguish the front rounded vowels /y/ and /ø/ from the back rounded vowels /u/ and /o/. In this study, we examine the effect of noise on this perceptual difficulty. In an Oddity Discrimination Task, English speakers without any knowledge of German were asked to discriminate between German-sounding pseudowords varying in the vowel both in quiet and in white noise at two signal-to-noise ratios (8 and 0 dB). In test trials, vowels of the same height were contrasted with each other, whereas a contrast with /a/ served as a control trial. Results revealed that a contrast with /a/ remained stable in every listening condition for both high and mid vowels. When contrasting vowels of the same height, however, there was a perceptual shift along the F2 dimension as the noise level increased. Although the /ø/-/o/ and particularly /y/-/u/ contrasts were the most difficult in quiet, accuracy on /i/-/y/ and /e/-/ø/ trials decreased immensely when the speech signal was masked. The German control group showed the same pattern, albeit less severe than the non-native group, suggesting that even in low-level tasks with pseudowords, there is a native advantage in speech perception in noise.
"English Speakers' Perception of Non-native Vowel Contrasts in Adverse Listening Conditions: A Discrimination Study on the German Front Rounded Vowels /y/ and /ø/." Language and Speech, pp. 162-180. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11831862/pdf/
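For readers unfamiliar with signal-to-noise ratios, masking conditions like the 8 and 0 dB ones above can be sketched in a few lines. This is a generic illustration, not the authors' stimulus-preparation code:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_white_noise(signal, snr_db):
    """Return signal mixed with white noise at the target SNR in dB."""
    noise = rng.standard_normal(len(signal))
    sig_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so 10*log10(sig_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(sig_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return signal + scale * noise

# A 1-second 440 Hz tone at 16 kHz stands in for a pseudoword recording.
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noisy_8db = add_white_noise(clean, 8.0)   # easier condition
noisy_0db = add_white_noise(clean, 0.0)   # harder: noise as intense as the speech
```

At 0 dB the noise carries as much power as the speech, which is why same-height vowel contrasts become so much harder to discriminate there.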
Pub Date: 2025-03-01. Epub Date: 2024-03-04. DOI: 10.1177/00238309241230625
Tsung-Ying Chen
The starting-small effect is a cognitive advantage in language acquisition that arises when learners begin by generalizing over regularities from structurally simpler, shorter tokens in a skewed input distribution. Our study explored this effect as a potential explanation for the biased learning of opaque and transparent vowel harmony. In opaque vowel harmony, feature agreement occurs strictly between adjacent vowels, and an intervening "neutral vowel" blocks long-distance vowel harmony. Thus, opaque vowel harmony could be acquired even if learners start with structurally simpler and more frequent disyllabic tokens. In contrast, transparent vowel harmony can only be observed in longer tokens demonstrating long-distance agreement by skipping a neutral vowel. Opaque vowel harmony is therefore predicted to be learned more efficiently, due to its compatibility with the local dependencies acquired via starting-small learning. In two artificial grammar learning experiments, learners were exposed to both vowel harmony patterns embedded either in an equal number of disyllabic and trisyllabic tokens or in a skewed distribution with twice as many disyllabic tokens. In Experiment I, learners' test performance suggests consistently biased learning of local, opaque vowel harmony under starting-small learning. Furthermore, in Experiment II, the acquired vowel harmony patterns varied significantly with working memory capacity under a balanced but not a skewed input distribution, presumably because starting-small learning eases the cognitive demand.
"The 'Starting-Small' Effect in Phonology: Evidence From Biased Learning of Opaque and Transparent Vowel Harmony." Language and Speech, pp. 3-35.
Pub Date: 2025-01-29. DOI: 10.1177/00238309241312983
Niveen Omar, Bracha Nir, Karen Banai
This study investigated the role of systematicity in word learning, focusing on Semitic morpho-phonology where words exhibit multiple levels of systematicity. Building upon previous research on phonological templates, we explored how systematicity based on such templates, whether they encode meanings or not, influenced word learning in preschool-age Hebrew-speaking children. We examined form-meaning systematicity, where words share phonological templates and carry similar categorical meanings of manner-of-motion (e.g., finupál and bizudáx carry the meaning of skipping), and form-only systematicity, where words are phonologically similar but do not share a meaning (e.g., finupál and bizudáx belong to different categories of manner-of-motion). We aimed to discern how these systematicity types impact the learning of the meaning of the word as a whole, that is, the encoding of visual form combined with manner-of-motion. Using novel Semitic-like stimuli, our experiments demonstrated that different types of systematicity involve different effects on word learning. Experiment 1 showed that form-meaning systematicity hindered the learning of the manner-of-motion. In contrast, Experiment 2 revealed that form systematicity facilitated learning these features. The findings suggest a complex interplay of top-down and bottom-up processes in word learning, expanding our understanding of systematicity in word learning.
"Effects of Systematicity on Word Learning in Preschool Children: The Case of Semitic Morpho-Phonology." Language and Speech, online first.
Pub Date: 2025-01-23. DOI: 10.1177/00238309241311230
Ghada Khattab, Tamar Keren-Portnoy
Semitic languages such as Hebrew and Arabic are known for having a non-concatenative morphology: words are typically built of a combination of a consonantal root, typically tri-consonantal (e.g., k-t-b "related to writing" in Modern Standard Arabic (MSA)), with a prosodic template. Research on Hebrew language development suggests early sensitivity to frequently occurring templates. For the Arabic dialects, little is known about whether implicit sensitivity to non-concatenative morphology develops at a young age through exposure to speech, and how templatic the spoken language is in comparison to MSA. We focus on Lebanese Arabic. We hypothesized that prolonged contact with French and English may have "diluted" the salience of roots and patterns in the input. We used three different corpora of adult-directed-speech (ADS), child-directed-speech (CDS), and child speech. We analyzed the root and pattern structures in the 50 most frequent Lebanese Arabic word types in each corpus. We found fewer words with templatic patterns than expected among the most frequent words in ADS (35/50), even fewer in CDS (23/50) and still fewer in the children's target words (15/50). In addition, only a minority contains three root consonants in their surface forms: 22 in ADS, 15 in CDS, and only 7 in words targeted by the children. We conclude that Semitic structure is less evident in either input to children or words targeted by children aged 1-3 than has been assumed. We discuss implications for the development of sensitivity to templatic structure among Lebanese-acquiring children.
"How Templatic Is Arabic Input to Children? The Role of Child-Directed-Speech in the Acquisition of Semitic Morpho-Phonology." Language and Speech, online first.
Pub Date: 2025-01-04. DOI: 10.1177/00238309241306748
Hoyoung Yi, Delaney DiCristofaro, Woonyoung Song
Adapting one's speaking style is particularly crucial as children start interacting with diverse conversational partners in various communication contexts. The study investigated the capacity of preschool children aged 3-5 years (n = 28) to modify their speaking styles in response to background noise, referred to as noise-adapted speech, and when talking to an interlocutor who pretended to have hearing loss, referred to as clear speech. We examined how two modified speaking styles differed across the age range. Prosody features of conversational, noise-adapted, and clear speech were analyzed, including F0 mean (Hz), F0 range (Hz), energy in 1-3 kHz range (dB), speaking rate (syllables per second), and the number of pauses. Preschoolers adjusted their prosody features in response to auditory feedback interruptions (i.e., noise-adapted speech), while developmental changes were observed across the age range for clear speech. To examine the functional effect of the modified hyper-speech produced by the preschoolers, speech intelligibility was also examined in adult listeners (n = 30). The study found that speech intelligibility was higher in noise-adapted speech than in conversational speech across the preschool age range. A noticeable increase in speech intelligibility for clear speech was observed with the increasing age of preschool talkers, aligning with the age-related enhancements in acoustic prosody for clear speech. The findings indicate that children progressively develop their ability to modify speech in challenging environments, initiating and refining adaptations to better accommodate their listeners.
"Prosodic Modifications to Challenging Communicative Environments in Preschoolers." Language and Speech, online first.
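The prosodic measures listed in this abstract are straightforward to compute from an F0 track and short-time spectra. The sketch below uses invented values and a hypothetical helper, not the study's analysis pipeline:

```python
import numpy as np

def band_energy_db(frame, sr, lo=1000.0, hi=3000.0):
    """Energy (dB) in the lo-hi Hz band of one Hann-windowed analysis frame."""
    power = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return 10.0 * np.log10(np.sum(power[(freqs >= lo) & (freqs <= hi)]) + 1e-12)

# Hypothetical per-frame F0 track (Hz); zeros mark unvoiced frames.
f0_track = np.array([0, 220, 230, 245, 250, 240, 0, 0, 210, 200], dtype=float)
voiced = f0_track[f0_track > 0]
f0_mean = voiced.mean()                  # F0 mean (Hz)
f0_range = voiced.max() - voiced.min()   # F0 range (Hz)

# Speaking rate: syllable count over utterance duration in seconds.
speaking_rate = 7 / 2.1  # e.g., 7 syllables in a 2.1 s utterance
```

Energy in the 1-3 kHz band is a common index of vocal effort, which is why it rises in both noise-adapted and clear speech.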
Pub Date: 2024-12-18. DOI: 10.1177/00238309241291536
Yingyi Luo, Holger Mitterer, Xiaolin Zhou, Yiya Chen
Listeners adjust their perception of sound categories when confronted with variations in speech. Previous research on speech recalibration has primarily focused on segmental variation, demonstrating that recalibration tends to be specific to individual speakers and situations and often persists over time. In this study, we present findings on the perceptual learning of lexical tone in Standard Chinese, a suprasegmental feature signaled primarily through pitch variations to distinguish morpheme/word meanings. Native speakers of Standard Chinese showed a recalibration of tone category boundaries immediately following exposure to ambiguous tonal pitch contours. However, this recalibration effect significantly weakened after 12 hours. Furthermore, participants trained at night did not exhibit delayed stabilization, a phenomenon commonly observed during sleep-induced consolidation. Our results replicate previous findings and provide new evidence suggesting that while our perceptual system can flexibly adapt to real-time sensory inputs, subsequent consolidation processes, such as those occurring during sleep, may exhibit selectivity and, under certain conditions, may be ineffective.
"Flexibility and Stability in Lexical Tone Recalibration: Evidence from Tone Perceptual Learning." Language and Speech, online first.
Pub Date: 2024-12-03. DOI: 10.1177/00238309241297703
Mitsuhiko Ota
Young children often produce non-target-like word forms in which non-adjacent consonants share a major place of articulation (e.g., [gɔgi] "doggy"). Termed child consonant harmony (CCH), this phenomenon has garnered considerable attention in the literature, primarily due to the apparent absence of analogous patterns in mature phonological systems. This study takes a close look at a potential account of CCH that is compatible with findings from adult word learning, serial recall, and phonological typology. According to this account, CCH is a response to memory pressure involved in remembering and retrieving multiple consonantal contrasts within a word. If this is the main motivation behind CCH, we would expect the resulting child forms to be biased toward full assimilation (i.e., consonant repetition) as it allows maximal reduction of phonolexical memory load. To test this prediction, children's productions of target words containing consonants that differ in both major place and manner were analyzed using two data sources: a single session sample from 40 children aged 1-2 years learning English, French, Finnish, Japanese, or Mandarin; and longitudinal samples from seven English-learning children between 1 and 3 years of age. Prevalence of consonant repetitions was robustly evidenced in early child forms, especially in those produced for target words with the structure CVCV(C). The results suggest that early word production is shaped by constraints on phonolexical memory.
"Child Consonant Harmony Revisited: The Role of Lexical Memory Constraints and Segment Repetition." Mitsuhiko Ota. Language and Speech, DOI: 10.1177/00238309241297703, Epub 2024-12-03.
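The distinction the study turns on — full assimilation (an identical consonant repeated) versus major-place harmony (different consonants sharing a place of articulation) — can be illustrated with a small sketch. This is not the authors' coding scheme; the place inventory, segment sets, and labels below are simplified assumptions for illustration.

```python
# Illustrative major-place map (simplified; real coding schemes are richer
# and language-specific). Each set holds example consonant symbols.
PLACE = {
    "labial": set("pbmfvw"),
    "coronal": set("tdnszlr"),
    "dorsal": set("kgŋ"),
}

def major_place(c):
    """Return the major place of articulation for a consonant, if known."""
    for place, members in PLACE.items():
        if c in members:
            return place
    return None

def classify(c1, c2):
    """Classify two non-adjacent consonants in a child form."""
    if c1 == c2:
        return "full assimilation (consonant repetition)"
    if major_place(c1) is not None and major_place(c1) == major_place(c2):
        return "place harmony only"
    return "no harmony"

# [gɔgi] for "doggy": both consonants are /g/, so full assimilation.
print(classify("g", "g"))  # full assimilation (consonant repetition)
print(classify("g", "k"))  # place harmony only (both dorsal)
print(classify("d", "g"))  # no harmony
```

Under the memory-load account, forms in the first category should dominate, since repeating a single consonant minimizes the number of contrasts that must be stored and retrieved.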
Pub Date: 2024-12-01 | Epub Date: 2023-11-10 | DOI: 10.1177/00238309231205012
Hannah L Goh, Fei Ting Woon, Scott R Moisik, Suzy J Styles
The standard Beijing variety of Mandarin has a clear alveolar-retroflex contrast for phonemes featuring voiceless sibilant frication (i.e., /s/, /ʂ/, /ts/, /tʂ/, /tsʰ/, /tʂʰ/). However, some studies show that varieties in the 'outer circle', such as in Taiwan, have a reduced contrast for these speech sounds via a process known as 'deretroflexion'. The variety of Mandarin spoken in Singapore is also considered 'outer circle', as it exhibits influences from Min Nan varieties. We investigated how bilinguals of Singapore Mandarin and English perceive and produce speech tokens in minimal pairs differing only in the alveolar/retroflex place of articulation. In all, 50 participants took part in two tasks. In Task 1, participants performed a lexical identification task for minimal pairs differing only in the alveolar/retroflex place of articulation, as spoken by native speakers of two varieties: Beijing Mandarin and Singapore Mandarin. No difference in comprehension of the words was observed between the two varieties, indicating that both varieties contain sufficient acoustic information for discrimination. In Task 2, participants read aloud from the list of minimal pairs while their voices were recorded. Acoustic analysis revealed that the phonemes do indeed differ acoustically in terms of center of gravity of the frication and in an alternative measure: long-term averaged spectra. The magnitude of this difference appears to be smaller than previously reported differences for the Beijing variety. These findings show that although some deretroflexion is evident in the speech of bilinguals of the Singaporean variety of Mandarin, it does not translate to ambiguity in the speech signal.
"Contrastive Alveolar/Retroflex Phonemes in Singapore Mandarin Bilinguals: Comprehension Rates for Articulations in Different Accents, and Acoustic Analysis of Productions." Hannah L Goh, Fei Ting Woon, Scott R Moisik, Suzy J Styles. Language and Speech, pp. 924-944, DOI: 10.1177/00238309231205012.
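The acoustic measure central to this analysis, spectral center of gravity (COG), is the power-weighted mean frequency of a spectrum: COG = Σ f·P(f) / Σ P(f). The sketch below is a minimal illustration of that formula, not the authors' analysis pipeline (a real fricative analysis would segment the frication interval, window it, and often apply pre-emphasis).

```python
import numpy as np

def spectral_cog(signal, sr):
    """Spectral center of gravity (first spectral moment) in Hz:
    the mean of the frequency axis weighted by spectral power."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return float(np.sum(freqs * power) / np.sum(power))

# Sanity check with pure tones: a 4 kHz tone has a higher COG than a
# 2 kHz tone, just as retroflex frication (energy concentrated lower)
# has a lower COG than alveolar frication.
sr = 16000
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 2000 * t)   # COG ≈ 2000 Hz
high = np.sin(2 * np.pi * 4000 * t)  # COG ≈ 4000 Hz
assert spectral_cog(high, sr) > spectral_cog(low, sr)
```

The paper's alternative measure, long-term averaged spectra, would instead average power spectra over many successive analysis frames before comparison, rather than summarizing each token as a single moment.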