Pub Date: 2025-03-26 DOI: 10.1016/j.wocn.2025.101410
Seung-Eun Kim, Sam Tilsen
In this study, we introduce an F0 modeling framework – which we refer to as the Gesture-Field-Register (GFR) framework – in which F0 production involves joint control of relatively generic intentions and how those intentions are mapped to physical F0 values. Building on Articulatory Phonology (AP) and Task Dynamics (TD), the GFR framework considers F0 gestures to be the fundamental units of F0 control. It further holds (i) that the dynamic target F0 state of a speaker is determined by the blending of F0 gestural targets in a planning field and (ii) that the gestural targets and dynamic targets are represented in normalized values which are converted to F0 in Hz via dynamic control of F0 register. We show how this framework accounts for a variety of empirical F0 patterns, and we present a case study that uses parameter optimization to analyze empirical F0 contours into a time series of gestural activation and register states. In doing so, we demonstrate that the framework allows for gestural targets to be invariant within an utterance, despite the fact that the surface contours are highly variable. Model code and examples for generating and fitting F0 contours are publicly available in GitHub and OSF repositories. Overall, the GFR framework provides a novel way of conceptualizing and modeling F0 control under AP/TD and further expands AP/TD by incorporating the mechanisms of a planning field and dynamic register control.
Title: The Gesture-Field-Register (GFR) framework for modeling F0 control. Journal of Phonetics, Volume 110, Article 101410.
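The two claims in the GFR abstract, blending of gestural targets and conversion of normalized values to Hz through a register, can be illustrated with a minimal sketch. This is not the authors' published model code. It assumes that blending is an activation-weighted average and that the register is described by a floor and a span in Hz, and it omits the dynamics by which F0 moves toward its dynamic target. All function names and numeric values are hypothetical.

```python
def blend_targets(targets, activations):
    """Activation-weighted average of normalized gestural targets (assumed blending rule)."""
    total = sum(activations)
    if total == 0:
        return None  # no active gesture: the dynamic target is undefined in this sketch
    return sum(t * a for t, a in zip(targets, activations)) / total


def to_hz(normalized, floor_hz, span_hz):
    """Map a normalized value onto a register given by its floor and span in Hz."""
    return floor_hz + normalized * span_hz


# Two gestures whose normalized targets stay fixed: high = 1.0, low = 0.0.
targets = [1.0, 0.0]

# Hypothetical time series: activation shifts from the high to the low gesture
# while the register floor is lowered.
activations = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
floors_hz = [120.0, 110.0, 100.0]
span_hz = 80.0

contour_hz = [
    to_hz(blend_targets(targets, acts), floor, span_hz)
    for acts, floor in zip(activations, floors_hz)
]
print(contour_hz)
```

In this sketch the gestural targets never change, yet the resulting values in Hz differ at every step because activation and register vary, which is the kind of invariance the abstract describes.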
Pub Date: 2025-03-20 DOI: 10.1016/j.wocn.2025.101402
Rachel Soo, Molly Babel
Sound change can present synchronic variation with categorical pronunciation variants. This is the case in Cantonese, where syllable-initial /n/ is merging with /l/, occasionally creating homophones (e.g., lou5 腦 “brain” / 老 “old”) and giving rise to [n]- and [l]-initial pronunciation variants that are allophones. This pronunciation variation offers insight into how variation is processed in spoken word recognition because [n] and [l] in Cantonese are not associated with an orthographic standard. Across four experiments, we examine the perception, recognition, and encoding of Cantonese [n] and [l], and use Bayesian analyses where gradient interpretations are more straightforward. We observe perceptual evidence that these allophones are distinguishable (Exp 2). In recognition (Exp 1) and encoding (Exp 3) paradigms, we find that the [n] and [l] allophones are processed neither equivalently nor distinctly when the targets bear the more common [l]-initial allophone. When the targets bear the [n]-initial allophone (Exp 4), we observe high error rates and somewhat contradictory results. Altogether, the results suggest that [n] and [l] are allophonic variants independently mapped to a phoneme, with connection strengths varying as a function of the frequency, such that the more common [l]-initial pronunciation demonstrates an overall recognition advantage.
Title: Processing pronunciation variation with independently mappable allophones. Journal of Phonetics, Volume 110, Article 101402.
Pub Date: 2025-03-01 DOI: 10.1016/j.wocn.2024.101388
Rosamund Oxbury, Matthew Hunt, Kathleen M. McCarthy
This study investigated Multicultural London English (MLE) diphthongs as produced by children and adolescents in the London borough of Ealing, UK. We conducted an acoustic analysis of the diphthongs FACE, PRICE and GOAT in the speech of 24 young people aged 16–24 years and 14 children aged 5–7 years. The results revealed different production patterns between the children and adolescents for some but not all the diphthong variables. We found that the children’s and adolescents’ diphthongs were similar in the quality of the onset, and similar to the MLE system described in East London, in the London borough of Hackney. However, the children had not acquired monophthongization of the diphthongs, with adolescents producing significantly more monophthongal tokens of PRICE, GOAT and, to a lesser extent, FACE. These findings have implications both for the study of multiethnolects and MLE, and for research on children’s acquisition of sociophonetic variation.
Title: The acquisition of Multicultural London English: Child and adolescent diphthong variation in West London. Journal of Phonetics, Volume 109, Article 101388.
Pub Date: 2025-03-01 DOI: 10.1016/j.wocn.2025.101401
Yevgeniy Vasilyevich Melguy, Keith Johnson
Speech produced with an unfamiliar accent may pose a challenge for listeners, resulting in delayed processing and/or decreased intelligibility. Such costs may be due to a mismatch between listeners’ experience with how a given sound category is phonetically realized, and how it is implemented by an unfamiliar speaker. Phonetic mismatches can increase processing time, but listeners could avoid them by adjusting their expectations for a given speaker or speech variety. This study investigates how changes in phonetic category structure may facilitate (or inhibit) processing of novel words produced with either the same or a phonetically similar accent, asking whether such adaptation is driven by a targeted shift or expansion of phonetic category boundaries. An artificial accent was created by morphing voiceless fricatives /θ/ and /s/ to create phonetically ambiguous [θ/s], which was presented in disambiguating /θ/ word frames (e.g., hypo[θ/s]etical). To examine the effect of phonetic learning on word processing, listeners were divided into three groups and asked to complete an exposure task where they heard either (1) accented critical /θ/ words, (2) natural (unaccented) /θ/ words, or (3) no /θ/ words. All listeners then completed a cross-modal priming task where, across two experiments, they were tested on their processing of words produced with the same artificial accent or three related accents differing in their phonetic match to the training accent. Overall, results show that while there was no effect of prior exposure on processing of novel words produced with the exposure accent, listeners with prior accent exposure showed a distinct pattern of facilitation and inhibition when processing words produced with the novel accents, compared to listeners with no prior accent exposure. Interestingly, listeners with prior exposure to unaccented /θ/ words tended to pattern with the accented /θ/ exposure group, rather than with controls. 
The roles of acoustic/perceptual similarity and prior experience are discussed, along with implications of these results for a category expansion mechanism of phonetic learning.
All data, stimuli, and code for this study are freely available on OSF via https://osf.io/xw5k3/.
Title: What are you sinking about? Experience with unfamiliar accent produces both inhibition and facilitation during lexical processing. Journal of Phonetics, Volume 109, Article 101401.
Pub Date: 2025-01-23 DOI: 10.1016/j.wocn.2025.101391
Lei Wang, Marco van de Ven, Carlos Gussenhoven
Kaifeng Mandarin has four tones (LH, HL, H, L), among which the citation pronunciation of L is an f0 fall (the ‘Falling variant’) for some speakers and a falling-rising f0 contour (the ‘Dipping variant’) for others. Seeking to comprehend the rationale behind this idiosyncratic variation, we decided to investigate the distinctiveness of each variant of L with each of the three other tones, LH, HL and H. Accordingly, we constructed six ten-step f0 continua, using two naturally spoken syllables [ma] spoken by a male and a female speaker as source files. In a two-alternative forced choice task, the Falling and Dipping variants turned out to be equally distinctive. Specifically, the results revealed distinct categorizations between the Dipping variant and HL as well as between the Falling variant and LH. However, when the Dipping variant needed to be distinguished from LH and the Falling variant from HL, recognition accuracy dropped significantly, favoring the complex tone. The two L-variants were equally discriminable from H. This overall functional similarity of the two variants goes some way towards understanding their coexistence within the same speech community. Because communicative intentions played no role in the experiment, it remains to be seen if the distribution across speakers will remain stable in production experiments that vary communicative duress, as created by the need to discriminate between the L-tone and each of the two complex tones.
Title: Dipping and Falling as competing strategies for maintaining the distinctiveness of the low tone in the four-tone system of Kaifeng Mandarin. Journal of Phonetics, Volume 109, Article 101391.
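The Kaifeng Mandarin abstract mentions ten-step f0 continua between tone pairs. A continuum of this kind could be generated as sketched below, assuming equal-sized linear interpolation steps in Hz between two endpoint contours. The abstract does not describe how the stimuli were actually resynthesized, and the function name and sample values are hypothetical.

```python
def make_continuum(contour_a, contour_b, steps=10):
    """Return `steps` f0 contours morphing linearly from contour_a to contour_b."""
    if len(contour_a) != len(contour_b):
        raise ValueError("contours must have the same number of samples")
    if steps < 2:
        raise ValueError("a continuum needs at least two steps")
    continuum = []
    for i in range(steps):
        weight = i / (steps - 1)  # 0.0 at the first step, 1.0 at the last
        continuum.append(
            [(1 - weight) * a + weight * b for a, b in zip(contour_a, contour_b)]
        )
    return continuum


# Hypothetical f0 samples in Hz at three time points of a syllable.
falling_l = [160.0, 130.0, 100.0]
rising_lh = [110.0, 140.0, 180.0]

steps = make_continuum(falling_l, rising_lh)
print(len(steps))
```

Pairing each of the two L variants with each of the three other tones gives the six continua the abstract refers to. Each would be one call to a function like this, with the two relevant endpoint contours.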
Pub Date: 2025-01-23 DOI: 10.1016/j.wocn.2025.101392
Shihao Du, Stephan R. Kuberski, Adamantios I. Gafos
We offer an intrinsic timing account of durations widely used to characterize inter-segmental coarticulation or coproduction patterns cross-linguistically. In this account, measured durations are the result of dynamical properties of the coarticulated segments. Our account is developed on the basis of timing data, registered using Electromagnetic articulography (EMA), from stop-lateral clusters in three languages. In C1C2 stop-lateral consonant clusters from these languages, we show that the extent of the consonants’ coproduction (‘overlap’) is controlled by a synergy between the dynamical parameters of C1 opening and C2 closing stiffness, the two movements most relevant in the C1-to-C2 transition. The specific form of the overlap-stiffness relation is one where extent of coproduction is a linear function of the (reciprocal of the) mean of the two stiffness parameters. This result establishes a link between lag measures widely used to characterize inter-segmental coarticulation and the dynamical properties of the gestures of the segments whose co-production is at issue.
Title: Towards a dynamical account of inter-segmental coordination. Journal of Phonetics, Volume 109, Article 101392.
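The overlap-stiffness relation stated in the abstract above (overlap as a linear function of the reciprocal of the mean of the two stiffness parameters) can be written out as a small function. This is a sketch rather than the authors' analysis code. The function name, the `intercept` and `slope` coefficients, and the numeric values are hypothetical, since the abstract reports only the functional form.

```python
def predicted_overlap(k_c1_opening, k_c2_closing, intercept, slope):
    """Overlap as a linear function of the reciprocal of the mean stiffness.

    intercept and slope are hypothetical coefficients; the abstract gives
    only the form of the relation, not fitted values.
    """
    if k_c1_opening <= 0 or k_c2_closing <= 0:
        raise ValueError("stiffness values must be positive")
    mean_stiffness = (k_c1_opening + k_c2_closing) / 2
    return intercept + slope * (1.0 / mean_stiffness)


# Illustrative values only: doubling the mean stiffness halves the reciprocal term.
low_stiffness = predicted_overlap(2.0, 2.0, intercept=1.0, slope=4.0)
high_stiffness = predicted_overlap(4.0, 4.0, intercept=1.0, slope=4.0)
print(low_stiffness, high_stiffness)
```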
Pub Date: 2025-01-01 DOI: 10.1016/j.wocn.2024.101371
Bowei Shao, Anne Hermes, Philipp Buech, Maria Giavazzi
Lexically prominent positions are phonologically privileged: they are often phonetically strengthened and they are loci of contrast preservation. Cross-linguistically, stress-conditioned alternations target stress-adjacent consonants independently of syllabic boundaries. We argue that the phonetic bases of these processes can be found in the articulatory modulations induced by stress. They are anchored in the stressed vowel but have spill-over effects on adjacent consonants. In this study, we investigate the articulation of velar consonants in a palatalizing context. By comparing two conditions, with or without stress modulations, we aim to investigate potential articulatory underpinnings of a stress-conditioned phonological process, i.e., velar palatalization in Italian plural nouns and adjectives, which is largely blocked in post-tonic position. Using articulatory data (EMA), we show that lexical stress induces temporal and spatial modulations on post-tonic velar consonants. Temporal modulations surface with a delayed target achievement of the consonants’ constriction gestures. Spatial modulations surface with a further back place of articulation in post-tonic velars. Both effects are due to the strengthening of the stressed vowel. We discuss the implications of our findings within the μ-gesture proposal of Articulatory Phonology for the distribution of palatalization in Italian.
Title: Articulatory consequences of lexical stress on post-tonic velar plosives in Italian. Journal of Phonetics, Volume 108, Article 101371.
Pub Date: 2025-01-01 DOI: 10.1016/j.wocn.2024.101387
Jahnavi Narkar
Most work using voice onset time (VOT) to characterize voicing contrasts has focused on languages with two- or three-way distinctions (Lisker and Abramson, 1964; Cho et al., 2019b). While this work acknowledges that VOT is not adequate to describe more complex voicing contrasts, there are few proposals addressing how such complex laryngeal contrasts can be characterized. In this post-facto addition to the special collection Marking 50 Years of Research on Voice Onset Time and the Voicing Contrast in the World’s Languages (Cho, Docherty, & Whalen, 2019a), I argue that VOT should be reconceptualized as a two-dimensional plane rather than a one-dimensional continuum. This simple reformulation of VOT, under which negative and positive VOT make up the complete VOT space, yields a more complete description of the voicing contrasts that exist in the world’s languages. Crucial evidence comes from Bengali which, like many other Indic languages, has a four-way contrast, utilizing both voicing and aspiration. If VOT is conceptualized as two-dimensional, the Bengali-type pattern is naturally predicted to exist. I argue that two-dimensional VOT best characterizes the acoustic properties of voicing contrasts and that the modest modification to our understanding of the VOT space proposed here can better explain the typology of stop laryngeal contrasts.
Title: Reconceptualizing VOT: Further contributions to marking 50 years of research on voice onset time. Journal of Phonetics, Volume 108, Article 101387.
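One way to read the two-dimensional proposal is that each stop token carries both a negative and a positive VOT value, so that a four-way contrast falls out of the four quadrants of the plane. The sketch below illustrates that reading only. The single threshold applied to both dimensions, the category labels, and the example measurements are hypothetical and are not taken from the paper.

```python
def classify_stop(negative_vot_ms, positive_vot_ms, threshold_ms=20.0):
    """Place a stop in a two-dimensional VOT space.

    negative_vot_ms: magnitude of the negative (lead) VOT component in ms.
    positive_vot_ms: magnitude of the positive (lag) VOT component in ms.
    threshold_ms: hypothetical cutoff applied to both dimensions.
    """
    voiced = negative_vot_ms >= threshold_ms
    aspirated = positive_vot_ms >= threshold_ms
    if voiced and aspirated:
        return "voiced aspirated"
    if voiced:
        return "voiced unaspirated"
    if aspirated:
        return "voiceless aspirated"
    return "voiceless unaspirated"


# Hypothetical tokens given as (negative VOT, positive VOT) pairs in ms.
tokens = [(0.0, 10.0), (0.0, 70.0), (80.0, 10.0), (80.0, 70.0)]
for lead, lag in tokens:
    print(lead, lag, classify_stop(lead, lag))
```

A one-dimensional continuum would have to place the last token at a single point, whereas the pair of values keeps its voicing lead and its lag distinct.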
Pub Date: 2025-01-01 DOI: 10.1016/j.wocn.2024.101370
James M. Stratton
The present study examines the effects of production training on speech perception in English-speaking L2 learners of German. Forty-five third-semester German language students at a North American university were divided into three learning conditions: explicit, implicit, or control. Learners in the explicit condition received six twenty-minute training sessions on German articulatory phonetics and phonology, targeting both consonants and vowels. Changes in L2 perception were measured by an AX discrimination task and a binary forced choice identification task. Results indicate that learners in the explicit group significantly outperformed learners in the implicit and control groups, improving auditory discrimination of novel contrasts by an average of 19 percent and perceptual categorization by 14 percent. The findings provide support for motor theory models of speech perception and show that improvements in L2 perception can be a positive concomitant of exclusively production-based training, highlighting that production and perception are inextricably linked.
Title: The effects of production training on speech perception in L2 learners of German. Journal of Phonetics, Volume 108, Article 101370.
Pub Date: 2025-01-01 DOI: 10.1016/j.wocn.2024.101377
Simona Sbranna, Aviad Albert, Martine Grice
Previous studies report that Italian learners of German transfer their L1 prosody to their L2 when marking information status prosodically within noun phrases (NPs). However, these studies were based on a categorical analysis of accentuation based on the presence or absence of pitch accents, which might not provide the full picture of interlanguages, in which category boundaries are flexible and dynamically evolving. We elicited two-word NPs in two different information status conditions – given-new (GN) and new-given (NG) – in L1 German, L1 Italian, and L2 German. We performed a periodic-energy-informed analysis to explore speakers’ continuous modulation of F0 and prosodic strength and additionally discuss the results for the interlanguage in categorical terms. Learners prosodically mark information status by modulating the F0 contour on the first word similarly to their L1. However, learners reduce the prosodic strength of the second word in the noun phrase across the board, i.e. irrespective of information status. This pattern resembles German deaccentuation, and indicates that the learners are using a salient pattern but are not associating it with the appropriate pragmatic function. The current study revealed patterns for L1 Italian learners of L2 German which did not emerge in previous categorical analyses of the intonation of Italian learners of German.
"Investigating interlanguages beyond categorical analyses: Prosodic marking of information status in Italian learners of German." Simona Sbranna, Aviad Albert, Martine Grice. Journal of Phonetics, vol. 108, Article 101377.