Pub Date: 2024-09-01. Epub Date: 2024-09-05. DOI: 10.1016/j.wocn.2024.101354
Rasmus Puggaard-Rode
This paper provides evidence for the assumption that the precise phonetic implementation of laryngeal contrast in obstruents can have an influence on higher order linguistic structure. Traditional varieties of Jutland Danish – which are all broadly ‘aspirating’ varieties – are used as a case study. The paper shows that the precise implementation of the aspirated–unaspirated contrast in stops varied systematically in these varieties, and that this covaries with the morphophonological process of stop gradation. Stop gradation is a lenition process which is historically found in the entire Danish-speaking area, but with quite varying outcomes, which were mapped extensively by dialectologists more than a century ago. Using a large legacy corpus of sociolinguistic interviews from the 1970s, this study shows that more sonorous outcomes of stop gradation covary with higher rates of continuous closure voicing in /b d g/ and shorter aspiration in /p t k/, and vice versa for less sonorous outcomes of stop gradation.
Title: Variation in fine phonetic detail can modulate the outcome of sound change: The case of stop gradation and laryngeal contrast implementation in Jutland Danish
Journal of Phonetics, Volume 106, Article 101354 (open access).
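The two phonetic measures the study relies on — the proportion of voiced frames during stop closure and aspiration duration (positive VOT) — can be sketched as follows. This is a minimal illustration, not the paper's pipeline; the frame-level voicing flags and segment times are hypothetical inputs that would normally come from a pitch tracker and a forced aligner.

```python
from dataclasses import dataclass

@dataclass
class StopToken:
    closure_voiced: list        # per-frame voicing flags during the closure
    burst_t: float              # time of the stop release burst (s)
    voicing_onset_t: float      # onset of periodicity in the following vowel (s)

def closure_voicing_rate(tok):
    """Proportion of closure frames that are voiced."""
    if not tok.closure_voiced:
        return 0.0
    return sum(tok.closure_voiced) / len(tok.closure_voiced)

def continuously_voiced(tok):
    """True if voicing persists through the entire closure."""
    return bool(tok.closure_voiced) and all(tok.closure_voiced)

def aspiration_duration(tok):
    """Positive VOT: lag from the burst to voicing onset (s)."""
    return tok.voicing_onset_t - tok.burst_t

# hypothetical tokens: a /b/ with fully voiced closure, a /t/ with 60 ms aspiration
b_tok = StopToken([True] * 8, 0.0, 0.005)
t_tok = StopToken([False] * 8, 0.0, 0.060)
```

Aggregating `continuously_voiced` over many tokens per dialect would give the "rates of continuous closure voicing" the abstract refers to.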
Pub Date: 2024-07-01. Epub Date: 2024-06-20. DOI: 10.1016/j.wocn.2024.101338
Anqi Xu , Daniel R. van Niekerk , Branislav Gerazov , Paul Konstantin Krug , Peter Birkholz , Santitham Prom-on , Lorna F. Halliday , Yi Xu
It has long been a mystery how children learn to speak without formal instruction. Previous research has used computational modelling to help solve the mystery by simulating vocal learning with direct imitation or caregiver feedback, but has encountered difficulty in overcoming the speaker normalisation problem, namely, discrepancies between children’s vocalisations and those of adults due to age-related anatomical differences. Here we show that vocal learning can be successfully simulated via recognition-guided vocal exploration without explicit speaker normalisation. We trained an articulatory synthesiser with three-dimensional vocal tract models of an adult and two child configurations of different ages to learn monosyllabic English words consisting of CVC syllables, based on coarticulatory dynamics and two kinds of auditory feedback: (i) acoustic features to simulate universal phonetic perception (or direct imitation), and (ii) a deep-learning-based speech recogniser to simulate native-language phonological perception. Native listeners were invited to evaluate the learned synthetic speech with natural speech as baseline reference. Results show that the English words trained with the speech recogniser were more intelligible than those trained with acoustic features, sometimes close to natural speech. The successful simulation of vocal learning in this study suggests that a combination of coarticulatory dynamics and native-language phonological perception may also be critical for real-life vocal production learning.
Title: Artificial vocal learning guided by speech recognition: What it may tell us about how children learn to speak
Journal of Phonetics, Volume 105, Article 101338 (open access).
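The recognition-guided vocal exploration described above can be caricatured as an accept-if-better search that perturbs articulatory parameters and keeps a change only when the recogniser scores the result higher. Everything here is a stand-in: `recognizer_score` replaces the deep-learning recogniser with a toy objective, and the three parameters stand in for the articulatory synthesiser's controls.

```python
import random

def recognizer_score(params):
    # Stand-in for the recogniser's confidence in the target word; a toy
    # objective peaking at an arbitrary "correct" articulation (hypothetical).
    target = [0.3, 0.7, 0.5]
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def explore(n_iters=2000, step=0.05, seed=0):
    """Accept-if-better exploration: perturb the articulatory parameters and
    keep a perturbation only when the recogniser scores it higher."""
    rng = random.Random(seed)
    params = [rng.random() for _ in range(3)]   # random initial articulation
    best = recognizer_score(params)
    for _ in range(n_iters):
        cand = [p + rng.gauss(0, step) for p in params]
        score = recognizer_score(cand)
        if score > best:
            params, best = cand, score
    return params, best

params, best = explore()
```

Because the feedback is a recognition score rather than an acoustic distance to an adult token, nothing in this loop requires the learner's output to acoustically match the adult's — which is the point of bypassing speaker normalisation.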
Pub Date: 2024-07-01. Epub Date: 2024-06-10. DOI: 10.1016/j.wocn.2024.101339
Wei-Rong Chen , Michael C. Stern , D.H. Whalen , Donald Derrick , Christopher Carignan , Catherine T. Best , Mark Tiede
Ultrasound imaging of the tongue is biased by probe movements relative to the speaker’s head. Two common remedies are restricting or algorithmically compensating for such movements, each with its own challenges. We describe these challenges in detail and evaluate an open-source, adjustable probe stabilizer for ultrasound (ALPHUS), specifically designed to address these challenges by restricting uncorrectable probe movements while allowing for correctable ones (e.g., jaw opening) to facilitate naturalness. The stabilizer is highly modular and adaptable to different users (e.g., adults and children) and different research/clinical needs (e.g., imaging in both midsagittal and coronal orientations). The results of three experiments show that probe movement over uncorrectable degrees of freedom was negligible, while movement over correctable degrees of freedom that could be compensated through post-processing alignment was relatively large, indicating unconstrained articulation over parameters relevant for natural speech. Results also showed that probe movements as small as 5 mm or 2 degrees can neutralize phonemic contrasts in ultrasound tongue positions. This demonstrates that while stabilized but uncorrected ultrasound imaging can provide reliable tongue shape information (e.g., curvature or complexity), accurate tongue position (e.g., height or backness) with respect to vocal tract hard structure needs correction for probe displacement relative to the head.
Title: Assessing ultrasound probe stabilization for quantifying speech production contrasts using the Adjustable Laboratory Probe Holder for UltraSound (ALPHUS)
Journal of Phonetics, Volume 105, Article 101339.
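The distinction between correctable and uncorrectable probe movement rests on the fact that a known rigid displacement (rotation plus translation in the midsagittal plane) can be undone exactly in post-processing. A minimal sketch, assuming the per-frame probe pose is known from tracking:

```python
import numpy as np

def displace(points, angle, shift):
    """Apply a rigid probe displacement (rotation by `angle` rad, then `shift`)
    to Nx2 midsagittal contour points."""
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, -s], [s, c]])
    return np.asarray(points) @ R.T + shift

def correct(points, angle, shift):
    """Undo a known rigid displacement: x = R^T (x' - t)."""
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, -s], [s, c]])
    return (np.asarray(points) - shift) @ R

# a toy tongue contour (mm), displaced by 2 degrees and a 5 mm shift --
# the order of magnitude the paper reports as enough to blur contrasts
contour = np.array([[10.0, 20.0], [15.0, 24.0], [20.0, 26.0]])
moved = displace(contour, np.deg2rad(2.0), [5.0, 0.0])
restored = correct(moved, np.deg2rad(2.0), [5.0, 0.0])
```

Uncorrectable movements are precisely those for which the pose parameters passed to `correct` cannot be recovered.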
Pub Date: 2024-07-01. Epub Date: 2024-05-15. DOI: 10.1016/j.wocn.2024.101330
Chunyu Ge, Peggy Mok
Suzhou Wu Chinese has undergone a transphonologization of a voicing contrast in initial consonants to a tone contrast. In consequence, the tone system has split into two registers, in which the high register tones are higher in pitch and modal voiced, whilst the low register tones are lower in pitch and breathy voiced. Our previous studies have found that breathy voice in the low register tones is disappearing in younger speakers’ production. This finding motivated us to investigate the effect of breathy voice on tone identification across age groups. Participants from three age groups completed a tone identification experiment. Stimuli were constructed based on natural tokens produced by a middle-aged female speaker and an older female speaker. The manipulation of phonation was accomplished by using the base syllables of both high and low register tones, for both unchecked (T1 vs. T2) and checked (T7 vs. T8) tone pairs. The results showed that breathy voice is still used by younger listeners in their perception and its effect on their tone identification is similar to that for older and middle-aged listeners. Moreover, the effect of breathy voice is modulated by social indexical factors (i.e., talker voice). The implications of the results for the origin of the loss of breathy voice in Suzhou Wu and the mechanism of sound change are discussed.
Title: The effect of breathy voice on tone identification by listeners of different ages in Suzhou Wu Chinese
Journal of Phonetics, Volume 105, Article 101330.
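Breathy voice is commonly indexed by spectral-tilt measures such as H1–H2, the amplitude of the first harmonic relative to the second (larger values indicate breathier phonation). The abstract does not specify the paper's acoustic measure, so the following is a generic sketch on a synthetic two-harmonic signal:

```python
import numpy as np

def h1_h2_db(signal, fs, f0):
    """H1-H2 in dB: first-harmonic amplitude relative to the second,
    read off the magnitude spectrum at f0 and 2*f0."""
    spec = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    h1 = spec[np.argmin(np.abs(freqs - f0))]
    h2 = spec[np.argmin(np.abs(freqs - 2 * f0))]
    return 20 * np.log10(h1 / h2)

# synthetic voice source: H1 twice the amplitude of H2, so H1-H2 = +6 dB
fs, f0 = 8000, 100
t = np.arange(fs) / fs   # 1 s, an exact number of f0 cycles
x = 1.0 * np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)
```

On real speech the harmonic amplitudes would be corrected for formant influence, which this sketch omits.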
Pub Date: 2024-07-01. Epub Date: 2024-06-29. DOI: 10.1016/j.wocn.2024.101340
Jill C. Thorson, Rachel Steindel Burdin
This study explores downstepping in Mainstream US English using three experiments. Experiment 1 investigated if downstep was associated with accessible referents. Pairs of scenarios were constructed: one with new information and one with accessible. Two versions of the target utterances were recorded (one with high star, and one with downstepping) and presented in the accessible and new contexts. The high star contour was preferred overall, but less so in accessible contexts. A statistical model showed an effect of the phonetic implementation of the contour. Experiment 2 examined the phonetic realizations of the utterances in Experiment 1 using a categorical perception discrimination task. Participants showed linear perception within the downstep contours but a categorical difference between the high star and downstep contours. Experiment 3 explored the interpretations attached to downstepping. Listeners showed a categorical difference between high star and downstep contours for interpretation, hearing downstep as indicating something had happened before, and more resigned, disappointed, and less clear than high star contours. There was also variation within the downstep contours based on phonetic implementation of the contour. We show that downstep contours have distinct meanings from high star contours, and that these meanings may be mediated by their phonetic implementation.
Title: Phonetic implementation and the interpretation of downstepping in Mainstream US English
Journal of Phonetics, Volume 105, Article 101340.
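Categorical perception along a continuum, as probed in Experiment 2, is often quantified by fitting a logistic identification function whose slope indexes how sharp the category boundary is. A sketch with hypothetical identification proportions (not the paper's data):

```python
import numpy as np

def fit_logistic(steps, p_resp, lr=0.5, n_iter=5000):
    """Fit p = sigmoid(a + b*step) to identification proportions by gradient
    descent on cross-entropy; b is the boundary steepness, -a/b its location."""
    x = np.asarray(steps, float)
    y = np.asarray(p_resp, float)
    a = b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(a + b * x)))
        err = p - y
        a -= lr * err.mean()
        b -= lr * (err * x).mean()
    return a, b

# hypothetical 7-step continuum with a sharp mid-continuum boundary
steps = np.arange(-3, 4)
p_resp = [0.02, 0.05, 0.15, 0.50, 0.85, 0.95, 0.98]
a, b = fit_logistic(steps, p_resp)
```

A steep fitted slope between two contour types alongside flat within-category identification is the pattern described for the high star vs. downstep distinction.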
Pub Date: 2024-07-01. Epub Date: 2024-05-13. DOI: 10.1016/j.wocn.2024.101329
Conceição Cunha , Phil Hoole , Dirk Voit , Jens Frahm , Jonathan Harrington
The diachronic change by which coarticulatory nasalization increases in VN (vowel-nasal) sequences has been modelled as an earlier alignment of the velum combined with oral gesture weakening of N. The model was tested by comparing American (USE) and Standard Southern British English (BRE) based on the assumption that this diachronic change is more advanced in USE. Real-time MRI data was collected from 16 USE and 27 BRE adult speakers producing monosyllables with coda /Vn, Vnd, Vnz/. For USE, nasalization was greater in V, less in N, and there was greater tongue tip lenition than for BRE. The dialects showed a similar stability of the velum gesture and a trade-off between vowel nasalization and tongue tip lenition. Velum alignment was not earlier in USE. Instead, a closer approximation of the time of the tongue tip peak velocity towards the tongue tip maximum for USE caused a shift in the acoustic boundary within VN towards N, giving the illusion that the velum gesture has an earlier alignment in USE. It is suggested that coda reduction which targets the tongue tip more than the velum is a principal physiological mechanism responsible for the onset of diachronic vowel nasalization.
Title: The physiological basis of the phonologization of vowel nasalization: A real-time MRI analysis of American and Southern British English
Journal of Phonetics, Volume 105, Article 101329 (open access).
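The alignment measure at issue — how close the tongue-tip peak velocity falls to the tongue-tip maximum — can be computed from a position trajectory by numerical differentiation. A sketch on a synthetic raising gesture (illustrative, not the paper's MRI processing):

```python
import numpy as np

def peak_velocity_lag(pos, fs):
    """Seconds from the frame of peak (positive) velocity to the frame of
    maximum position -- a simple alignment measure for a raising gesture."""
    vel = np.gradient(pos) * fs          # central-difference velocity
    return np.argmax(pos) / fs - np.argmax(vel) / fs

# synthetic tongue-tip raising gesture sampled at 100 Hz: a sigmoid rise,
# so peak velocity falls at the midpoint of the movement (t = 0.5 s)
fs = 100
t = np.arange(0, 1, 1 / fs)
pos = 1.0 / (1.0 + np.exp(-20 * (t - 0.5)))
lag = peak_velocity_lag(pos, fs)
```

A smaller lag corresponds to the USE pattern described above, in which the peak velocity approaches the tongue-tip maximum.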
Pub Date: 2024-05-01. Epub Date: 2024-04-06. DOI: 10.1016/j.wocn.2024.101309
Khalil Iskarous , Jennifer Cole , Jeremy Steffman
The pitch accent system of Mainstream American English (MAE) is one of the most well-studied phenomena within the Autosegmental-Metrical (AM) approach to intonation. In this work we present an explicit model grounded in dynamical theory that predicts both qualitative phonological and quantitative phonetic generalizations about the MAE system. While the traditional AM account separates a phonological model of the structure of the accents from the F0 algorithm that interprets the phonological specification, we propose a unified dynamical model that encompasses both. The proposed model is introduced incrementally, one dynamical term at a time, to arrive at the minimal model needed to account for observed empirical generalizations, avoiding unnecessary complexity. The quantitative and qualitative properties of the MAE system that inform the dynamical model are based on an analysis of a large database of productions of the four most well-studied pitch accents of American English: three rising accents (H*, L+H*, L*+H) and a low-falling accent (L*). The dynamic model highlights the importance of velocity-based measures of F0, not typically invoked in intonational research, as key to understanding F0 differences among pitch accent categories. Although the focus of this work is on the MAE pitch accent system, suggestions are made for how the unified phonetic-phonological dynamical framework presented can be further developed to account for other pitch-based phenomena in a variety of languages.
Title: A minimal dynamical model of Intonation: Tone contrast, alignment, and scaling of American English pitch accents as emergent properties
Journal of Phonetics, Volume 104, Article 101309.
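The general flavour of such a dynamical account can be illustrated with the simplest case: a critically damped point attractor that drives F0 toward a tone target, with F0 velocity falling out of the dynamics rather than being stipulated. This is a generic sketch, not the paper's model; the stiffness `k` and the target values are arbitrary.

```python
import numpy as np

def tone_trajectory(f0_start, target, k, dur, fs=200):
    """Critically damped point attractor driving F0 to a tone target:
    f0'' = -k*(f0 - target) - 2*sqrt(k)*f0', integrated by forward Euler."""
    dt = 1.0 / fs
    f0, v = f0_start, 0.0
    out = []
    for _ in range(int(dur * fs)):
        acc = -k * (f0 - target) - 2.0 * np.sqrt(k) * v
        v += acc * dt
        f0 += v * dt
        out.append(f0)
    return np.array(out)

# a rise from 120 Hz toward a 180 Hz high-tone target; k sets the speed
traj = tone_trajectory(120.0, 180.0, k=400.0, dur=0.5)
```

Velocity-based measures of the kind the paper highlights can be read directly off `np.gradient(traj)`.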
Pub Date: 2024-05-01. Epub Date: 2024-05-02. DOI: 10.1016/j.wocn.2024.101323
Don Daniels , Zoë Haupt , Melissa M. Baese-Berk
We provide a phonetic examination of intrusive vowels in Sgi Bara [jil]. These vowels are inserted in predictable places, and their quality (either [i], [ɨ], or [u]) is also predictable, so they are not considered phonemic. We demonstrate that they differ from phonemic vowels in their duration, being shorter; and in their articulation, being more peripheral; but not in their intensity. We then demonstrate how this phonetic understanding of the difference between intrusive and phonemic vowels can be used to answer phonological questions about Sgi Bara. We offer two case studies: phonologically ambiguous sequences of high vowels, and frequent two-word combinations that may be univerbating. The results confirm the existence of a distinction between intrusive and phonemic vowels.
Title: The phonetics of vowel intrusion in Sgi Bara
Journal of Phonetics, Volume 104, Article 101323 (open access).
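A duration difference of the kind reported (intrusive vowels shorter than phonemic ones) is typically summarized with an effect size such as Cohen's d. A sketch with hypothetical durations (not the paper's data):

```python
import math

def cohens_d(xs, ys):
    """Standardized mean difference with pooled SD."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    vx = sum((v - mx) ** 2 for v in xs) / (nx - 1)
    vy = sum((v - my) ** 2 for v in ys) / (ny - 1)
    pooled = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    return (mx - my) / pooled

# hypothetical vowel durations in ms: phonemic vowels longer than intrusive
phonemic = [65, 70, 62, 68, 66, 72]
intrusive = [38, 42, 35, 40, 37, 41]
d = cohens_d(phonemic, intrusive)
```

The same comparison applied to an ambiguous token (e.g. a high-vowel sequence) would indicate which category its duration patterns with, which is the logic of the paper's two case studies.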
Pub Date: 2024-05-01. Epub Date: 2024-03-26. DOI: 10.1016/j.wocn.2024.101322
Seung-Eun Kim , Sam Tilsen
Previous studies have examined whether speakers initiate longer utterances with higher F0. Evidence for such effects is mixed and is mostly based on point estimates of F0 at the beginning of the utterance. Moreover, it is unknown whether utterance length can influence F0 control solely at utterance onset or also during the utterance. We conducted a sentence production task to investigate how control of pitch register – F0 ceiling, floor, and span – is influenced by utterance length. Specifically, we test whether speakers adjust register both in relation to an initially planned utterance length – proactive F0 control – and in response to changes in utterance length that occur after response onset – reactive F0 control. Target sentences in the experiment had one, two, or three subject noun phrases, which were cued with visual stimuli. An experimental manipulation was tested in which some visual stimuli were delayed until after participants initiated the utterance. Evidence for both proactive and reactive control of register was observed. Participants adopted a higher register ceiling and broader span in longer utterances. Furthermore, they decreased the amount of ceiling compression upon encountering delayed stimuli. The findings suggest the presence of a mechanism in which speakers continuously estimate the remaining length of the utterance and use that information to adjust pitch register.
Title: "Planning for the future and reacting to the present: Proactive and reactive F0 adjustments in speech" (Journal of Phonetics, vol. 104, Article 101322)
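The register measures named in the abstract above (F0 ceiling, floor, and span) can be illustrated with a short sketch. The percentile-based operationalization, function name, and example contour below are assumptions for illustration only, not the authors' actual analysis procedure.

```python
import numpy as np

def pitch_register(f0_hz):
    """Summarize an F0 contour as register ceiling, floor, and span.

    Percentile-based bounds are an illustrative choice (assumption);
    the study's exact operationalization may differ. Unvoiced frames
    should be passed as NaN and are ignored.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    voiced = f0[~np.isnan(f0)]            # drop unvoiced (NaN) frames
    ceiling = np.percentile(voiced, 95)   # upper register bound
    floor = np.percentile(voiced, 5)      # lower register bound
    return {"ceiling": ceiling, "floor": floor, "span": ceiling - floor}

# Example: a synthetic declining F0 contour with one unvoiced gap
contour = [220.0, 215.0, float("nan"), 200.0, 190.0, 185.0, 180.0]
reg = pitch_register(contour)
```

On this view, a "higher ceiling and broader span" in longer utterances would show up as larger `ceiling` and `span` values for their F0 tracks.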
Pub Date: 2024-05-01; Epub Date: 2024-03-16; DOI: 10.1016/j.wocn.2024.101311
Rose Stamp, Svetlana Dachkovsky, Hagit Hel-Or, David Cohn, Wendy Sandler
Phonetic reduction arises in the course of typical language production, when language users produce a less clearly articulated form of a word. An important factor that affects phonetic reduction is the predictability of the information conveyed: predictable information is reduced. This can be observed in everyday use of reference in spoken language. Specifically, first mention of a referent is more carefully articulated than subsequent mentions of the same referents, which are often phonetically reduced. Here we ask whether phonetic reduction for predictable information exists in a young sign language, and, in particular, how phonetic reduction is realized in visual languages that exploit various articulators of the body: the hands, the head, and the torso. The only natural languages that we can observe as they emerge in real time are young sign languages, and we focus on one of these in the current study: Israeli Sign Language (ISL). We use 3D motion-capture technology to measure phonetic reduction in signers of ISL by comparing the production of referring expressions synchronically, at different points during a narrative (e.g., Introduction, Reintroduction, Maintenance). Our findings show: (a) that phonetic reduction is present in a young sign language; and specifically (b) that the actions of different articulators involved in discourse are reduced, based on predictability. We consider the importance of these findings in understanding predictability in language more generally.
Title: "A kinematic study of phonetic reduction in a young sign language" (Journal of Phonetics, vol. 104, Article 101311)
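One simple way to quantify the kind of kinematic reduction the abstract above describes is to compare the 3D path length of an articulator's motion-capture trajectory at first versus later mentions. Path length as a proxy for sign "size", and the function names below, are illustrative assumptions; the study's actual kinematic measures may differ.

```python
import numpy as np

def path_length(traj):
    """Total 3D path length of a motion-capture trajectory.

    `traj` is an (n_frames, 3) array of x/y/z positions; the path
    length is the sum of Euclidean distances between successive frames.
    """
    traj = np.asarray(traj, dtype=float)
    return float(np.sum(np.linalg.norm(np.diff(traj, axis=0), axis=1)))

def reduction_ratio(first_mention, later_mention):
    """Proportional shrinkage of movement size at a later mention.

    Returns 0 for no reduction, approaching 1 as the later mention's
    movement shrinks toward zero (illustrative proxy, not the study's
    published measure).
    """
    return 1.0 - path_length(later_mention) / path_length(first_mention)

# Toy example: a hand path that is half as long on second mention
first = [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
later = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
ratio = reduction_ratio(first, later)
```

Under this proxy, predictability-driven reduction would appear as positive ratios at Reintroduction and Maintenance relative to Introduction.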