
Speech Prosody 2022: latest papers

The effects of syntactic and acoustic cues on the perception of prosodic boundaries
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-142
Jianjing Kuang, May Pik Yu Chan, Nari Rhee
This study investigates how the perception of prosodic boundaries is shaped by syntactic phrasing and acoustic cues for English and Mandarin listeners. Syntactically-parsed speech corpora were used as the stimuli for the perception experiment. The relative strength of the syntactic boundary of both the left and right sides of the constituents was extracted from the syntactic parsing annotations. A wide range of acoustic cues of both prosodic domain-final and domain-initial positions were examined. Linear-mixed-effects modeling of the likelihood of boundary perception suggests that, for both languages, prosodic boundary perception was influenced by both the strength of syntactic boundary and acoustic cues: boundary perception was heavily driven by the presence of pause; pause also modulated the contribution of other acoustic cues; and larger syntactic boundaries were generally more likely to be perceived as prosodic boundaries. However, there is also cross-linguistic variation: the effect of syntactic phrasing cues was generally stronger for English; acoustically, the effect of final lengthening and pitch reset was stronger in English, while pause was the dominant cue in Mandarin. We discuss the important implications of these findings related to the nature of prosodic hierarchy, and the nature of the prosody-syntax interface.
Citations: 3
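The modelling idea in the abstract above (boundary perception driven mainly by pause, with pause also modulating the contribution of other cues) can be illustrated with a small sketch. It is not the authors' model: it is a fixed-effects logistic function without random effects, and every variable name and coefficient below is a hypothetical value chosen only to show how an interaction term lets pause change the contribution of final lengthening.

```python
import math

# Hypothetical coefficients on the log-odds scale; illustrative only,
# not estimates reported in the paper.
COEF = {
    "intercept": -2.0,
    "pause": 3.0,
    "lengthening": 1.0,
    "syntax": 0.5,
    "pause_x_lengthening": -0.8,
}


def boundary_probability(pause, final_lengthening, syntactic_strength, coef=COEF):
    """Probability of a perceived boundary under a simple logistic model.

    pause is 0 or 1; the other two predictors are numeric cue strengths.
    """
    eta = (
        coef["intercept"]
        + coef["pause"] * pause
        + coef["lengthening"] * final_lengthening
        + coef["syntax"] * syntactic_strength
        + coef["pause_x_lengthening"] * pause * final_lengthening
    )
    return 1.0 / (1.0 + math.exp(-eta))


def lengthening_effect(pause, coef=COEF):
    """Change in log-odds per unit of final lengthening, given pause (0 or 1)."""
    return coef["lengthening"] + coef["pause_x_lengthening"] * pause


if __name__ == "__main__":
    for pause in (0, 1):
        print("pause =", pause, "lengthening effect =", lengthening_effect(pause))
```

With these made-up values the lengthening coefficient differs between the pause and no-pause conditions, which is what "pause modulated the contribution of other acoustic cues" means in model terms. The direction and size of any such interaction in the actual study are not stated in the abstract.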
Human self-domestication and the evolution of prosody
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-138
A. Benítez‐Burraco, Wendy Elvira-García
Human self-domestication refers to a new evolutionary hypothesis. According to this view, humans have experienced changes that are similar to those observed in domesticated mammals and that have provided us with many of the behavioural and perhaps cognitive pre-requisites for supporting our social practices and advanced culture. At the core of this hypothesis is the claim that self-domestication is triggered by a reduction in reactive aggression. Since the findings of increased complexity in the communicative signals of domesticated animals compared to their wild conspecifics, the human self-domestication hypothesis has been used to account for the sophistication of the grammars of human languages. Nonetheless, less research has been done in the domain of phonology. In this talk, we apply this evolutionary model to the evolution of human prosody, arguing for a progressive complexification of prosody that parallels (and is triggered by) the complexification of grammar, also in response to a reduction in reactive aggression levels. Two different types of evidence support our claim: the parallel complexification of prosody and grammar found in emerging sign languages and the parallel sophistication of prosody and grammar during language acquisition, which in turn parallels an increased control over the mechanisms involved in reactive aggression.
Citations: 2
Beware of the individual: Evaluating prominence perception in spontaneous speech
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-55
Anna Bruggeman, Leonie Schade, M. Wlodarczak, P. Wagner
Much of the existing research on prominence perception has focused on read speech in American English and German. The present paper presents two experiments that build on and extend insights from these studies in two ways. Firstly, we elicit prominence judgments on spontaneous speech. Secondly, we investigate gradient rather than binary prominence judgments by introducing a finger tapping task. We then provide a within-participant comparison of gradient prominence results with binary prominence judgments to evaluate their correspondence. Our results show that participants exhibit different success rates in tapping the prominence pattern of spontaneous data, but generally tapping results correlate well with binary prominence judgments within individuals. Random forest analyses of the acoustic parameters involved show that pitch accentuation and duration play important roles in both binary judgments and prominence tapping patterns. We can also confirm earlier findings from read speech that differences exist between participants in the relative importance rankings of various signal and systematic properties.
Citations: 0
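The within-participant comparison of gradient tapping results with binary prominence judgments, described in the abstract above, can be sketched as a correlation between a binary and a continuous variable. The abstract does not say which correlation measure was used; the sketch below assumes a Pearson correlation (equivalent to a point-biserial correlation when one variable is binary), and the example data are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical data for one participant: binary judgments per word
    # (1 = prominent) and a gradient tapping measure for the same words.
    binary_judgments = [0, 0, 1, 1]
    tapping_strength = [1.0, 2.0, 3.0, 4.0]
    print(round(pearson(binary_judgments, tapping_strength), 3))
```

Computing the coefficient separately for each participant, rather than pooling everyone, matches the abstract's emphasis on individual differences.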
The Influence of Tone on the Alignment of Speech and Co-Speech Gesture
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-63
Kathryn Franich, Hermann Keupdjio
Evidence continues to accrue suggesting that co-speech gestures form an integrated part of the prosodic system of languages. Several studies have highlighted a tight link between the timing of gestures of the hands and head with syllables bearing prosodic prominence. Most work to date has examined this relationship in Indo-European languages, where gestures appear to be crucially timed with respect to pitch-accented syllables. Less work has examined the timing of co-speech gestures in tonal languages, where pitch plays quite a different role within the phonological system. Here, we examine the influence of tone on the timing of manual co-speech gestures in Medmba, a Grassfields Bantu language spoken in Cameroon. We investigate 1) whether certain tones are more likely than others to associate with manual gestures in the language; and 2) whether the fine timing of the speech-gesture relationship is influenced by the tone or relative fundamental frequency (f0) of the syllable it co-occurs with. Our findings indicated no preference for any one tone to occur with co-speech gestures. However, gesture apexes were found to align significantly later with respect to the accompanying syllable’s vowel for low-toned syllables as compared with syllables of other tones.
Citations: 0
The Effects of Intonation on the Sentence-Final Particle nyei in Iu-Mien
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-32
E. Thurgood, Paul Olejarczuk
This study focuses on the interaction between tone and intonation on the prosodic realization of the sentence-final particle nyei³³ in Iu-Mien, a Hmong-Mien language spoken in parts of China and Southeast Asia. While intonation patterns of questions in colloquial Iu-Mien, in which sentence-final particle nyei³³ does not typically occur, have been described, intonation patterns with the sentence-final nyei³³ used in less colloquial settings have not been analyzed yet. Our study aims to fill this gap. Using data from five female speakers, we show that the mid-level tone 33 of nyei³³ is preserved when in the final position of statements, but surfaces as a rising or falling contour at the end of yes-no questions. In addition, we find coarticulatory effects of the preceding tone on the F0 contour of the particle.
Citations: 0
Expressing information status through prosody in the spontaneous speech of American English-speaking children
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-21
Jill C. Thorson, Jill M. Trumbell, Kimberly D. Nesbitt
Prosody is used to express information structure and status differences in American English. For this study, our motivation was to analyze these abilities during an ecologically valid interaction where we traded control for more natural spontaneous speech. We ask how children package information when playing with their parents during exhibit exploration in a children’s museum. Specifically, we employed a MAE_ToBI analysis to look at the production of new and given information status differences during these interactions. Parent-child dyads were recorded while playing in a museum exhibit at a children’s museum. Preliminary analyses were conducted on one 4-year-old, one 5-year-old, and one 6-year-old speaker. As predicted, we found a particular set of pitch accents to be commonly found as well as considerable variation in nuclear configuration patterns due to pragmatic effects. While pitch accent types largely stayed the same over the three ages analyzed to date, the H+!H* pitch accent was only found in the speech of the 4-year-old speaker. These data continue to add to the knowledge of how pitch accent selection relates to both information status and the pragmatics of the discourse.
Citations: 0
Short-Term Periodicity of Prosodic Phrasing: Corpus-based Evidence
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-141
S. Stehwien, Lars Meyer
Speech is perceived as a sequence of meaningful units of various lengths, from phones to phrases. Prosody is one of the means by which these are segmented: Prosodic boundaries sub-divide utterances into prosodic phrases. In this corpus study, we study prosodic boundaries from a neurolinguistic perspective. To be perceived correctly, prosodic phrases must obey neurobiological constraints. In particular, electrophysiological processing has been argued to operate periodically, with one electrophysiological processing cycle being devoted to the processing of exactly one prosodic phrase. We thus hypothesized that prosodic phrases as such should show periodicity. We assess the DIRNDL corpus of German radio news, which has been annotated for intonational and intermediate phrases. We find that sequences of 2–5 intermediate phrases are periodic at 0.8–1.6 Hertz within their superordinate intonation phrase. Across utterances, the duration of intermediate phrases alternates with the duration of superordinate intonation phrases, indicating a dependence of prosodic time scales. While the determinants of periodicity are unknown, the results are compatible with an association between periodic electrophysiological processing mechanisms and the rhythm of prosody. This contributes to closing the gap between the neurobiology of language and linguistic description.
Citations: 1
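To make the 0.8–1.6 Hertz figure in the abstract above more concrete, the rate can be converted into a period, since period = 1 / frequency. The sketch below performs only that unit conversion and computes a mean onset rate from a list of phrase onset times; it does not reproduce the authors' periodicity analysis, and the function names and onset times are hypothetical.

```python
def rate_to_period(rate_hz):
    """Return the period in seconds for a repetition rate in Hertz."""
    if rate_hz <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / rate_hz


def mean_phrase_rate(onsets_s):
    """Mean rate (Hz) of phrase onsets, given ascending onset times in seconds."""
    if len(onsets_s) < 2:
        raise ValueError("need at least two onsets")
    intervals = [later - earlier for earlier, later in zip(onsets_s, onsets_s[1:])]
    return len(intervals) / sum(intervals)


if __name__ == "__main__":
    # 0.8 Hz and 1.6 Hz correspond to periods of 1.25 s and 0.625 s.
    print(rate_to_period(0.8), rate_to_period(1.6))
    # Hypothetical onsets of four intermediate phrases, one second apart.
    print(mean_phrase_rate([0.0, 1.0, 2.0, 3.0]))
```

Read this way, a periodicity of 0.8–1.6 Hertz means one intermediate phrase roughly every 0.6 to 1.25 seconds.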
High Rising Terminals in Dublin: forms, functions and gender
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-37
Julia Bongiorno, Sophie Herment
High Rising Terminals, Uptalk, or Upspeak, are stylistic rises that can be found at the end of declarative statements. They have been studied in numerous varieties of English and in other languages too. It has been shown that these rises can take on different phonetic and phonological forms and convey various pragmatic functions depending on the varieties in which they are found. The present study provides a description of these forms and functions in Dublin (Republic of Ireland). Based on a corpus of 5 speakers from the PAC-Dublin corpus that was recorded in the Irish capital in 2018, the study shows that HRTs are mainly realized with late rises and nuclear rises and that they are different from interrogative and continuative rises, notably because they are steeper than the latter. A sociolinguistic analysis of our corpus also shows that the gender of the speakers has an influence on the occurrence of the phenomenon, which does not seem to be the case for age range. This article thus provides a multidimensional analysis of stylistic rising tones in statements in Dublin.
Citations: 1
Prosodic characteristics of canonical and non-canonical questions in Estonian
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-28
Heete Sahkai, Eva Liina Asu, P. Lippus
This paper presents a comparison of the prosodic characteristics of canonical questions with two types of non-canonical interrogative utterances in Estonian. The data consisted of string-identical interrogative sentences with the question-word kuidas (‘how’) elicited in three readings: information-seeking question (ISQ), rhetorical question (RQ) and surprise question (SQ). A three-way distinction between the three utterance types emerged. First, there was a binary distinction between canonical and non-canonical questions in mean pitch, utterance duration and voice quality: non-canonical questions were characterised by lower mean pitch, longer duration and a larger proportion of non-modal (creaky) voice quality. Second, there was a three-way distinction in pitch range: ISQs had the narrowest and SQs the widest pitch range while RQs were in-between the two. Third, SQs were further distinguished from ISQs and RQs by a different placement of focal accent and the accentuation of pronouns. There were, however, no differences in intonational pitch accent types and boundary tones between the three utterance types. The results imply that the lower mean pitch signals the indirect illocutionary force of the non-canonical questions while the longer duration, non-modal voice quality and larger pitch range indicate their affective nature. SQs are additionally associated with a specific information structure.
{"title":"Prosodic characteristics of canonical and non-canonical questions in Estonian","authors":"Heete Sahkai, Eva Liina Asu, P. Lippus","doi":"10.21437/speechprosody.2022-28","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-28","url":null,"abstract":"This paper presents a comparison of the prosodic characteristics of canonical questions with two types of non-canonical interrogative utterances in Estonian. The data consisted of string-identical interrogative sentences with the question-word kuidas (‘how’) elicited in three readings: information-seeking question (ISQ), rhetorical question (RQ) and surprise question (SQ). A three-way distinction between the three utterance types emerged. First, there was a binary distinction between canonical and non-canonical questions in mean pitch, utterance duration and voice quality: non-canonical questions were characterised by lower mean pitch, longer duration and a larger proportion of non-modal (creaky) voice quality. Second, there was a three-way distinction in pitch range: ISQs had the narrowest and SQs the widest pitch range while RQs were in-between the two. Third, SQs were further distinguished from ISQs and RQs by a different placement of focal accent and the accentuation of pronouns. There were, however, no differences in intonational pitch accent types and boundary tones between the three utterance types.Theresults imply that the lower mean pitch signals the indirect illocutionary force of the non-canonical questions while the longer duration, non-modal voice quality and larger pitch range indicate their affective nature. 
SQs are additionally associated with a specific information structure.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127082948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Syllable duration as a proxy to latent prosodic features
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-45
Christina Tånnander, D. House, Jens Edlund
Recent advances in deep learning have pushed text-to-speech synthesis (TTS) very close to human speech. In deep learning, latent features refer to features that are hidden from us; notwithstanding, we may meaningfully observe their effects. Analogously, latent prosodic features refer to the exact features that constitute e.g. prominence that are unknown to us, although we know (some of) the functions of prominence and (some of) its acoustic correlates. Deep-learned speech models capture prosody well but leave us with little control and few insights. Previously, we explored average syllable duration at the word level, a simple and accessible metric, as a proxy for prominence: in Swedish TTS, where verb particles and numerals tend to receive too little prominence, these were nudged towards lengthening while allowing the TTS models to otherwise operate freely. Listener panels overwhelmingly preferred the nudged versions to the unmodified TTS. In this paper, we analyze utterances from the modified TTS. The analysis shows that duration-nudging of relevant words changes the following features in an observable manner: duration is predictably lengthened, word-initial glottalization occurs, and the general intonation pattern changes. This supports the view of latent prosodic features that can be reflected in deep-learned models and accessed by proxy.
Citations: 0