Speech Prosody 2022: Latest Papers

A Comparison of Rhythm Metrics for L2 Speech
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-68
Kakeru Yazawa, M. Kondo
A wide range of rhythm metrics (global metrics: %V, Δ, Varco, and segVarco; pairwise metrics: rPVI, nPVI, CCI, and D nCCI) was applied to L1 Japanese speakers’ L2 English speech data. Less proficient Japanese speakers of English are expected to show less durational variability for both vocalic and consonantal intervals (because of insufficient stress realization and transfer of CV syllable structure), although this pattern may be obscured by their slower speech rate (which increases interval durations in general). To test if the metrics can capture the L2 rhythmic characteristics, each metric was applied to read speech samples of “The North Wind and the Sun” by 183 Japanese speakers in the J-AESOP corpus. Only %V, VarcoV, and segVarcoV/C were successful; other metrics yielded inconsistent or implausible results likely due to insufficient rate normalization. The overall results indicate that global metrics can effectively quantify L2 rhythm if speech rate is normalized by the mean duration of segments (which is a good predictor of tempo) rather than the mean interval duration (which is popular but susceptible to syllable complexity).
Citations: 1
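As a rough illustration of the global metrics named in the entry above, the following sketch computes %V, Δ, and Varco from lists of interval durations in seconds. The function names and the use of the population standard deviation are assumptions made for illustration. The `seg_varco` helper is only one reading of the abstract's phrase "normalized by the mean duration of segments" and may not match the paper's own definition.

```python
from statistics import mean, pstdev


def percent_v(vocalic, consonantal):
    """%V: percentage of total duration taken up by vocalic intervals."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total


def delta(intervals):
    """Delta: standard deviation of interval durations (no rate normalization).

    Assumption: population standard deviation; some studies use the sample form.
    """
    return pstdev(intervals)


def varco(intervals):
    """Varco: Delta divided by the mean interval duration, times 100."""
    return 100.0 * pstdev(intervals) / mean(intervals)


def seg_varco(intervals, segments):
    """Hypothetical segVarco: Delta divided by the mean *segment* duration.

    This follows the abstract's wording only; treat it as an assumption.
    """
    return 100.0 * pstdev(intervals) / mean(segments)


if __name__ == "__main__":
    vocalic = [0.08, 0.12, 0.10, 0.20]      # made-up vocalic interval durations
    consonantal = [0.10, 0.15, 0.05]        # made-up consonantal interval durations
    slower = [2 * d for d in vocalic]       # same pattern at half the tempo

    print(percent_v(vocalic, consonantal))
    print(delta(vocalic), delta(slower))    # Delta scales with tempo
    print(varco(vocalic), varco(slower))    # Varco does not
```

Doubling every duration doubles Δ but leaves Varco unchanged, which is the kind of tempo sensitivity the abstract attributes to insufficiently normalized metrics.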
Syllable rate and speech rhythm in multiethnolectal Zurich German: a comparison of speaking styles
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-69
Marie-Anne Morand, M. Bruno, Sandra Schwab, Stephan Schmid
Multiethnolectal ways of speaking have been emerging for 30 years in culturally and linguistically diverse neighborhoods of European cities, including Zurich (Switzerland). Among the prosodic features of Germanic multiethnolects, a so-called ‘staccato’ rhythm has been mentioned in several studies. For instance, a comparison between two groups of adolescents (12 speakers each) showed that speakers of multiethnolectal Zurich German displayed slower syllable rates and less vowel duration variability than speakers of a rather traditional dialect. This study compares syllable rate and speech rhythm metrics (nPVI-V, nPVI-C) in spontaneous and read speech of 48 Zurich German adolescents. In a regression analysis, rhythmic measures were compared with the perception of how multiethnolectal the speakers sounded (rating score). The results showed that syllable rate and nPVI-V were related to rating score independently of speaking style (read, spontaneous speech): speakers who were perceived as more multiethnolectal had a slower syllable rate and less vowel duration variability. Such findings were not observed for nPVI-C. These results suggest that syllable rate and speech rhythm (at least, vowel duration variability) are stable phonetic features of multiethnolectal Zurich German, since the relationship between these features and the perception of multiethnolectal speech was observed in both read and spontaneous speech.
Citations: 0
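The nPVI measures used in the entry above compare each interval with its neighbor rather than with a global mean. The sketch below uses the commonly cited pairwise variability formulas; the function names and the sample durations are illustrative assumptions, not taken from the paper.

```python
def rpvi(durations):
    """Raw PVI: mean absolute difference between successive interval durations."""
    pairs = list(zip(durations, durations[1:]))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def npvi(durations):
    """Normalized PVI: each difference is divided by the mean of the pair,
    then the average is scaled by 100."""
    pairs = list(zip(durations, durations[1:]))
    return 100.0 * sum(abs(a - b) / ((a + b) / 2) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    even = [0.10, 0.10, 0.10, 0.10]        # made-up 'staccato' vowel durations
    alternating = [0.06, 0.18, 0.07, 0.20] # made-up long/short alternation

    print(npvi(even))         # no variability between neighbors
    print(npvi(alternating))  # larger value: more vowel duration variability
```

Applied to vocalic intervals this gives nPVI-V and applied to consonantal intervals nPVI-C; a lower nPVI-V corresponds to the reduced vowel duration variability reported for speakers rated as more multiethnolectal.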
Top-Down and Bottom-up Processing of Familiar and Unfamiliar Mandarin Dialect Tone Systems
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-171
Liang Zhao, Shayne Sloggett, Eleanor Chodroff
Speech processing involves active integration of bottom-up and top-down information types. In the present study, we investigated the relative weighting of top-down expectedness and bottom-up lexical tone in the perception of familiar and unfamiliar lexical tone systems. Standard Mandarin and Chengdu Mandarin are mutually intelligible language varieties with comparable segmental and highly distinct tonal realizations. In a spoken semantic-plausibility judgment task, we manipulated whether a word was high-surprisal or low-surprisal given the preceding context and dialect-specific tone. All participants were native Standard Mandarin speakers with minimal Chengdu Mandarin experience. Lower judgment accuracy was observed when the stimulus was Chengdu Mandarin, suggesting that expectedness (i.e., top-down) information overrides tonal (i.e., bottom-up) information in sentence plausibility judgments. However, judgment response times to sentence surprisal were uniform across stimuli from both dialects, suggesting that speakers are aware of the surprisal conveyed by a non-standard tone, even if it is not used in their final decision. These findings reveal listener sensitivity to both top-down expectedness and bottom-up tone regardless of the initial tone reliability. For unfamiliar tone systems, top-down influence overrides bottom-up processing to access utterance meaning, but bottom-up processing is indeed present and may reflect rapid learning of the unfamiliar tone system.
Citations: 1
Prosodic Correlates of Discourse Structure and Emotion in Discourse Markers that Preface Announcements of News
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-123
Emilie Marty, R. Bertrand, Caterina Petrone, J. German
Discourse markers serve important structuring functions such as concluding a contribution or resuming a topic. We address whether, along with their role in structuring discourse, discourse markers carry prosodic cues to the emotional valence of upcoming news, perhaps to prepare the listener’s emotional reaction. Specifically, we explored the realization of French voilà donc (yeah so) when occurring between an announcement of news and its preface: Je vous appelle au sujet de votre chat qui était malade [preface], voilà donc [discourse marker] il est désormais guéri [announcement] (“I’m calling about your sick cat, yeah so he’s now cured”). We recorded 15 speakers reading voicemail messages announcing negative, positive or neutral (e.g., factual) news. We found that the intonation patterns produced with voilà donc correspond to its discursive functions, in line with existing findings, though the choice of pattern did not depend on the emotional valence of the news. Valence was, however, associated with phonetic variation, in that high f0 targets were higher for positive and neutral valence and pitch range was larger for positive valence. This finding suggests that phonetic variation projects the emotional valence of upcoming news even though discourse function primarily determines the choice of intonation pattern.
Citations: 1
The effect of intonational rises on serial recall in German
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-154
Christine T. Röhr, Michelina Savino, M. Grice
This paper uses a serial recall task to investigate the role of rising intonation in the allocation of attentional resources in German. It has been shown for Italian that rising intonation at prosodic boundaries enhances recall of digits in auditorily presented lists. Since resources are usually allocated to prominent items, and since pitch accents are primary encoders of prominence in both languages, we investigate whether an accentual rise leads to better recall than a boundary rise. In a serial recall task on nine-digit sequences in German we compare the effect on working memory of sequences grouped by marking the last item of the two non-final triplets with (i) a high/rising accent followed by an equally high boundary, (ii) a low accent followed by a boundary rise, or (iii) a low/falling accent-boundary sequence, as compared to (iv) ungrouped sequences as controls. Results reveal that items with a rise are recalled more accurately than items without a rise, with no evidence for superior recall of items with accent rises over those with boundary rises. However, boundary rises appear to facilitate recall over a larger domain than accentual rises.
Citations: 0
Shape matters: Machine classification and listeners’ perceptual discrimination of American English intonational tunes
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-61
J. Cole, Jeremy Steffman, Sam Tilsen
In Autosegmental-Metrical models of intonational phonology, pitch accents, phrase accents and boundary tones may combine freely to create a predicted set of phonologically distinct phrase-final “nuclear” tunes. In this study we ask if an 8-way distinction in nuclear tune shape in American English, predicted from combinations of 2 (monotonal) pitch accents, 2 phrase accents and 2 boundary tones, is manifest in speech production and in speech perception. F0 trajectories from an imitative speech production experiment were analyzed using (i) neural net classification, and (ii) human listeners’ perceptual discrimination of the model utterances. Pairwise classification accuracy of the imitative productions is highest for tune pairs that differ in holistic shape (high-rising vs. rise-fall), and poorest for tunes with the same shape that differ in (higher vs. lower) final f0. Perception results show a similar pattern, with poor pairwise discrimination for tunes that differ primarily, but by a small degree, in final f0. Together the results suggest a hierarchy of distinctiveness among nuclear tunes, with a robust distinction based on holistic tune shape, which only partly aligns with distinctions in tonal specification, and a weak/poorly differentiated distinction between tunes with the same holistic shape but small differences in final f0.
Citations: 3
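The 8-way distinction mentioned in the entry above follows from free combination: 2 pitch accents × 2 phrase accents × 2 boundary tones = 8 nuclear tunes. The sketch below enumerates them. The specific labels (H*, L*, H-, L-, H%, L%) are an assumption based on common ToBI-style notation; the abstract itself gives only the counts.

```python
from itertools import product

# Assumed ToBI-style labels; the abstract states only "2 of each".
PITCH_ACCENTS = ("H*", "L*")
PHRASE_ACCENTS = ("H-", "L-")
BOUNDARY_TONES = ("H%", "L%")


def nuclear_tunes():
    """Return every pitch accent + phrase accent + boundary tone combination."""
    return [
        f"{accent} {phrase}{boundary}"
        for accent, phrase, boundary in product(
            PITCH_ACCENTS, PHRASE_ACCENTS, BOUNDARY_TONES
        )
    ]


if __name__ == "__main__":
    tunes = nuclear_tunes()
    print(len(tunes))
    for tune in tunes:
        print(tune)
```

The study's finding is that not all of these phonologically predicted combinations are equally distinct in production or perception, so the enumeration is an upper bound on contrasts rather than a list of attested categories.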
Social variability of peak alignment in Russian rise-fall tunes
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-175
Tatiana V. Kachkovskaia, Svetlana Zimina, Alena Portnova, D. Kocharov
In Russian, rise-fall tunes (H*L) are very typical in yes-no questions and non-utterance-final clauses. In standard descriptions of Russian intonation, the melodic maximum in this tune is located late in the stressed vowel. However, studies of modern Russian intonation, especially within the younger age group, report on cases of “displaced” melodic peaks, shifted significantly to the right, so that the F0 maximum occurs on the post-stressed syllable. In this paper we analyse the frequency of such misplaced peaks in Russian dialogue speech, with respect to the factors of gender, age and social distance between the interlocutors. The research is based on the SibLing speech corpus: 90 dialogues with varying relationship between the interlocutors.
Citations: 0
The coordination of boundary tones with constriction gestures in Seoul Korean, an edge-prominence language
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-30
Jiyoung Jang, A. Katsika
Boundary tones mark major phrase boundaries and are expected to be coordinated with speech gestures adjacent to boundaries. Research on Greek has indeed shown that the onset of the boundary tone (BT) gestures co-occurs with the gestural target of the phrase-final vowel. Interestingly, this coordination is modulated by lexical stress even in the absence of phrasal pitch accent. The present electromagnetic articulography study examines the coordination between BT and constriction gestures in Seoul Korean, a language with no lexical prosody and an edge-prominence system, and further investigates whether focus-related prominence affects this coordination. To this end, the distance of the prominent linguistic unit to the boundary is manipulated in a variety of ways. Results indicate that the onset of BT gestures in Korean is most proximate to the peak velocity of the phrase-final vowel gesture, but suggest that a c-center account is also viable. Prominence fine-tunes this coordination: BT gestures are initiated earlier in Intonational Phrases (IPs) with non-final focus as opposed to IPs with final focus. Importantly, this pattern is detected in short IP-final Accentual Phrases (APs), but not in relatively long IP-final APs. Based on these results, implications on the relationships between lexical and phrasal levels are discussed.
Citations: 0
Conversational Correlates of Prosodic Entrainment in Youth with and without Autism Spectrum Disorder
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-9
Heike Lehnert-LeHouillier, Steven Sandoval
Research on prosodic entrainment has shown correlations between the degree of prosodic entrainment and several dimensions of conversational success. Individuals with autism spectrum disorder (ASD) often encounter difficulties with a variety of skills necessary for conversational success, especially with the social dimensions of conversational behavior. The goal of the current study was to investigate whether children and teens with an autism diagnosis show similar correlations between prosodic entrainment in mean fundamental frequency (f0) on the one hand and conversational effectiveness, duration of conversations, and conversational turn-taking behavior on the other hand when compared to their neurotypical peers. We found significant interaction effects by group between mean f0 entrainment and all three conversational measures. However, we found no significant differences in group means in the three investigated conversational measures of conversational effectiveness, the number of conversational turns, and duration of conversations for speakers in each group. These results suggest that even though speakers with ASD may show surface conversational behaviors similar to their neurotypical peers, the prosodic manifestation of conversational speech clearly marks conversation partners with ASD as different from their age- and gender-matched peers.
Citations: 0
The influence of L1 prosody on Bulgarian-accented German and English
Pub Date : 2022-05-23 DOI: 10.21437/speechprosody.2022-155
B. Andreeva, S. Dimitrova
The present study investigates L2 prosodic realizations in the readings of two groups of Bulgarian informants: (a) with L2 German, and (b) with L2 English. Each of the two groups consisted of ten female learners, who read the fable “The North Wind and the Sun” in their L1 and in the respective L2. We also recorded two groups of female native speakers of the target languages as controls. The following durational parameters were obtained: mean accented syllable duration, accented vs. unaccented syllable duration ratio, and speaking rate. With respect to F0 parameters, the mean, median, minimum, maximum, span in semitones, and standard deviation per intonation phrase (IP) were measured. Additionally, we calculated the number of accented and unaccented syllables, IPs, and pauses in each reading. Statistical analyses show that the two groups differ in their use of F0. Both groups use a higher F0 standard deviation and level in their L2, and the ‘German group’ also uses a greater pitch span. The number of accented syllables, IPs, and pauses is also higher in L2. Regarding duration, both groups use a slower articulation rate. The ratio between accented and unaccented syllables is lower in L2 for the ‘English group’. We also provide original data on speaking rate in Bulgarian from an information-theoretic perspective.
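As a rough illustration of the F0 and durational parameters listed above, the sketch below computes per-IP summary statistics from a list of F0 samples and a simple accented vs. unaccented duration ratio. It is not the authors' procedure: the abstract does not name the extraction tools or units, so the function names (`f0_summary`, `accent_ratio`), the convention that a value of 0 marks an unvoiced frame, and the choice to report the standard deviation in Hz are all assumptions made for this example. The span uses the common semitone conversion 12 · log2(max / min).

```python
import math
import statistics


def f0_summary(f0_hz):
    """Summarise the F0 samples (in Hz) of one intonation phrase.

    Assumption: unvoiced frames are coded as 0 and are skipped.
    """
    voiced = [f for f in f0_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced F0 samples")
    f0_min, f0_max = min(voiced), max(voiced)
    return {
        "mean": statistics.fmean(voiced),
        "median": statistics.median(voiced),
        "min": f0_min,
        "max": f0_max,
        "sd": statistics.stdev(voiced),  # sample SD, in Hz
        # Span in semitones: 12 semitones per doubling of frequency.
        "span_st": 12 * math.log2(f0_max / f0_min),
    }


def accent_ratio(accented_durs, unaccented_durs):
    """Ratio of mean accented to mean unaccented syllable duration."""
    return statistics.fmean(accented_durs) / statistics.fmean(unaccented_durs)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(f0_summary([0, 180.0, 220.0, 260.0, 0, 200.0]))
    print(accent_ratio([0.24, 0.28], [0.12, 0.14]))
```

Expressing the span in semitones rather than Hz makes values comparable across speakers with different pitch registers, which is presumably why the study reports it that way.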
Citations: 3
Journal: Speech Prosody 2022