Shape matters: Machine classification and listeners’ perceptual discrimination of American English intonational tunes

Speech Prosody 2022 | Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-61

J. Cole, Jeremy Steffman, Sam Tilsen


Citations: 3

Abstract

In Autosegmental-Metrical models of intonational phonology, pitch accents, phrase accents and boundary tones may combine freely to create a predicted set of phonologically distinct phrase-final “nuclear” tunes. In this study we ask if an 8-way distinction in nuclear tune shape in American English, predicted from combinations of 2 (monotonal) pitch accents, 2 phrase accents and 2 boundary tones, is manifest in speech production and in speech perception. F0 trajectories from an imitative speech production experiment were analyzed using (i) neural net classification, and (ii) human listeners’ perceptual discrimination of the model utterances. Pairwise classification accuracy of the imitative productions is highest for tune pairs that differ in holistic shape (high-rising vs. rise-fall), and poorest for tunes with the same shape that differ in (higher vs. lower) final f0. Perception results show a similar pattern, with poor pairwise discrimination for tunes that differ primarily, but by a small degree, in final f0. Together the results suggest a hierarchy of distinctiveness among nuclear tunes, with a robust distinction based on holistic tune shape, which only partly aligns with distinctions in tonal specification, and a weak/poorly differentiated distinction between tunes with the same holistic shape but small differences in final f0.
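The 8-way tune inventory described in the abstract follows from freely combining two pitch accents, two phrase accents and two boundary tones. The sketch below enumerates that inventory and the tune pairs available for pairwise comparison. The ToBI-style labels (H*, L*, H-, L-, H%, L%) and the function names are assumptions made for illustration; the abstract gives only the counts, and it does not state which pairs were actually tested.

```python
from itertools import combinations, product

# Assumed ToBI-style labels; the abstract specifies only 2 x 2 x 2 categories.
PITCH_ACCENTS = ("H*", "L*")      # monotonal pitch accents
PHRASE_ACCENTS = ("H-", "L-")     # phrase accents
BOUNDARY_TONES = ("H%", "L%")     # boundary tones


def nuclear_tunes():
    """Return every pitch accent + phrase accent + boundary tone combination."""
    return [
        f"{accent}{phrase}{boundary}"
        for accent, phrase, boundary in product(
            PITCH_ACCENTS, PHRASE_ACCENTS, BOUNDARY_TONES
        )
    ]


def tune_pairs(tunes):
    """Return all unordered pairs of distinct tunes."""
    return list(combinations(tunes, 2))


if __name__ == "__main__":
    tunes = nuclear_tunes()
    pairs = tune_pairs(tunes)
    print(f"{len(tunes)} tunes, {len(pairs)} unordered pairs")
    for tune in tunes:
        print(tune)
```

Under these assumptions the free combination yields 8 tunes and 28 unordered pairs, which is the space over which a pairwise classifier or a discrimination task could in principle be evaluated.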



