Speaker-specificity in speech production: The contribution of source and filter

Journal of Phonetics, Volume 97, Article 101224; IF: 2.4; Region 1 (Literature); JCR category: Language & Linguistics; Pub Date: 2023-03-01; Epub Date: 2023-02-03; DOI: 10.1016/j.wocn.2023.101224

Vincent Hughes, Amanda Cardoso, Paul Foulkes, Peter French, Amelia Gully, Philip Harrison


Citations: 0

Speaker-specificity in speech production: The contribution of source and filter

This study examines the extent to which speaker-specific information is encoded in different features of vocal output and the relationships between those features. A range of acoustic features, grouped as source (laryngeal voice quality measures and fundamental frequency) and filter features (formants and Mel-frequency cepstral coefficients; MFCCs), were extracted from the vocalic portion of the hesitation marker um for 90 male speakers of Standard Southern British English. Little overall correlation between the sets of features was observed, suggesting no strong interdependence between source and filter in our data. Although filter features were consistently better at discriminating between same- and different-speaker pairs compared with source features, combining source and filter has the potential of producing the lowest error rates and the strongest speaker discrimination scores. Taken together, results show that source and filter provide complementary speaker-specific information. However, the extent of the improvements in speaker discrimination performance when combining source and filter varied across speakers. We explore potential explanations for this finding and discuss the implications for source-filter theory, and for applied fields such as speaker recognition and forensic speech science.

Source journal: Journal of Phonetics

CiteScore: 3.50

Self-citation rate: 26.30%

Journal description: The Journal of Phonetics publishes papers of an experimental or theoretical nature that deal with phonetic aspects of language and linguistic communication processes. Papers dealing with technological and/or pathological topics, or papers of an interdisciplinary nature, are also suitable, provided that linguistic-phonetic principles underlie the work reported. Regular articles, review articles, and letters to the editor are published. Themed issues, devoted entirely to a specific subject of interest within the field of phonetics, are also published.