Vincent Hughes, Amanda Cardoso, Paul Foulkes, Peter French, Amelia Gully, Philip Harrison
Journal of Phonetics, published 2023-03-01. DOI: 10.1016/j.wocn.2023.101224. URL: https://www.sciencedirect.com/science/article/pii/S009544702300013X
Speaker-specificity in speech production: The contribution of source and filter
This study examines the extent to which speaker-specific information is encoded in different features of vocal output and the relationships between those features. A range of acoustic features, grouped as source (laryngeal voice quality measures and fundamental frequency) and filter features (formants and Mel-frequency cepstral coefficients; MFCCs), were extracted from the vocalic portion of the hesitation marker um for 90 male speakers of Standard Southern British English. Little overall correlation between the sets of features was observed, suggesting no strong interdependence between source and filter in our data. Although filter features were consistently better at discriminating between same- and different-speaker pairs compared with source features, combining source and filter has the potential of producing the lowest error rates and the strongest speaker discrimination scores. Taken together, results show that source and filter provide complementary speaker-specific information. However, the extent of the improvements in speaker discrimination performance when combining source and filter varied across speakers. We explore potential explanations for this finding and discuss the implications for source-filter theory, and for applied fields such as speaker recognition and forensic speech science.
About the journal:
The Journal of Phonetics publishes papers of an experimental or theoretical nature that deal with phonetic aspects of language and linguistic communication processes. Papers dealing with technological and/or pathological topics, or papers of an interdisciplinary nature are also suitable, provided that linguistic-phonetic principles underlie the work reported. Regular articles, review articles, and letters to the editor are published. Themed issues are also published, devoted entirely to a specific subject of interest within the field of phonetics.