An Anechoic, High-Fidelity, Multidirectional Speech Corpus.

Journal of Speech Language and Hearing Research | IF 2.2 | Zone 2 (Medicine) | JCR Q1 (Audiology & Speech-Language Pathology) | Pub Date: 2025-01-02 | Epub Date: 2024-12-02 | DOI: 10.1044/2024_JSLHR-24-00296
Margaret K Miller, Vahid Delaram, Allison Trine, Rohit M Ananthanarayana, Emily Buss, Brian B Monson, G Christopher Stecker
Pages: 411-418. Publication type: Journal Article.
Citations: 0

Abstract

An Anechoic, High-Fidelity, Multidirectional Speech Corpus.

Introduction: We currently lack speech testing materials faithful to broader aspects of real-world auditory scenes such as speech directivity and extended high frequency (EHF; > 8 kHz) content that have demonstrable effects on speech perception. Here, we describe the development of a multidirectional, high-fidelity speech corpus using multichannel anechoic recordings that can be used for future studies of speech perception in complex environments by diverse listeners.

Design: Fifteen male and 15 female talkers (21.3-60.5 years) recorded Bamford-Kowal-Bench (BKB) Standard Sentence Test lists, digits 0-10, and a 2.5-min unscripted narrative. Recordings were made in an anechoic chamber with 17 free-field condenser microphones spanning 0°-180° azimuth angle around the talker using a 48 kHz sampling rate.

Results: Recordings resulted in a large corpus containing four BKB lists, 10 digits, and narratives produced by 30 talkers, and an additional 17 BKB lists (21 total) produced by a subset of six talkers.

Conclusions: The goal of this study was to create an anechoic, high-fidelity, multidirectional speech corpus using standard speech materials. More naturalistic narratives, useful for the creation of babble noise and speech maskers, were also recorded. A large group of 30 talkers permits testers to select speech materials based on talker characteristics relevant to a specific task. The resulting speech corpus allows for more diverse and precise speech recognition testing, including testing effects of speech directivity and EHF content. Recordings are publicly available.
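
The recording parameters in the Design and Results paragraphs imply a few quantities that are easy to check by hand. The sketch below is illustrative only and is not taken from the paper: it computes the Nyquist frequency for the stated 48 kHz sampling rate, lists microphone azimuths under the assumption that the 17 microphones are evenly spaced across 0°-180° (the abstract does not state the spacing), and counts talker-by-list BKB recordings from the stated totals. All function and constant names are hypothetical.

```python
# Values stated in the abstract.
SAMPLE_RATE_HZ = 48_000      # sampling rate
N_MICS = 17                  # free-field condenser microphones
AZ_START_DEG = 0.0           # first azimuth
AZ_END_DEG = 180.0           # last azimuth
EHF_LOWER_EDGE_HZ = 8_000    # EHF is defined as > 8 kHz


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2


def uniform_azimuths(n_mics: int, start_deg: float, end_deg: float) -> list[float]:
    """Azimuths for n_mics microphones, ASSUMING even spacing (not stated in the abstract)."""
    step = (end_deg - start_deg) / (n_mics - 1)
    return [start_deg + i * step for i in range(n_mics)]


def bkb_talker_list_count() -> int:
    """Talker-by-list BKB recordings: 4 lists x 30 talkers plus 17 extra lists x 6 talkers."""
    return 4 * 30 + 17 * 6


if __name__ == "__main__":
    print(nyquist_hz(SAMPLE_RATE_HZ))                      # 24000.0
    print(nyquist_hz(SAMPLE_RATE_HZ) > EHF_LOWER_EDGE_HZ)  # True
    print(uniform_azimuths(N_MICS, AZ_START_DEG, AZ_END_DEG)[:3])  # [0.0, 11.25, 22.5]
    print(bkb_talker_list_count())                         # 222
```

A 48 kHz sampling rate gives a 24 kHz Nyquist limit, so the band above 8 kHz is representable in the recordings. The 11.25° step holds only if the spacing is in fact uniform.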

Source journal
Journal of Speech Language and Hearing Research
Category: Audiology & Speech-Language Pathology; Rehabilitation
CiteScore: 4.10
Self-citation rate: 19.20%
Articles published: 538
Review time: 4-8 weeks
Journal description:
Mission: JSLHR publishes peer-reviewed research and other scholarly articles on the normal and disordered processes in speech, language, hearing, and related areas such as cognition, oral-motor function, and swallowing. The journal is an international outlet for both basic research on communication processes and clinical research pertaining to screening, diagnosis, and management of communication disorders as well as the etiologies and characteristics of these disorders. JSLHR seeks to advance evidence-based practice by disseminating the results of new studies as well as providing a forum for critical reviews and meta-analyses of previously published work.
Scope: The broad field of communication sciences and disorders, including speech production and perception; anatomy and physiology of speech and voice; genetics, biomechanics, and other basic sciences pertaining to human communication; mastication and swallowing; speech disorders; voice disorders; development of speech, language, or hearing in children; normal language processes; language disorders; disorders of hearing and balance; psychoacoustics; and anatomy and physiology of hearing.
Latest articles from this journal
Linguistic Skills and Text Reading Comprehension in Prelingually Deaf Readers: A Systematic Review.
Perception Versus Comprehension of Bound Morphemes in Children Who Are Deaf and Hard of Hearing: The Pivotal Role of Form-Meaning Mapping.
Small-Dose Behavioral Treatment Effects: Learning Following 2 Hours of Computer-Based Conversational Script Training in Individuals With Poststroke Aphasia.
Speech Perception in Noise and Cognitive Skills in Children With Varying Degrees of Music Training.
Talker Differences in Perceived Emotion in Clear and Conversational Speech.