Multi-objective non-intrusive hearing-aid speech assessment model.

IF 2.1 2区 物理与天体物理 Q2 ACOUSTICS Journal of the Acoustical Society of America Pub Date : 2024-11-01 DOI:10.1121/10.0034362
Hsin-Tien Chiang, Szu-Wei Fu, Hsin-Min Wang, Yu Tsao, John H L Hansen
{"title":"Multi-objective non-intrusive hearing-aid speech assessment model.","authors":"Hsin-Tien Chiang, Szu-Wei Fu, Hsin-Min Wang, Yu Tsao, John H L Hansen","doi":"10.1121/10.0034362","DOIUrl":null,"url":null,"abstract":"<p><p>Because a reference signal is often unavailable in real-world scenarios, reference-free speech quality and intelligibility assessment models are important for many speech processing applications. Despite a great number of deep-learning models that have been applied to build non-intrusive speech assessment approaches and achieve promising performance, studies focusing on the hearing impaired (HI) subjects are limited. This paper presents HASA-Net+, a multi-objective non-intrusive hearing-aid speech assessment model, building upon our previous work, HASA-Net. HASA-Net+ improves HASA-Net in several ways: (1) inclusivity for both normal-hearing and HI listeners, (2) integration with pre-trained speech foundation models and fine-tuning techniques, (3) expansion of predictive capabilities to cover speech quality and intelligibility in diverse conditions, including noisy, denoised, reverberant, dereverberated, and vocoded speech, thereby evaluating its robustness, and (4) validation of the generalization capability using an out-of-domain dataset.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"156 5","pages":"3574-3587"},"PeriodicalIF":2.1000,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the Acoustical Society of America","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1121/10.0034362","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ACOUSTICS","Score":null,"Total":0}
引用次数: 0

Abstract

Because a reference signal is often unavailable in real-world scenarios, reference-free speech quality and intelligibility assessment models are important for many speech processing applications. Despite a great number of deep-learning models that have been applied to build non-intrusive speech assessment approaches and achieve promising performance, studies focusing on the hearing impaired (HI) subjects are limited. This paper presents HASA-Net+, a multi-objective non-intrusive hearing-aid speech assessment model, building upon our previous work, HASA-Net. HASA-Net+ improves HASA-Net in several ways: (1) inclusivity for both normal-hearing and HI listeners, (2) integration with pre-trained speech foundation models and fine-tuning techniques, (3) expansion of predictive capabilities to cover speech quality and intelligibility in diverse conditions, including noisy, denoised, reverberant, dereverberated, and vocoded speech, thereby evaluating its robustness, and (4) validation of the generalization capability using an out-of-domain dataset.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
多目标非侵入式助听器语音评估模型。
由于在现实世界中往往无法获得参考信号,因此无参考语音质量和可懂度评估模型对于许多语音处理应用来说都非常重要。尽管大量深度学习模型已被用于构建非侵入式语音评估方法并取得了可喜的性能,但针对听障(HI)受试者的研究却十分有限。本文介绍了一种多目标非侵入式助听器语音评估模型 HASA-Net+,它建立在我们之前的工作 HASA-Net 基础之上。HASA-Net+ 在以下几个方面改进了 HASA-Net:(1) 兼顾正常听力和听力障碍听者;(2) 与预训练的语音基础模型和微调技术相结合;(3) 扩展预测能力,以涵盖不同条件下的语音质量和可懂度,包括噪声、去噪、混响、去混响和声码语音,从而评估其鲁棒性;(4) 使用域外数据集验证泛化能力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
CiteScore
4.60
自引率
16.70%
发文量
1433
审稿时长
4.7 months
期刊介绍: Since 1929 The Journal of the Acoustical Society of America has been the leading source of theoretical and experimental research results in the broad interdisciplinary study of sound. Subject coverage includes: linear and nonlinear acoustics; aeroacoustics, underwater sound and acoustical oceanography; ultrasonics and quantum acoustics; architectural and structural acoustics and vibration; speech, music and noise; psychology and physiology of hearing; engineering acoustics, transduction; bioacoustics, animal bioacoustics.
期刊最新文献
Ducting of wave-breaking sound by the sea surface bubble layer. Soundscape perception indices (SPIs): Developing context-dependent single value scores of multidimensional soundscape perceptual qualitya). The influence of dialect loss on tone perception: Diminishing voice quality cues in preserved tone contrast. Transcranial ultrasound modeling using the spectral-element method. Noise assessment of multirotor configurations during landing proceduresa).
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1