Hsin-Tien Chiang, Szu-Wei Fu, Hsin-Min Wang, Yu Tsao, John H L Hansen
{"title":"Multi-objective non-intrusive hearing-aid speech assessment model.","authors":"Hsin-Tien Chiang, Szu-Wei Fu, Hsin-Min Wang, Yu Tsao, John H L Hansen","doi":"10.1121/10.0034362","DOIUrl":null,"url":null,"abstract":"<p><p>Because a reference signal is often unavailable in real-world scenarios, reference-free speech quality and intelligibility assessment models are important for many speech processing applications. Despite a great number of deep-learning models that have been applied to build non-intrusive speech assessment approaches and achieve promising performance, studies focusing on the hearing impaired (HI) subjects are limited. This paper presents HASA-Net+, a multi-objective non-intrusive hearing-aid speech assessment model, building upon our previous work, HASA-Net. HASA-Net+ improves HASA-Net in several ways: (1) inclusivity for both normal-hearing and HI listeners, (2) integration with pre-trained speech foundation models and fine-tuning techniques, (3) expansion of predictive capabilities to cover speech quality and intelligibility in diverse conditions, including noisy, denoised, reverberant, dereverberated, and vocoded speech, thereby evaluating its robustness, and (4) validation of the generalization capability using an out-of-domain dataset.</p>","PeriodicalId":17168,"journal":{"name":"Journal of the Acoustical Society of America","volume":"156 5","pages":"3574-3587"},"PeriodicalIF":2.1000,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the Acoustical Society of America","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1121/10.0034362","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ACOUSTICS","Score":null,"Total":0}
引用次数: 0
Abstract
Because a reference signal is often unavailable in real-world scenarios, reference-free speech quality and intelligibility assessment models are important for many speech processing applications. Despite a great number of deep-learning models that have been applied to build non-intrusive speech assessment approaches and achieve promising performance, studies focusing on the hearing impaired (HI) subjects are limited. This paper presents HASA-Net+, a multi-objective non-intrusive hearing-aid speech assessment model, building upon our previous work, HASA-Net. HASA-Net+ improves HASA-Net in several ways: (1) inclusivity for both normal-hearing and HI listeners, (2) integration with pre-trained speech foundation models and fine-tuning techniques, (3) expansion of predictive capabilities to cover speech quality and intelligibility in diverse conditions, including noisy, denoised, reverberant, dereverberated, and vocoded speech, thereby evaluating its robustness, and (4) validation of the generalization capability using an out-of-domain dataset.
期刊介绍:
Since 1929 The Journal of the Acoustical Society of America has been the leading source of theoretical and experimental research results in the broad interdisciplinary study of sound. Subject coverage includes: linear and nonlinear acoustics; aeroacoustics, underwater sound and acoustical oceanography; ultrasonics and quantum acoustics; architectural and structural acoustics and vibration; speech, music and noise; psychology and physiology of hearing; engineering acoustics, transduction; bioacoustics, animal bioacoustics.