Real-time multichannel deep speech enhancement in hearing aids: Comparing monaural and binaural processing in complex acoustic scenarios

Nils L. Westhausen, Hendrik Kayser, Theresa Jansen, Bernd T. Meyer

arXiv - CS - Sound, 2024-05-03 · https://doi.org/arxiv-2405.01967
Abstract
Deep learning has the potential to enhance speech signals and increase their intelligibility for users of hearing aids. Deep models suited for real-world application should feature a low computational complexity and a processing delay of only a few milliseconds. In this paper, we explore deep speech enhancement that matches these requirements and contrast monaural and binaural processing algorithms in two complex acoustic scenes. Both algorithms are evaluated with objective metrics and in experiments with hearing-impaired listeners performing a speech-in-noise test. Results are compared to two traditional enhancement strategies, i.e., adaptive differential microphone processing and binaural beamforming. While in diffuse noise all algorithms perform similarly, the binaural deep learning approach performs best in the presence of spatial interferers. Through a post-analysis, this can be attributed to improvements at low SNRs and to precise spatial filtering.
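The abstract does not describe the baselines in detail. For orientation, the sketch below shows the classical idea behind one of them, first-order adaptive differential microphone processing (in the style of Elko & Pong): two closely spaced omnidirectional microphones are combined by delay-and-subtract into forward- and backward-facing cardioids, and a single scalar is adapted to steer a spatial null toward the interferer. The function name, the 12 mm spacing, and the NLMS-style update are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def adaptive_differential_mic(x_front, x_back, fs, d=0.012, c=343.0,
                              mu=0.05, eps=1e-8):
    """Sketch of a first-order adaptive differential microphone.

    x_front, x_back: time signals from two omni mics spaced d metres apart
    (d=0.012 is an assumed hearing-aid-like spacing, not from the paper).
    Returns a single enhanced channel; beta in [0, 1] steers the null.
    """
    delay = max(1, int(round(d / c * fs)))   # inter-mic travel time in samples
    # Delay-and-subtract: forward- and backward-facing cardioid signals
    cf = x_front[delay:] - x_back[:-delay]
    cb = x_back[delay:] - x_front[:-delay]
    y = np.zeros_like(cf)
    beta, p = 0.5, eps                        # null parameter, power estimate
    for n in range(len(cf)):
        y[n] = cf[n] - beta * cb[n]           # null steered by current beta
        p = 0.95 * p + 0.05 * cb[n] ** 2      # running power of back cardioid
        # NLMS-style update minimizing output power, constrained to [0, 1]
        beta = float(np.clip(beta + mu * y[n] * cb[n] / (p + eps), 0.0, 1.0))
    return y

# Example: y = adaptive_differential_mic(front, back, fs=16000)
```

Such a two-microphone front end is monaural and relies purely on the array geometry, which is consistent with the abstract's finding that precise (binaural, learned) spatial filtering gives the largest gains when interferers are spatially separated.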