Speech enhancement using spectral subtraction and cascaded-median based noise estimation for hearing impaired listeners
Santosh K. Waddi, P. C. Pandey, Nitya Tiwari
2013 National Conference on Communications (NCC), 2013-03-28
DOI: 10.1109/NCC.2013.6487989 (https://doi.org/10.1109/NCC.2013.6487989)
Citations: 14
Abstract
A spectral subtraction technique is presented for real-time speech enhancement in aids used by hearing-impaired listeners. To reduce computational complexity and memory requirements, it estimates the noise spectrum with a cascaded median, without voice activity detection. The technique was implemented on a 16-bit fixed-point DSP processor with on-chip FFT hardware and tested for satisfactory real-time operation at a sampling frequency of 12 kHz, using a 30 ms window with 50% overlap and a 3-frame 4-stage cascaded median for noise estimation. Enhancement of speech corrupted by different types of additive stationary and non-stationary noise resulted in an SNR advantage of 4 to 13 dB.
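
The noise estimator described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' fixed-point DSP implementation. It assumes that a p-frame q-stage cascaded median means each stage takes the per-bin median of p inputs and passes the result to the next stage, so that p = 3 and q = 4 approximate a median over 3^4 = 81 frames while storing at most 3 x 4 = 12 spectra. The names `CascadedMedian` and `subtract_noise`, and the over-subtraction factor `alpha` and spectral floor `beta`, are hypothetical and do not come from the abstract.

```python
from statistics import median


class CascadedMedian:
    """Assumed p-frame, q-stage cascaded median over per-bin magnitude spectra."""

    def __init__(self, frames_per_stage=3, stages=4):
        self.p = frames_per_stage
        self.buffers = [[] for _ in range(stages)]  # one small buffer per stage
        self.estimate = None  # latest noise magnitude estimate, None until ready

    def update(self, magnitudes):
        """Feed one frame's magnitude spectrum; return the current estimate."""
        value = list(magnitudes)
        for buf in self.buffers:
            buf.append(value)
            if len(buf) < self.p:
                return self.estimate  # this stage is not full yet
            # Stage is full: take the per-bin median and pass it to the next stage.
            value = [median(column) for column in zip(*buf)]
            buf.clear()
        self.estimate = value  # the last stage produced a new output
        return self.estimate


def subtract_noise(magnitudes, noise, alpha=2.0, beta=0.01):
    """Magnitude-domain subtraction; alpha and beta are illustrative values only."""
    return [max(x - alpha * n, beta * n) for x, n in zip(magnitudes, noise)]


if __name__ == "__main__":
    estimator = CascadedMedian()
    noise = None
    for i in range(81):
        # Two-bin toy spectra: every third frame carries a louder "speech" burst.
        frame = [10.0, 20.0] if i % 3 == 0 else [1.0, 2.0]
        noise = estimator.update(frame)
    print(noise)
    print(subtract_noise([5.0, 2.0], noise))
```

Under these assumptions, the first estimate becomes available only after 81 frames. With the parameters in the abstract (12 kHz sampling, 30 ms window, 50% overlap), a frame is 360 samples long and the frame shift is 180 samples (15 ms), so 81 frames span roughly 1.2 s of signal. Because the median ignores the minority of high-energy frames in each group, no voice activity detector is needed to decide which frames contain only noise.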