{"title":"Real-time enhancement of electrolaryngeal speech by spectral subtraction","authors":"S. K. Basha, P. C. Pandey","doi":"10.1109/NCC.2012.6176807","journal":{"name":"2012 National Conference on Communications (NCC)"},"publicationDate":"2012-04-03","citationCount":"13","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/NCC.2012.6176807"}
Real-time enhancement of electrolaryngeal speech by spectral subtraction
An electrolarynx, a vibrator held against the neck tissue, is used by laryngectomy patients to provide excitation to the vocal tract as a substitute for that provided by the glottis. The quality and intelligibility of electrolaryngeal speech are generally poor because of background noise caused by leakage of acoustic energy from the vibrator and the vibrator-tissue interface. This noise can be suppressed by pitch-synchronous application of spectral subtraction. The paper presents a real-time implementation of spectral subtraction for enhancement of electrolaryngeal speech, using a 16-bit fixed-point DSP board. Electrolaryngeal speech is continuously acquired at 12 kHz into the input buffers using a codec and DMA. It is processed using a 256-point FFT, 3-frame 4-stage cascaded median-based dynamic estimation of noise, spectral subtraction, and an IFFT, using a window of two pitch periods with 50% overlap. The resynthesized speech is output using DMA and the codec.
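
The noise-estimation and subtraction steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' fixed-point DSP code. It assumes that "3-frame 4-stage cascaded median" means four chained 3-point medians per frequency bin, so that a new noise estimate appears once every 3^4 = 81 frames, and it assumes a power-subtraction rule with an over-subtraction factor and a spectral floor. The names `CascadedMedian`, `spectral_subtract`, `alpha`, and `beta` are hypothetical and do not come from the paper. The FFT, windowing, and overlap-add stages are omitted; the functions operate on magnitude spectra, which for a 256-point FFT at 12 kHz have bins spaced 12000 / 256 = 46.875 Hz apart.

```python
from statistics import median


class CascadedMedian:
    """Approximate a long-window median with chained short medians (assumed structure)."""

    def __init__(self, n_bins, frames=3, stages=4):
        self.frames = frames
        # One buffer of pending spectra per stage.
        self.buffers = [[] for _ in range(stages)]
        # Noise magnitude estimate; stays at zero until the last stage fills.
        self.estimate = [0.0] * n_bins

    def update(self, magnitudes):
        """Feed one magnitude spectrum and return the current noise estimate."""
        value = list(magnitudes)
        for buf in self.buffers:
            buf.append(value)
            if len(buf) < self.frames:
                return self.estimate  # this stage is not full yet
            # Stage is full: take the per-bin median and pass it to the next stage.
            value = [median(column) for column in zip(*buf)]
            buf.clear()
        self.estimate = value  # every stage was full, so a new estimate is ready
        return self.estimate


def spectral_subtract(magnitudes, noise, alpha=2.0, beta=0.01):
    """Power subtraction with over-subtraction (alpha) and a spectral floor (beta)."""
    cleaned = []
    for x, n in zip(magnitudes, noise):
        power = x * x - alpha * n * n
        floor = beta * n * n  # keeps the result non-negative
        cleaned.append(max(power, floor) ** 0.5)
    return cleaned


if __name__ == "__main__":
    estimator = CascadedMedian(n_bins=2)
    noise = estimator.estimate
    for _ in range(81):  # 3**4 frames are needed before the first estimate
        noise = estimator.update([1.0, 2.0])
    print(noise)
    print(spectral_subtract([3.0, 1.0], noise))
```

In a complete system the cleaned magnitudes would be recombined with the original phase before the IFFT and overlap-add. The cascade stores only 3 spectra per stage rather than all 81 frames, which is one plausible reason for preferring it on a memory-constrained fixed-point DSP.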