Authors: P. Malathi, G. R. Sureshw, M. Moorthi
Venue: 2018 Fourth International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB)
Publication date: 2018-02-01
DOI: https://doi.org/10.1109/AEEICB.2018.8480968
Enhancement of electrolaryngeal speech using Frequency Auditory Masking and GMM based voice conversion
Laryngectomees lose their voice box in surgery and adopt various methods to restore their voice, one of which is electrolaryngeal speech. The electrolarynx cannot produce natural speech: its output sounds mechanical, with suppressed unvoiced features and added device and environmental noise. This paper aims to remove the echo, device, and environmental noise and thereby make electrolaryngeal speech more intelligible, using auditory masking and spectral mapping with a Gaussian Mixture Model (GMM). The low-frequency noise is masked by the pre-emphasised speech signal once the absolute threshold of masking has been determined. Spectral mapping by GMM-based voice conversion, combined with the source-filter model, improves voice quality and prosody. Objective and subjective evaluation measures show a significant enhancement of electrolaryngeal speech compared with previous enhancement methods, which removed only low-frequency noise and did not address voice quality.
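The pre-emphasis and threshold steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameters, so the pre-emphasis coefficient `alpha=0.97`, the use of Terhardt's approximation of the absolute threshold of hearing in quiet as a stand-in for the paper's "absolute threshold of masking", the `ref_db=96.0` playback-level mapping, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def pre_emphasis(signal, alpha=0.97):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is a common default, assumed here; the paper's value is unknown.
    """
    x = np.asarray(signal, dtype=float)
    if x.size == 0:
        return x.copy()
    return np.concatenate(([x[0]], x[1:] - alpha * x[:-1]))


def threshold_in_quiet_db(freq_hz):
    """Terhardt's approximation of the absolute hearing threshold (dB SPL).

    Frequencies are clipped to 20 Hz so the DC bin does not divide by zero.
    """
    f_khz = np.maximum(np.asarray(freq_hz, dtype=float), 20.0) / 1000.0
    return (3.64 * f_khz ** -0.8
            - 6.5 * np.exp(-0.6 * (f_khz - 3.3) ** 2)
            + 1e-3 * f_khz ** 4)


def audible_bins(frame, sample_rate, ref_db=96.0):
    """Flag FFT bins whose estimated level exceeds the threshold in quiet.

    The strongest bin is mapped to ref_db dB SPL; this mapping is an
    assumption, since the true playback level is not known.
    """
    frame = np.asarray(frame, dtype=float)
    spectrum = np.fft.rfft(frame * np.hanning(frame.size))
    freqs = np.fft.rfftfreq(frame.size, d=1.0 / sample_rate)
    power = np.abs(spectrum) ** 2
    eps = 1e-12  # avoids log10(0) for empty bins
    level_db = 10.0 * np.log10(power / (power.max() + eps) + eps) + ref_db
    return level_db > threshold_in_quiet_db(freqs)


# Usage: a 1 kHz tone sampled at 16 kHz, pre-emphasised and then checked.
t = np.arange(512) / 16000.0
tone = np.sin(2.0 * np.pi * 1000.0 * t)
mask = audible_bins(pre_emphasis(tone), sample_rate=16000)
```

Bins where `mask` is `False` fall below the threshold and could be attenuated or ignored by a later enhancement stage. A full masking model would also account for masking by neighbouring spectral components, which this sketch leaves out.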