Speech De-noising using Wavelet based Methods with Focus on Classification of Speech into Voiced, Unvoiced and Silence Regions

Anamika Baishya, Priyatam Kumar
DOI: 10.1109/SPIN.2018.8474205 (https://doi.org/10.1109/SPIN.2018.8474205)
Published in: 2018 5th International Conference on Signal Processing and Integrated Networks (SPIN)
Publication date: 2018-02-01
Citations: 4

Abstract

This paper presents an improved speech enhancement technique that combines the wavelet transform with excitation-based classification of speech to remove noise from speech signals. The method first classifies the speech into voiced, unvoiced and silence regions using a novel energy-based threshold, and then applies the wavelet transform. To remove the noise, the detail coefficients are thresholded in a way that accounts for the different characteristics of speech in the three regions: soft thresholding is used for voiced regions, hard thresholding for unvoiced regions, and the wavelet coefficients of silence regions are set to zero. The method is evaluated on speech signals taken from the SPEAR database and corrupted with white noise. Experimental results show that the proposed method de-noises the speech, as measured by both SNR and PESQ score. With regard to SNR, the best improvement is 9.4 dB relative to the SNR of the original (noisy) speech, and 1.2 dB relative to the improvement obtained with one of the recently reported methods.
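The region-dependent thresholding described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the wavelet, the decomposition depth, the energy thresholds or the threshold value, so the single-level Haar transform, the names `silence_energy`, `voiced_energy` and `threshold`, and the rule that higher frame energy means "voiced" are all assumptions made for illustration. Zeroing both the approximation and detail coefficients in silence frames is likewise one possible reading of "made zero".

```python
import math

def haar_dwt(frame):
    """Single-level Haar DWT (assumed wavelet); frame length must be even."""
    s = math.sqrt(2.0)
    approx = [(frame[i] + frame[i + 1]) / s for i in range(0, len(frame) - 1, 2)]
    detail = [(frame[i] - frame[i + 1]) / s for i in range(0, len(frame) - 1, 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the frame from both coefficient sets."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def soft_threshold(c, t):
    """Shrink the coefficient toward zero by t; small coefficients become zero."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def hard_threshold(c, t):
    """Keep the coefficient unchanged if it exceeds t in magnitude, else zero it."""
    return c if abs(c) > t else 0.0

def classify_frame(frame, silence_energy, voiced_energy):
    """Label a frame by mean energy. Both cut-offs are hypothetical placeholders,
    not the energy-based threshold proposed in the paper."""
    energy = sum(x * x for x in frame) / len(frame)
    if energy < silence_energy:
        return "silence"
    if energy >= voiced_energy:
        return "voiced"
    return "unvoiced"

def denoise_frame(frame, threshold, silence_energy, voiced_energy):
    """Classify the frame, transform it, and threshold according to its region."""
    label = classify_frame(frame, silence_energy, voiced_energy)
    approx, detail = haar_dwt(frame)
    if label == "voiced":
        detail = [soft_threshold(d, threshold) for d in detail]
    elif label == "unvoiced":
        detail = [hard_threshold(d, threshold) for d in detail]
    else:
        # Assumption: all coefficients of a silence frame are set to zero.
        approx = [0.0] * len(approx)
        detail = [0.0] * len(detail)
    return label, haar_idwt(approx, detail)
```

Calling `denoise_frame(frame, threshold=0.1, silence_energy=0.01, voiced_energy=0.5)` on successive even-length frames returns the region label together with the reconstructed frame. In practice the threshold would be estimated from the noise level rather than fixed by hand, and a multi-level decomposition with a smoother wavelet would be more typical.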