{"title":"Shared Network for Speech Enhancement Based on Multi-Task Learning","authors":"Y. Xi, Bin Li, Zhan Zhang, Yuehai Wang","doi":"10.1109/ICCSE49874.2020.9201880","DOIUrl":null,"url":null,"abstract":"Speech enhancement (SE) plays an important role in the domain of speech recognition and speech evaluation. As for the previous time-frequency based SE methods, we find that the denoise network may cause damage to the structure of the speech spectrum and will lead to a discontinuity of the auditory perception. In contrast to the existing approaches that train networks directly, we propose a two-stage based method called ShareNet. We first train a convolutional neural network to perform noise reduction, and then we stack these two pretrained blocks while keeping the parameters shared. We set different input data to train each block in different stages so that the parameters can be adapted to perform both denoising and repairing tasks. The experimental results show that the proposed method is effective for speech enhancement tasks. We compare our method with conventional algorithms and convolutional neural networks (CNN) based speech enhancement techniques. The experiment results demonstrate that our method can get an improvement over several objective metrics.","PeriodicalId":350703,"journal":{"name":"2020 15th International Conference on Computer Science & Education (ICCSE)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 15th International Conference on Computer Science & Education (ICCSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCSE49874.2020.9201880","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Speech enhancement (SE) plays an important role in speech recognition and speech evaluation. In previous time-frequency-domain SE methods, we find that the denoising network may damage the structure of the speech spectrum, leading to discontinuities in auditory perception. In contrast to existing approaches that train networks directly, we propose a two-stage method called ShareNet. We first train a convolutional neural network block to perform noise reduction, and we then stack two such pretrained blocks while keeping their parameters shared. Each block is trained on different input data in different stages, so that the shared parameters adapt to both the denoising and the repairing task. We compare our method with conventional algorithms and convolutional neural network (CNN) based speech enhancement techniques; the experimental results show that the proposed method is effective for speech enhancement and yields improvements on several objective metrics.
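To make the shared-parameter idea concrete, below is a minimal PyTorch-style sketch of stacking one convolutional block twice with shared weights, first as a denoiser and then as a repairer of the spectrum. The block architecture, channel counts, spectrogram shape, and the names SharedBlock and ShareNetSketch are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (not the authors' code) of the ShareNet idea described in the
# abstract: one convolutional block whose parameters are shared across a
# denoising pass and a repairing pass over a time-frequency spectrogram.
# Layer sizes, channel counts, and names below are illustrative assumptions.
import torch
import torch.nn as nn


class SharedBlock(nn.Module):
    """A small convolutional block operating on a 1-channel spectrogram."""

    def __init__(self, channels: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, 1, kernel_size=3, padding=1),
        )

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        return self.net(spec)


class ShareNetSketch(nn.Module):
    """Apply the same pretrained block twice with shared parameters.

    Because both stages reuse `self.block`, the weights are shared; training
    could alternate between denoising and repairing targets in different
    stages so the shared parameters adapt to both tasks, as the abstract
    describes.
    """

    def __init__(self):
        super().__init__()
        self.block = SharedBlock()

    def forward(self, noisy_spec: torch.Tensor) -> torch.Tensor:
        denoised = self.block(noisy_spec)   # stage 1: noise reduction
        repaired = self.block(denoised)     # stage 2: spectrum repair
        return repaired


if __name__ == "__main__":
    model = ShareNetSketch()
    dummy = torch.randn(2, 1, 257, 100)   # (batch, channel, freq bins, frames)
    print(model(dummy).shape)             # torch.Size([2, 1, 257, 100])
```

Sharing one module instance between the two stages keeps the parameter count of a single block while letting gradients from both the denoising and repairing objectives update the same weights.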