DNN Classification Model-based Speech Enhancement Using Mask Selection Technique
Bong-Ki Lee. 2020 28th European Signal Processing Conference (EUSIPCO), pp. 436-440. Published 2021-01-24. DOI: 10.23919/Eusipco47968.2020.9287410
This paper presents a speech enhancement algorithm that uses a DNN classification model combined with a noise classification-based ensemble. Although various deep learning-based single-channel speech enhancement algorithms have recently been developed, they are optimized to reduce the mean square error and therefore cannot accurately estimate the actual target values in the regression task, which results in muffled enhanced speech. This paper therefore proposes a DNN classification-based single-channel speech enhancement algorithm to overcome this disadvantage of existing DNN regression-based algorithms. To recast the regression task as a classification task, gain mask templates are predefined by applying k-means clustering to the gain masks. The feature vector extracted from the microphone input signal is fed into the DNN, which then selects an optimal gain mask from the gain mask templates. Furthermore, to cover various noise environments, we define gain mask templates for each noise environment using DNN-based noise classification, and we use an ensemble structure based on the probabilities produced by the noise classification stage.
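The mask selection idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names, the plain k-means loop, the noise class names (`babble`, `car`), and the probability-weighted sum used for the ensemble are all assumptions made for illustration, since the abstract does not specify how the ensemble combines the selected masks. The classifier posteriors are hard-coded stand-ins for DNN outputs.

```python
import numpy as np


def kmeans_templates(masks, k, iters=20, seed=0):
    """Cluster gain masks (one per row) into k template masks with plain k-means."""
    masks = np.asarray(masks, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the centers from k randomly chosen training masks (copy, not a view).
    centers = masks[rng.choice(len(masks), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every mask to every center.
        dist = ((masks[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = masks[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def select_mask(template_posterior, templates):
    """Return the template whose class has the highest classifier posterior."""
    return np.asarray(templates)[int(np.argmax(template_posterior))]


def ensemble_mask(noise_probs, per_noise_posteriors, per_noise_templates):
    """Combine the per-noise selected masks, weighted by noise-class probability.

    The weighted sum is an assumed combination rule, not taken from the paper.
    """
    selected = [
        select_mask(post, tmpl)
        for post, tmpl in zip(per_noise_posteriors, per_noise_templates)
    ]
    return sum(w * m for w, m in zip(noise_probs, selected))


# Hypothetical template sets for two noise environments (two frequency bins each).
templates_babble = np.array([[0.2, 0.2], [0.8, 0.8]])
templates_car = np.array([[0.1, 0.5], [0.9, 1.0]])

mask = ensemble_mask(
    noise_probs=[0.75, 0.25],                        # stand-in noise classifier output
    per_noise_posteriors=[[0.1, 0.9], [0.7, 0.3]],   # stand-in mask classifier outputs
    per_noise_templates=[templates_babble, templates_car],
)
print(mask)
```

In this sketch the classifier only has to pick one of a small number of templates per noise environment, and the ensemble softens errors of the noise classifier by blending the selections instead of committing to a single environment.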