{"title":"语音去噪的振幅一致性增强","authors":"Chunlei Liu, Longbiao Wang, J. Dang","doi":"10.1145/3404555.3404618","DOIUrl":null,"url":null,"abstract":"The mapping and masking methods based on deep learning are both essential methods for speech dereverberation at present, which typically enhance the amplitude of the reverberant speech while letting the reverberant phase unprocessed. The reverberant phase and enhanced amplitude are used to synthesize the target speech. However, because the overlapping frames interfere with each other during the superposition process (overlap-and-add), the final synthesized speech signal will deviate from the ideal value. In this paper, we propose an amplitude consistent enhancement method (ACE) to solve this problem. With ACE to train the deep neural networks (DNNs), we use the difference between amplitudes of the synthesized and clean speech as the loss function. Also, we propose a method of adding an adjustment layer to improve the regression accuracy of DNN. The speech dereverberation experiments show that the proposed method has improved the PESQ and SNR by 5% and 15% compared with the traditional signal approximation method.","PeriodicalId":220526,"journal":{"name":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","volume":"1991 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Amplitude Consistent Enhancement for Speech Dereverberation\",\"authors\":\"Chunlei Liu, Longbiao Wang, J. Dang\",\"doi\":\"10.1145/3404555.3404618\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The mapping and masking methods based on deep learning are both essential methods for speech dereverberation at present, which typically enhance the amplitude of the reverberant speech while letting the reverberant phase unprocessed. The reverberant phase and enhanced amplitude are used to synthesize the target speech. However, because the overlapping frames interfere with each other during the superposition process (overlap-and-add), the final synthesized speech signal will deviate from the ideal value. In this paper, we propose an amplitude consistent enhancement method (ACE) to solve this problem. With ACE to train the deep neural networks (DNNs), we use the difference between amplitudes of the synthesized and clean speech as the loss function. Also, we propose a method of adding an adjustment layer to improve the regression accuracy of DNN. 
The speech dereverberation experiments show that the proposed method has improved the PESQ and SNR by 5% and 15% compared with the traditional signal approximation method.\",\"PeriodicalId\":220526,\"journal\":{\"name\":\"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence\",\"volume\":\"1991 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-04-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3404555.3404618\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3404555.3404618","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Amplitude Consistent Enhancement for Speech Dereverberation
Mapping- and masking-based deep learning methods are currently the dominant approaches to speech dereverberation. They typically enhance only the amplitude of the reverberant speech while leaving the reverberant phase unprocessed; the enhanced amplitude is then combined with the reverberant phase to synthesize the target speech. However, because overlapping frames interfere with each other during re-synthesis (overlap-and-add), the final synthesized speech signal deviates from the ideal value. In this paper, we propose an amplitude consistent enhancement (ACE) method to solve this problem. When training deep neural networks (DNNs) with ACE, we use the difference between the amplitudes of the synthesized speech and the clean speech as the loss function. We also propose adding an adjustment layer to improve the regression accuracy of the DNN. Speech dereverberation experiments show that the proposed method improves PESQ and SNR by 5% and 15%, respectively, compared with the traditional signal approximation method.
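As a rough illustration of the amplitude-consistency idea described in the abstract, the sketch below re-synthesizes speech from an enhanced magnitude and the unprocessed reverberant phase via overlap-and-add, re-analyzes the result, and measures how far its magnitude lies from the clean magnitude. This is a numpy/scipy sketch under stated assumptions, not the authors' implementation: the function name ace_loss, the frame parameters, and the mean-squared form of the loss are illustrative, and an actual training loss would be computed with a differentiable STFT inside the training framework.

```python
# Minimal sketch of an amplitude-consistency loss (assumptions noted above).
import numpy as np
from scipy.signal import stft, istft

FS = 16000      # sample rate (illustrative)
N_FFT = 512     # frame length (illustrative)
HOP = 128       # hop size (illustrative)


def ace_loss(enhanced_mag, reverberant, clean):
    """Compare the magnitude of the speech actually re-synthesized by
    overlap-and-add (enhanced magnitude + reverberant phase) against the
    magnitude of the clean reference.

    enhanced_mag is assumed to have the same shape as the STFT of the
    reverberant waveform.
    """
    # Phase of the reverberant input, left unprocessed as in the abstract.
    _, _, rev_spec = stft(reverberant, fs=FS, nperseg=N_FFT, noverlap=N_FFT - HOP)
    rev_phase = np.angle(rev_spec)

    # Re-synthesize with overlap-and-add; this step introduces the
    # inconsistency the loss is meant to capture.
    _, synthesized = istft(enhanced_mag * np.exp(1j * rev_phase),
                           fs=FS, nperseg=N_FFT, noverlap=N_FFT - HOP)

    # Re-analyze the synthesized waveform and compare its magnitude with
    # the clean magnitude (frames trimmed to a common length).
    _, _, syn_spec = stft(synthesized, fs=FS, nperseg=N_FFT, noverlap=N_FFT - HOP)
    _, _, clean_spec = stft(clean, fs=FS, nperseg=N_FFT, noverlap=N_FFT - HOP)
    n = min(syn_spec.shape[1], clean_spec.shape[1])
    diff = np.abs(syn_spec[:, :n]) - np.abs(clean_spec[:, :n])
    return np.mean(diff ** 2)
```

By contrast, a conventional signal-approximation loss compares the enhanced magnitude to the clean magnitude directly, without passing through the overlap-and-add re-synthesis step whose interference the abstract identifies as the source of the deviation.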