An Audio-Visual System for Sound Noise Reduction Based on Deep Neural Networks
Seyedeh Sogand Hashemi, M. Asadi, M. Aghabozorgi
2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS), 2021-12-29
DOI: 10.1109/ICSPIS54653.2021.9729351
Audio noise has no single definition, but in general it includes background and environmental sounds such as object movements and animal sounds. These sounds distract listeners and cause the main content to be lost. Noise reduction is the process of removing these unwanted sounds and extracting a clear, noise-free signal from an audio source. Existing methods for this problem face challenges such as residual noise, slow processing, and ambiguity in separation. This paper proposes an automated system that removes the noise signal from the noisy audio track of audio-visual data. The system uses audio and visual features of the main sound source (musical instruments) to feed its two internal DNN-based models: a) an object detection model and b) a sound separation model. First, an object detection model built by transfer learning identifies the sound source in the video frames. Then, based on the detected source, a source-specific sound separation model is applied to the noisy signal to extract the noise-free audio signal. Audio and visual features play complementary roles in the noise reduction process, and their positive effect is evident in the results obtained. The experimental results indicate that in noisy environments, especially in real-time applications, the proposed noise reduction scheme improves the quality of the extracted noise-free sound compared with other algorithms.
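The two-stage flow described in the abstract (detect the sound source in the video, then apply a separation model chosen for that source) can be sketched roughly as follows. This is only an illustration of the dispatch idea, not the authors' implementation: the names `detect_source`, `make_gain_separator`, `SEPARATORS`, and `denoise` are hypothetical, the instrument labels are invented, a majority vote over per-frame labels stands in for the transfer-learned detector, and a simple gain stands in for each trained separation DNN.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical type: a separator maps a noisy waveform to a cleaned waveform.
Separator = Callable[[Sequence[float]], List[float]]


def detect_source(frame_labels: Sequence[str]) -> str:
    """Stand-in for the object detector: majority vote over per-frame labels.

    In the described system a transfer-learned DNN would produce these
    labels from video frames; here they are assumed to be given.
    """
    if not frame_labels:
        raise ValueError("no video frames were labelled")
    return Counter(frame_labels).most_common(1)[0][0]


def make_gain_separator(gain: float) -> Separator:
    """Placeholder 'model' that only scales the signal.

    A real separator would be a trained network for one instrument.
    """
    def separate(noisy: Sequence[float]) -> List[float]:
        return [gain * sample for sample in noisy]
    return separate


# Assumed registry: one separation model per detectable instrument.
SEPARATORS: Dict[str, Separator] = {
    "guitar": make_gain_separator(0.5),
    "violin": make_gain_separator(0.25),
}


def denoise(frame_labels: Sequence[str],
            noisy_audio: Sequence[float]) -> Tuple[str, List[float]]:
    """Stage 1: identify the source. Stage 2: run its dedicated separator."""
    source = detect_source(frame_labels)
    if source not in SEPARATORS:
        raise KeyError(f"no separation model registered for {source!r}")
    return source, SEPARATORS[source](noisy_audio)


if __name__ == "__main__":
    labels = ["guitar", "violin", "guitar"]
    print(denoise(labels, [1.0, -2.0]))
```

Keeping the separators in a registry keyed by the detected label means a new instrument only needs a new entry, which mirrors the abstract's point that the visual stage decides which audio model runs.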