An Audio-Visual System for Sound Noise Reduction Based on Deep Neural Networks

Seyedeh Sogand Hashemi, M. Asadi, M. Aghabozorgi
2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS)
Publication date: 2021-12-29
DOI: 10.1109/ICSPIS54653.2021.9729351 (https://doi.org/10.1109/ICSPIS54653.2021.9729351)
Citations: 1

Abstract

Audio noise has no single definition, but it generally includes background and environmental sounds such as object movements and animal sounds. These sounds distract listeners and cause the main content to be lost. Noise reduction is the process of removing these unwanted sounds and extracting a clear, noise-free signal from an audio source. Existing methods for this problem face challenges such as residual noise, slow processing, and ambiguity in separation. This paper proposes an automated system that removes the noise signal from the noisy audio track of audio-visual data. The system uses audio and visual features of the main sound source (musical instruments) to feed its two internal DNN-based models: (a) an object detection model and (b) a sound separation model. First, an object detection model built with transfer learning identifies the sound source in the video frames. Then, based on the detected source, a source-specific sound separation model is applied to the noisy signal to extract the noise-free audio signal. Audio and visual features play complementary roles in the noise reduction process, and their positive effect is evident in the results obtained. The experimental results indicate that in noisy environments, especially in real-time applications, the proposed noise reduction scheme improves the quality of the extracted noise-free sound compared with other algorithms.
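The two-stage flow described in the abstract (detect the sound source in the video frames, then apply a separation model chosen for that source) can be sketched roughly as follows. This is only an illustration of the control flow, not the authors' implementation: the abstract does not specify the network architectures, the feature representation, or how per-frame detections are combined. The names `detect_source`, `reduce_noise`, `toy_detector`, and `make_gate` are hypothetical, the majority vote over frames is an assumed aggregation rule, and the toy detector and amplitude gates merely stand in for the trained DNN models.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Frame = Sequence[float]  # stand-in for one video frame (real frames are image arrays)
Audio = List[float]      # stand-in for a mono waveform

Detector = Callable[[Frame], str]      # frame -> instrument label
Separator = Callable[[Audio], Audio]   # noisy audio -> cleaned audio


def detect_source(frames: Sequence[Frame], detector: Detector) -> str:
    """Label every frame and return the most frequent label.

    Majority voting is an assumption made for this sketch.
    """
    if not frames:
        raise ValueError("at least one video frame is required")
    votes = Counter(detector(frame) for frame in frames)
    return votes.most_common(1)[0][0]


def reduce_noise(
    frames: Sequence[Frame],
    noisy_audio: Audio,
    detector: Detector,
    separators: Dict[str, Separator],
) -> Tuple[str, Audio]:
    """Stage 1: identify the source. Stage 2: run the matching separator."""
    label = detect_source(frames, detector)
    if label not in separators:
        raise KeyError(f"no separation model registered for {label!r}")
    return label, separators[label](noisy_audio)


# Toy stand-ins for the two DNN models (hypothetical, for illustration only).
def toy_detector(frame: Frame) -> str:
    """Pretend bright frames show a guitar and dark frames show a violin."""
    return "guitar" if sum(frame) / len(frame) > 0.5 else "violin"


def make_gate(threshold: float) -> Separator:
    """Return a separator that zeroes samples whose magnitude is below threshold."""
    def gate(audio: Audio) -> Audio:
        return [s if abs(s) >= threshold else 0.0 for s in audio]
    return gate


toy_separators: Dict[str, Separator] = {
    "guitar": make_gate(0.2),
    "violin": make_gate(0.05),
}

if __name__ == "__main__":
    frames = [[0.9, 0.8], [0.7, 0.9], [0.1, 0.2]]
    label, clean = reduce_noise(frames, [0.1, -0.3, 0.25, -0.05],
                                toy_detector, toy_separators)
    print(label, clean)
```

Keeping the separators in a dictionary keyed by the detected label mirrors the abstract's idea of a "specific sound separation model" per source; in a real system each entry would wrap a trained network rather than a simple gate.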