Category-Adapted Sound Event Enhancement with Weakly Labeled Data

Guangwei Li, Xuenan Xu, Heinrich Dinkel, Mengyue Wu, K. Yu
Published in: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022-05-23
DOI: 10.1109/ICASSP43922.2022.9747722 (https://doi.org/10.1109/ICASSP43922.2022.9747722)
Citations: 1

Abstract

Audio enhancement training usually requires clean signals with additive noise, and has therefore focused mainly on speech enhancement, where clean speech is easy to obtain. This paper extends enhancement to a broader range of sound events by using a weakly supervised approach based on sound event detection (SED) to approximate the presence and location of a specific sound event. We propose a category-adapted system that enables enhancement of any selected sound category: the model is first familiarized with all common sound classes and then fine-tuned in a category-specific procedure to enhance the targeted sound class. Evaluation is conducted on ten common sound classes, with a comparison to traditional and weakly supervised enhancement methods. Results indicate an average SDR increase of 2.86 dB, with larger improvements on speech (9.15 dB), music (5.01 dB), and typewriter (3.68 dB), at an SNR of 0 dB. All enhancement metrics outperform previous weakly supervised methods and are comparable to the state-of-the-art method that requires clean signals.
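
The reported numbers are SDR gains measured on mixtures at an SNR of 0 dB. As a rough illustration of what those quantities mean, the sketch below mixes a target with noise at a chosen SNR and computes the SDR improvement of an enhanced signal. It assumes the plain energy-ratio definition of SDR, which may differ from the evaluation toolkit used in the paper; the signals, the function names `mix_at_snr` and `sdr_db`, and the stand-in "enhanced" output are all hypothetical and are not taken from the paper.

```python
import numpy as np

def mix_at_snr(target, noise, snr_db):
    """Scale `noise` so the target-to-noise power ratio equals `snr_db`, then add it."""
    target_power = np.mean(target ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose gain g so that 10*log10(target_power / (g^2 * noise_power)) == snr_db.
    gain = np.sqrt(target_power / (noise_power * 10 ** (snr_db / 10)))
    return target + gain * noise

def sdr_db(reference, estimate):
    """Energy-ratio SDR in dB: reference energy over residual energy (assumed definition)."""
    residual = reference - estimate
    return 10 * np.log10(np.sum(reference ** 2) / np.sum(residual ** 2))

# Hypothetical signals: a 440 Hz tone as the target event, white noise as background.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
target = np.sin(2 * np.pi * 440.0 * t)
noise = rng.standard_normal(t.shape)

mixture = mix_at_snr(target, noise, snr_db=0.0)

# Stand-in for an enhancement model: it simply halves the interference amplitude.
enhanced = target + 0.5 * (mixture - target)

input_sdr = sdr_db(target, mixture)      # SDR of the unprocessed mixture
output_sdr = sdr_db(target, enhanced)    # SDR after "enhancement"
sdr_improvement = output_sdr - input_sdr

print(f"input SDR: {input_sdr:.2f} dB, SDR improvement: {sdr_improvement:.2f} dB")
```

Under this definition the unprocessed mixture has an SDR equal to the mixing SNR, so an "SDR increase" is the gain of the enhanced output over that baseline.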