Hazardous Sound Detection Based on Audio Augmentation

Jincheng Zhang, Baojun Wang, W. Shi, Jucai Lin, Jun Yin
Published in: 2021 International Symposium on Electrical, Electronics and Information Engineering
Publication date: 2021-02-19
DOI: 10.1145/3459104.3459174 (https://doi.org/10.1145/3459104.3459174)

Abstract

The aim of surveillance is to detect dangerous events. With the wide adoption of deep learning, video surveillance has improved dramatically, and deep learning methods are now also applied to hazardous sound classification for audio event detection in surveillance. However, because dangerous sounds occur rarely and are costly to collect, no corresponding large-scale dataset exists, and deep learning methods need large-scale data to achieve good results. How to obtain richer audio event data has therefore become an urgent problem. Researchers have used a variety of data augmentation methods in computer vision with clear performance gains. These approaches are gradually being adopted in sound pattern recognition and automatic speech recognition (ASR), but there is little research on classifying hazardous sounds from small datasets. In this paper, various data augmentation methods are applied to hazardous sound classification. Our results show that data augmentation brings an improvement across all four classes in the dataset: classification accuracy increases by 0.5% on average, and as the scale of data augmentation increases, the gain in classification accuracy rises to about 1.5%.
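The abstract does not name the specific augmentation methods used, so the following is only a minimal sketch of what waveform-level audio augmentation can look like, not the authors' pipeline. The three transforms (additive noise, circular time shift, random gain), the function names, and all parameter values are assumptions chosen for illustration. The `copies` argument stands in for the "scale of data augmentation" mentioned above. Plain Python lists are used to keep the example dependency-free.

```python
import math
import random


def add_noise(wave, snr_db, rng):
    """Add white Gaussian noise at an assumed signal-to-noise ratio in dB."""
    signal_power = sum(s * s for s in wave) / len(wave)
    noise_std = math.sqrt(signal_power / (10.0 ** (snr_db / 10.0)))
    return [s + rng.gauss(0.0, noise_std) for s in wave]


def time_shift(wave, max_shift, rng):
    """Circularly shift the clip by a random offset in [-max_shift, max_shift]."""
    shift = rng.randint(-max_shift, max_shift) % len(wave)
    if shift == 0:
        return list(wave)
    return wave[-shift:] + wave[:-shift]


def random_gain(wave, low_db, high_db, rng):
    """Scale the amplitude by a random gain drawn uniformly in dB."""
    gain = 10.0 ** (rng.uniform(low_db, high_db) / 20.0)
    return [s * gain for s in wave]


def augment(wave, copies, rng):
    """Return `copies` augmented variants of one clip (hypothetical recipe)."""
    variants = []
    for _ in range(copies):
        x = time_shift(wave, max_shift=len(wave) // 10, rng=rng)
        x = random_gain(x, low_db=-6.0, high_db=6.0, rng=rng)
        x = add_noise(x, snr_db=20.0, rng=rng)
        variants.append(x)
    return variants


if __name__ == "__main__":
    rng = random.Random(0)
    sample_rate = 8000
    # A 0.1 s synthetic 440 Hz tone stands in for a real hazardous-sound clip.
    clip = [0.5 * math.sin(2 * math.pi * 440.0 * n / sample_rate)
            for n in range(sample_rate // 10)]
    augmented = augment(clip, copies=4, rng=rng)
    print(len(augmented), len(augmented[0]))
```

Each variant keeps the original label and clip length, so a small labeled set can be expanded by a factor of `copies` before training. Whether such transforms help for a given hazardous sound class has to be checked empirically.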