Multi-scale Fusion and Channel Weighted CNN for Acoustic Scene Classification

International Conference on Signal Processing and Machine Learning Pub Date : 2019-11-27 DOI:10.1145/3372806.3372809

Liping Yang, Xinxing Chen, Lianjie Tao, Xiaohua Gu

Citations: 3

Abstract

Ensemble semantic features are useful for acoustic scene classification. In this paper, we propose a multi-scale fusion and channel-weighted CNN framework. The framework consists of two stages: multi-scale feature fusion and channel weighting. The multi-scale feature fusion stage extracts hierarchical semantic feature maps using a CNN with a simplified Xception architecture and then integrates the multi-scale semantic features through a top-down pathway. The channel weighting stage squeezes the feature maps into a channel descriptor and then transforms it into a set of channel weighting factors that reinforce the importance of each channel for acoustic scene classification. Experimental results on subtasks A and B of the DCASE2018 acoustic scene classification task demonstrate the effectiveness of the proposed framework.
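The two stages described in the abstract can be illustrated with a small NumPy sketch. The abstract gives no layer details, so everything concrete here is an assumption: nearest-neighbour upsampling with element-wise addition for the top-down pathway, equal channel counts at every scale, global average pooling for the squeeze step, and a two-layer bottleneck with a sigmoid (in the style of squeeze-and-excitation) for the weighting factors. The names upsample2x, top_down_fuse, channel_weight, w1, w2 and the reduction ratio r are hypothetical, and the random arrays stand in for real CNN feature maps. It is not the authors' implementation.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map (assumed scheme)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def top_down_fuse(maps):
    """Fuse feature maps ordered fine-to-coarse via a top-down pathway.

    Assumes each level is half the spatial size of the previous one and that
    all levels share the same channel count, so maps can simply be added.
    """
    fused = maps[-1]                      # start from the coarsest map
    for finer in reversed(maps[:-1]):
        fused = finer + upsample2x(fused) # merge into the next finer scale
    return fused

def channel_weight(x, w1, w2):
    """Squeeze a (C, H, W) map to a channel descriptor, then rescale channels."""
    z = x.mean(axis=(1, 2))               # squeeze: one value per channel, shape (C,)
    h = np.maximum(w1 @ z, 0.0)           # bottleneck + ReLU, shape (C // r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))   # sigmoid weighting factors in (0, 1)
    return x * s[:, None, None], s        # broadcast factors over H and W

rng = np.random.default_rng(0)
C, r = 8, 4                               # channel count and reduction ratio (assumed)
maps = [rng.standard_normal((C, 16, 16)), # stand-ins for multi-scale CNN features
        rng.standard_normal((C, 8, 8)),
        rng.standard_normal((C, 4, 4))]
w1 = rng.standard_normal((C // r, C)) * 0.1  # untrained placeholder weights
w2 = rng.standard_normal((C, C // r)) * 0.1

fused = top_down_fuse(maps)
weighted, s = channel_weight(fused, w1, w2)
print(fused.shape, weighted.shape, s.shape)
```

In this sketch the fused map keeps the resolution of the finest scale, and the weighting stage leaves the shape unchanged while scaling each channel by its own factor. In a trained model, w1 and w2 would be learned jointly with the classifier.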

