{"title":"Multi-LCNN: A Hybrid Neural Network Based on Integrated Time-Frequency Characteristics for Acoustic Scene Classification","authors":"Jin Lei, Changjian Wang, Boqing Zhu, Q. Lv, Zhen Huang, Yuxing Peng","doi":"10.1109/ICTAI.2018.00019","DOIUrl":null,"url":null,"abstract":"Acoustic scene classification (ASC) is an important task in audio signal processing and can be useful in many real-world applications. Recently, several deep neural network models have been proposed for ASC, such as LSTMs based on temporal analysis and CNNs based on frequency spectrum, as well as hybrid models of LSTM and CNN to further improve classification performance. However, existing hybrid models fail to properly preserve the temporal information when transferring data between different models. In this work, we first analyze the cause of such temporal information loss. We then propose Multi-LCNN, a new hybrid model with two important mechanisms: (1) a LCNN architecture to effectively preserve temporal information; and (2) a multi-channel feature fusion mechanism (MCFF) that combines enhanced temporal information and frequency spectrogram information to learn highly integrated and discriminative features for ASC. Evaluations on the TUT ASC 2016 dataset show that our model can achieve an improvement of 10.23% over the baseline method, and is currently the best-performing end-to-end model on this dataset.","PeriodicalId":254686,"journal":{"name":"2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTAI.2018.00019","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
Acoustic scene classification (ASC) is an important task in audio signal processing and is useful in many real-world applications. Recently, several deep neural network models have been proposed for ASC, such as LSTMs based on temporal analysis and CNNs based on the frequency spectrum, as well as LSTM-CNN hybrids that aim to further improve classification performance. However, existing hybrid models fail to properly preserve temporal information when transferring data between the component models. In this work, we first analyze the cause of this temporal information loss. We then propose Multi-LCNN, a new hybrid model with two key mechanisms: (1) an LCNN architecture that effectively preserves temporal information; and (2) a multi-channel feature fusion (MCFF) mechanism that combines the enhanced temporal information with frequency spectrogram information to learn highly integrated, discriminative features for ASC. Evaluations on the TUT ASC 2016 dataset show that our model achieves an improvement of 10.23% over the baseline method and is currently the best-performing end-to-end model on this dataset.
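The abstract does not detail the network configuration, so the following PyTorch sketch only illustrates the general idea it describes: an LSTM branch that keeps its full output sequence (preserving temporal information) and a spectrogram branch, fused as separate input channels of a 2-D CNN. All layer sizes, the projection step, and the channel-stacking fusion are illustrative assumptions, not the authors' Multi-LCNN/MCFF design; the class count of 15 matches the TUT ASC 2016 dataset.

```python
# Minimal sketch (not the authors' implementation) of a hybrid LSTM+CNN with
# multi-channel feature fusion for acoustic scene classification.
import torch
import torch.nn as nn


class HybridLstmCnn(nn.Module):
    def __init__(self, n_mels=40, n_classes=15, hidden=64):
        super().__init__()
        # Temporal branch: an LSTM over the frame sequence; keeping the full
        # output sequence (rather than only the last hidden state) is one way
        # to preserve temporal information for the downstream CNN.
        self.lstm = nn.LSTM(input_size=n_mels, hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        # Align the LSTM feature dimension with the spectrogram so the two
        # representations can be stacked as channels (assumed fusion scheme).
        self.proj = nn.Linear(2 * hidden, n_mels)
        # Spectral/fusion branch: a small 2-D CNN over the stacked channels.
        self.cnn = nn.Sequential(
            nn.Conv2d(2, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, spec):                          # spec: (batch, frames, n_mels)
        seq, _ = self.lstm(spec)                      # (batch, frames, 2*hidden)
        temporal = self.proj(seq)                     # (batch, frames, n_mels)
        fused = torch.stack([spec, temporal], dim=1)  # (batch, 2, frames, n_mels)
        feats = self.cnn(fused).flatten(1)            # (batch, 64)
        return self.classifier(feats)                 # (batch, n_classes)


if __name__ == "__main__":
    model = HybridLstmCnn()
    dummy = torch.randn(4, 500, 40)   # 4 clips, 500 frames, 40 mel bands
    print(model(dummy).shape)         # torch.Size([4, 15])
```

The design choice illustrated here is that the temporal branch hands the CNN a full sequence rather than a pooled summary, which is the kind of information loss the abstract says existing hybrids suffer from when passing data between models.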