Multi-LCNN: A Hybrid Neural Network Based on Integrated Time-Frequency Characteristics for Acoustic Scene Classification

Jin Lei, Changjian Wang, Boqing Zhu, Q. Lv, Zhen Huang, Yuxing Peng
Venue: 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI)
Published: 2018-11-01
DOI: 10.1109/ICTAI.2018.00019 (https://doi.org/10.1109/ICTAI.2018.00019)
Citations: 3

Abstract

Acoustic scene classification (ASC) is an important task in audio signal processing and is useful in many real-world applications. Recently, several deep neural network models have been proposed for ASC, such as LSTMs based on temporal analysis and CNNs based on the frequency spectrum, as well as hybrid models of LSTM and CNN that further improve classification performance. However, existing hybrid models fail to properly preserve temporal information when transferring data between the different models. In this work, we first analyze the cause of this temporal information loss. We then propose Multi-LCNN, a new hybrid model with two important mechanisms: (1) an LCNN architecture that effectively preserves temporal information; and (2) a multi-channel feature fusion (MCFF) mechanism that combines enhanced temporal information with frequency spectrogram information to learn highly integrated and discriminative features for ASC. Evaluations on the TUT ASC 2016 dataset show that our model achieves an improvement of 10.23% over the baseline method and is currently the best-performing end-to-end model on this dataset.
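The abstract does not spell out how the two information streams are combined, so the following is only a minimal sketch of one plausible reading of "multi-channel feature fusion": a per-frame temporal feature map and a spectrogram of the same shape are stacked as two channels of a single input. The function names (`keep_all_steps`, `fuse_channels`), the shapes, and the stand-in values are all hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: names, shapes, and values are illustrative assumptions,
# not the authors' implementation.

def keep_all_steps(hidden_states):
    """Keep one feature row per time frame, so frame order is preserved."""
    return [list(h) for h in hidden_states]

def fuse_channels(temporal_map, spectrogram):
    """Stack two (frames x bins) maps into a (2 x frames x bins) input."""
    if len(temporal_map) != len(spectrogram):
        raise ValueError("frame counts differ")
    for t_row, s_row in zip(temporal_map, spectrogram):
        if len(t_row) != len(s_row):
            raise ValueError("feature widths differ")
    return [[list(r) for r in temporal_map], [list(r) for r in spectrogram]]

frames, n_bins = 4, 3
# Toy spectrogram: one row per frame, one column per frequency bin.
spectrogram = [[float(t * n_bins + f) for f in range(n_bins)] for t in range(frames)]
# Stand-in for per-frame recurrent outputs (assumed to match the spectrogram shape).
hidden = [[0.1 * t] * n_bins for t in range(frames)]

fused = fuse_channels(keep_all_steps(hidden), spectrogram)
print(len(fused), len(fused[0]), len(fused[0][0]))  # 2 4 3
```

In this reading, a downstream convolutional stage would see both channels aligned frame by frame. Whether the actual model aligns or weights the channels this way is not stated in the abstract.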