{"title":"DCH-Net: Densely Connected Highway Convolution Neural Network for Environmental Sound Classification","authors":"Xiaohu Zhang, Yuexian Zou","doi":"10.1109/ICDSP.2018.8631632","DOIUrl":null,"url":null,"PeriodicalId":218806,"journal":{"name":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 23rd International Conference on Digital Signal Processing (DSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDSP.2018.8631632","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Environmental Sound Classification (ESC) plays a vital role in machine auditory scene analysis. Recently, the Highway Network CNN model has achieved state-of-the-art results by solving the vanishing-gradient problem of much deeper CNNs. However, careful analysis shows that the Highway Network model lacks the ability to maximize information flow between layers, which is essential to a discriminative representation of acoustic events. In addition, the Highway Network model for the ESC task is larger than 20 MB, which is still large for mobile applications. To address these two issues, we propose a novel Densely Connected Highway Convolutional Network (DCH-Net) model for the ESC task. Specifically, we develop a densely connected highway module that ensures maximum information flow between layers by connecting all layers directly with each other. To reduce the model size, a global average pooling layer replaces the traditional fully connected layers, which greatly reduces the number of model parameters. Experimental results show that our DCH-Net ESC model achieves accuracies of 69% and 90% on the ESC-50 and ESC-10 datasets, respectively, which are 2% and 10% higher than those of the Highway Network based ESC model. Meanwhile, our model size is only 2 MB.
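The two design ideas in the abstract, dense connectivity and a global average pooling head in place of fully connected layers, can be illustrated with a small parameter-counting sketch. This is not the authors' implementation: the channel counts, growth rate, feature-map size, and hidden width below are hypothetical values chosen only for illustration, and the pooling head is assumed to be a 1x1 convolution followed by global average pooling, which the abstract does not specify.

```python
from typing import List

# Illustrative sketch only. All sizes used with these helpers are
# hypothetical and are NOT taken from the DCH-Net paper.


def dense_input_channels(c0: int, growth: int, num_layers: int) -> List[int]:
    """Input channels seen by each layer of a densely connected block.

    Layer i receives the concatenation of the block input (c0 channels)
    and the outputs of all i earlier layers (growth channels each).
    """
    return [c0 + i * growth for i in range(num_layers)]


def fc_head_params(channels: int, height: int, width: int,
                   hidden: int, classes: int) -> int:
    """Parameters of a flatten -> FC(hidden) -> FC(classes) head, with biases."""
    flat = channels * height * width
    return (flat * hidden + hidden) + (hidden * classes + classes)


def gap_head_params(channels: int, classes: int) -> int:
    """Parameters of an assumed 1x1 conv to `classes` maps, then global
    average pooling. The pooling step itself has no parameters."""
    return channels * classes + classes


def global_average_pool(feature_maps: List[List[List[float]]]) -> List[float]:
    """Reduce each 2-D feature map to its mean value (one number per map)."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]
```

With these hypothetical sizes, `dense_input_channels(16, 12, 4)` gives `[16, 28, 40, 52]`, showing how each layer sees all earlier outputs. For 64 feature maps of size 8x8 and 50 classes, `fc_head_params(64, 8, 8, 512, 50)` is 2,123,314 parameters, while `gap_head_params(64, 50)` is 3,250, which is the kind of reduction that motivates replacing fully connected layers with global average pooling.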