{"title":"Scene classification with improved AlexNet model","authors":"Lisha Xiao, Qin Yan, Shuyu Deng","doi":"10.1109/ISKE.2017.8258820","DOIUrl":null,"url":null,"abstract":"Scene classification is an important research branch of image comprehension, which gains information from images and interprets them using computer system by imitating the biological systems of human beings. AlexNet model is limited in image classification because of the large convolution kernel and stride in the first convolutional layer leading to over rapid decline of feature maps resolution and excessive compression of spatial information. This paper proposed an improved AlexNet model according to the design principle of convolutional neural networks (CNNs). The large convolution kernel is decomposed into a structure cascaded by two small convolution kernels with reduced stride. Another convolutional layer is added after the first one to enhance the integration process of the low-level features or the spatial information. The asymmetric convolution kernel is applied in the last three convolutional layers. The experiments on two datasets show that the classification accuracy of the improved AlexNet model is higher than those of AlexNet model and ZFNet model for 23 categories of scene classification.","PeriodicalId":208009,"journal":{"name":"2017 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"45","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISKE.2017.8258820","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Scene classification is an important research branch of image understanding, in which a computer system extracts information from images and interprets it by imitating the biological systems of human beings. The AlexNet model is limited in image classification because the large convolution kernel and stride in its first convolutional layer cause the feature-map resolution to decline too rapidly and compress spatial information excessively. This paper proposes an improved AlexNet model based on the design principles of convolutional neural networks (CNNs). The large convolution kernel is decomposed into a cascade of two smaller convolution kernels with reduced stride. An additional convolutional layer is added after the first one to strengthen the integration of low-level features or spatial information. Asymmetric convolution kernels are applied in the last three convolutional layers. Experiments on two datasets show that, for 23 categories of scene classification, the improved AlexNet model achieves higher classification accuracy than both the AlexNet and ZFNet models.
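The effect of the first-layer change can be illustrated with a little arithmetic. The sketch below is not the authors' implementation: the abstract does not give the kernel sizes or strides of the improved model, so the 7x7/5x5 cascade and the 384-channel layer used for the asymmetric-kernel comparison are assumptions chosen only for illustration, and the helper names are hypothetical. Only the 11x11, stride-4 first layer on a 227x227 input reflects the commonly cited AlexNet configuration.

```python
def conv_out(size, kernel, stride, pad=0):
    """Spatial size after one convolution on a square input (floor division)."""
    return (size + 2 * pad - kernel) // stride + 1


def stack_summary(size, layers):
    """Trace a stack of convolutions given as (kernel, stride) pairs.

    Returns the list of feature-map sizes, the receptive field of the
    last layer, and the cumulative stride of the stack.
    """
    receptive_field, jump = 1, 1
    sizes = [size]
    for kernel, stride in layers:
        size = conv_out(size, kernel, stride)
        receptive_field += (kernel - 1) * jump
        jump *= stride
        sizes.append(size)
    return sizes, receptive_field, jump


def conv_weights(kernel_h, kernel_w, c_in, c_out):
    """Number of weights in one convolutional layer (biases ignored)."""
    return kernel_h * kernel_w * c_in * c_out


# Commonly cited AlexNet first layer: 11x11 kernel, stride 4, 227x227 input.
original = stack_summary(227, [(11, 4)])

# ASSUMED cascade (not taken from the paper): 7x7 stride 2, then 5x5 stride 1.
cascade = stack_summary(227, [(7, 2), (5, 1)])

# ASSUMED asymmetric factorization of a 3x3 layer with 384 input and
# 384 output channels into a 1x3 layer followed by a 3x1 layer.
square = conv_weights(3, 3, 384, 384)
asymmetric = conv_weights(1, 3, 384, 384) + conv_weights(3, 1, 384, 384)

print("original  :", original)
print("cascade   :", cascade)
print("3x3 weights:", square, "| 1x3 + 3x1 weights:", asymmetric)
```

Under these assumed sizes, the single large kernel reduces the 227x227 input to 55x55 in one step, whereas the cascade keeps a 107x107 map while covering a receptive field of 15 pixels, which is the kind of slower resolution decline the abstract describes. The asymmetric pair uses two thirds of the weights of the square 3x3 kernel.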