{"title":"Coarse-to-fine trained multi-scale Convolutional Neural Networks for image classification","authors":"Haobin Dou, Xihong Wu","doi":"10.1109/IJCNN.2015.7280542","DOIUrl":null,"url":null,"abstract":"Convolutional Neural Networks (CNNs) have become forceful models in feature learning and image classification. They achieve translation invariance by spatial convolution and pooling mechanisms, while their ability in scale invariance is limited. To tackle the problem of scale variation in image classification, this work proposed a multi-scale CNN model with depth-decreasing multi-column structure. Input images were decomposed into multiple scales and at each scale image, a CNN column was instantiated with its depth decreasing from fine to coarse scale for model simplification. Scale-invariant features were learned by weights shared across all scales and pooled among adjacent scales. Particularly, a coarse-to-fine pre-training method imitating the human's development of spatial frequency perception was proposed to train this multi-scale CNN, which accelerated the training process and reduced the classification error. In addition, model averaging technique was used to combine models obtained during pre-training and further improve the performance. With these methods, our model achieved classification errors of 15.38% on CIFAR-10 dataset and 41.29% on CIFAR-100 dataset, i.e. 1.05% and 2.97% reduction compared with single-scale CNN model.","PeriodicalId":6539,"journal":{"name":"2015 International Joint Conference on Neural Networks (IJCNN)","volume":"7 1","pages":"1-7"},"PeriodicalIF":0.0000,"publicationDate":"2015-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 International Joint Conference on Neural Networks (IJCNN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IJCNN.2015.7280542","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 12
Abstract
Convolutional Neural Networks (CNNs) have become powerful models for feature learning and image classification. They achieve translation invariance through spatial convolution and pooling, but their capacity for scale invariance is limited. To address scale variation in image classification, this work proposed a multi-scale CNN with a depth-decreasing multi-column structure. Input images were decomposed into multiple scales, and a CNN column was instantiated at each scale, with column depth decreasing from the fine to the coarse scale to simplify the model. Scale-invariant features were learned with weights shared across all scales and pooled over adjacent scales. In particular, a coarse-to-fine pre-training method, inspired by the development of spatial frequency perception in humans, was proposed to train this multi-scale CNN; it accelerated training and reduced the classification error. In addition, a model averaging technique combined the models obtained during pre-training to further improve performance. With these methods, the model achieved classification errors of 15.38% on the CIFAR-10 dataset and 41.29% on the CIFAR-100 dataset, i.e. reductions of 1.05% and 2.97% compared with a single-scale CNN model.
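The abstract describes the architecture only at a high level. The sketch below is a minimal, hypothetical PyTorch rendering of the two central ideas, a shared convolutional trunk applied to every level of an image pyramid (weight sharing across scales) and a deeper column on the finest scale (depth decreasing toward coarse scales), with features pooled across scales before classification. The layer sizes, the number of scales, the use of global average pooling, and the element-wise max over scale columns are all illustrative assumptions, not the authors' exact configuration.

```python
# Hedged sketch (not the authors' released code): a multi-scale CNN that shares
# convolutional weights across scale columns and pools features over scales.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleCNN(nn.Module):
    def __init__(self, num_classes=10, num_scales=3):
        super().__init__()
        self.num_scales = num_scales
        # Shared convolutional trunk: the same weights are applied to every
        # level of the image pyramid (weight sharing across scales).
        self.shared_conv = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Extra layers applied only to the finest scale, so coarser columns are
        # shallower (the "depth-decreasing multi-column" idea).
        self.fine_extra = nn.Sequential(
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):
        feats = []
        for s in range(self.num_scales):
            # Build the image pyramid by repeated 2x downsampling.
            xs = F.avg_pool2d(x, kernel_size=2 ** s) if s > 0 else x
            f = self.shared_conv(xs)
            if s == 0:  # deepest column on the finest scale
                f = self.fine_extra(f)
            # Global average pooling gives a fixed-length descriptor per scale.
            feats.append(F.adaptive_avg_pool2d(f, 1).flatten(1))
        # Pool features across scale columns (here: element-wise max).
        pooled = torch.stack(feats, dim=0).max(dim=0).values
        return self.classifier(pooled)


if __name__ == "__main__":
    model = MultiScaleCNN(num_classes=10, num_scales=3)
    logits = model(torch.randn(4, 3, 32, 32))  # CIFAR-sized input
    print(logits.shape)  # torch.Size([4, 10])
```

Under the same assumptions, the coarse-to-fine pre-training described in the abstract could be approximated by first optimizing the shared trunk using only the coarsest scale, then progressively enabling the finer columns (reusing the already-trained shared weights) before fine-tuning the full model; the abstract's model averaging would then combine the checkpoints produced at these pre-training stages.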