How Does the Data set Affect CNN-based Image Classification Performance?

Chao Luo, Xiaojie Li, Lutao Wang, Jia He, Denggao Li, Jiliu Zhou
Published in: 2018 5th International Conference on Systems and Informatics (ICSAI)
Publication date: 2018-11-01
DOI: 10.1109/ICSAI.2018.8599448 (https://doi.org/10.1109/ICSAI.2018.8599448)
Citations: 34

Abstract

Convolutional neural networks (ConvNets or CNNs) have proven very effective in areas such as image recognition and classification. In image classification in particular, CNN-based methods have achieved excellent performance. Because performance is the key indicator for evaluating a CNN-based classification method, it is important to study which factors affect it. Image classification performance is known to depend on the network structure and on the size of the data set, and data set size in particular has a significant impact. For most people, however, large data sets are difficult to obtain. We therefore ask: how does the size of the data set affect performance? To answer this question, we performed 35 groups of experiments with 5 runs in each group (175 experiments in total). For each k-class classification experiment, we ran 5 groups with increasing training set sizes and observed the changes in accuracy to analyze the effect of data set size. For the same CNN-based network, the average accuracy shows that the larger the training set, the higher the test accuracy. However, when the training data set is insufficient, better results can be obtained. Furthermore, within each group of experiments, the more categories are classified, the more pronounced the change in performance. These results can guide experiments on image classification and are also relevant to other experiments based on deep learning.
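The experimental design described in the abstract can be laid out as a small grid. The following is a minimal sketch, not the authors' code: the abstract states only that there are 35 groups with 5 runs each and 5 training-set sizes per k, which implies 7 values of k if every k has the same number of groups. The specific class counts, training-set sizes, and the names `CLASS_COUNTS`, `TRAIN_SIZES`, `build_groups`, `average_accuracy`, and `train_and_eval` are hypothetical placeholders.

```python
from itertools import product
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical placeholders: the abstract does not list the class counts or
# the training-set sizes, only the number of groups and runs.
CLASS_COUNTS: List[int] = [2, 3, 4, 5, 6, 7, 8]       # 7 assumed values of k
TRAIN_SIZES: List[int] = [100, 200, 400, 800, 1600]   # 5 assumed sizes per k
RUNS_PER_GROUP: int = 5                                # stated in the abstract

Group = Tuple[int, int]  # (number of classes k, training-set size)


def build_groups(class_counts: List[int], train_sizes: List[int]) -> List[Group]:
    """Enumerate one group per (k, training-set size) combination."""
    return list(product(class_counts, train_sizes))


def average_accuracy(
    groups: List[Group],
    runs: int,
    train_and_eval: Callable[[int, int, int], float],
) -> Dict[Group, float]:
    """Average test accuracy over repeated runs of each group.

    train_and_eval(k, train_size, seed) is a caller-supplied function that
    trains a network and returns its test accuracy; it is an assumption here.
    """
    return {
        (k, n): mean(train_and_eval(k, n, seed) for seed in range(runs))
        for k, n in groups
    }


groups = build_groups(CLASS_COUNTS, TRAIN_SIZES)
total_runs = len(groups) * RUNS_PER_GROUP  # 35 groups x 5 runs
```

Comparing the averaged accuracies across training-set sizes for a fixed k, and then across k, mirrors the two comparisons the abstract reports.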