Ho-Gyeong Kim, Jihyeon Roh, Hwaran Lee, Geon-min Kim, Soo-Young Lee
Proceedings of the 3rd International Conference on Human-Agent Interaction, published 2015-10-21. DOI: 10.1145/2814940.2814989 (https://doi.org/10.1145/2814940.2814989)
Active Learning for Large-scale Object Classification: from Exploration to Exploitation
Information and communication technologies generate data at a rapidly increasing rate, yet almost all of the accumulated data are unlabeled, and obtaining labels is expensive and time-consuming. Selecting and labeling only those raw samples expected to be more informative than others can improve a model at low cost. This process, called selective sampling, is an essential part of active learning. So far, most research has concentrated on classical uncertainty measures for acquiring informative data, which corresponds to the "exploitation" process of learning. However, when the initial labeled dataset is too small or biased, the early-stage model can be unreliable and its decision boundary may be over-fitted to the initial data. Moreover, the data obtained by the exploitation strategy may degrade the model further. We therefore introduce an "exploration" strategy alongside the "exploitation" strategy. In this paper, we employ Self-Organizing Maps (SOM), a type of neural network, to estimate and explore the data distribution. For exploitation, margin sampling is applied to the classifier, a neural network with a soft-max output layer. The effectiveness of the proposed methods is demonstrated on the ILSVRC-2011 image classification task, using features extracted from well-trained Convolutional Neural Networks (CNNs). Active learning with the exploration strategy shows its potential by stabilizing the early-stage model and reducing the classification error rate, ultimately yielding a high-quality model.
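As a rough illustration of the exploitation step only, margin sampling on soft-max outputs can be sketched as below. This is a minimal sketch in its standard form (the margin is the gap between the two largest class probabilities, and the samples with the smallest gaps are queried), not the authors' implementation. The function names, the `budget` parameter, and the example scores are all hypothetical and chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def margin(probs):
    """Gap between the two most probable classes; a small gap means high uncertainty."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 - top2


def select_by_margin(pool_logits, budget):
    """Return indices of the `budget` unlabeled samples with the smallest margins."""
    margins = [margin(softmax(z)) for z in pool_logits]
    ranked = sorted(range(len(pool_logits)), key=lambda i: margins[i])
    return ranked[:budget]


# Hypothetical classifier scores for four unlabeled samples and three classes.
pool_logits = [
    [4.0, 0.5, 0.1],  # confident prediction, large margin
    [1.0, 0.9, 0.2],  # top two classes nearly tied
    [2.0, 1.0, 0.0],  # moderately confident
    [0.3, 0.3, 0.3],  # all classes tied, zero margin
]
queries = select_by_margin(pool_logits, budget=2)
print(queries)  # indices of the samples to send for labeling
```

With these made-up scores, the fully tied sample and the nearly tied sample are picked first. If the early model is unreliable, these margins are unreliable too, which is the weakness the exploration strategy is meant to offset.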