Jeonghyo Ha, Jung Eun, Pyunghwan Ahn, Dong-Hoon Shin, Junmo Kim
{"title":"Learning Convolutional Neural Network Using Data from Other Domains in case of Insufficient Data","authors":"Jeonghyo Ha, Jung Eun, Pyunghwan Ahn, Dong-Hoon Shin, Junmo Kim","doi":"10.1145/3209914.3209927","DOIUrl":null,"url":null,"abstract":"In this paper, we describe a training methodology of convolutional neural networks(CNNs) using data from a different domain when the number of training data in the test domain is small. Training a CNN for classification without enough data might lead to serious problems of overfitting and thus fail to generalize. In this case, if large data of the same object categories is available in another domain, this problem can be alleviated. We propose a method to train a CNN with small data in the test domain and large data in another. Since training a single network using data from different domains could lead to performance degradation, we consider this problem as cross-domain image similarity learning. In our experiment, we train a Siamese network to compute similarity between a pair of images from different domains, which are natural photos and 3D model projections. We design the network to output the probability that the input image pair belongs to the same category. Thus, the network can calculate similarity between the input pair and also classify a natural photo by comparing it with each images in the 3D model database. 
Since the network output represents similarity, we can greatly reduce testing time for classification compared to other methods (such as NN classification) in which distances between feature vectors must be calculated for every pair of images.","PeriodicalId":174382,"journal":{"name":"Proceedings of the 1st International Conference on Information Science and Systems","volume":"52 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 1st International Conference on Information Science and Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3209914.3209927","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
In this paper, we describe a methodology for training convolutional neural networks (CNNs) using data from a different domain when only a small amount of training data is available in the test domain. Training a CNN for classification without enough data can lead to serious overfitting, so the network fails to generalize. If a large amount of data for the same object categories is available in another domain, this problem can be alleviated. We propose a method to train a CNN with a small dataset in the test domain and a large dataset in another domain. Since training a single network on data from different domains can degrade performance, we instead pose the problem as cross-domain image similarity learning. In our experiments, we train a Siamese network to compute the similarity between a pair of images from two different domains: natural photos and 3D model projections. We design the network to output the probability that the input image pair belongs to the same category. Thus, the network can both compute the similarity of an input pair and classify a natural photo by comparing it with each image in the 3D model database. Since the network output itself represents similarity, testing time for classification is greatly reduced compared to methods (such as nearest-neighbor classification) in which distances between feature vectors must be computed for every pair of images.
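The classification scheme the abstract describes — a shared-weight Siamese branch, a head that outputs the probability that two images share a category, and classification by comparing a photo against each 3D-model projection — can be sketched as follows. This is a minimal illustration, not the paper's architecture: the convolutional branch is stood in for by a hypothetical random linear projection, and all weights are untrained placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared feature extractor applied to both inputs (Siamese weight sharing).
# A random linear map + tanh stands in for the paper's CNN branch.
W_embed = rng.standard_normal((64, 16)) * 0.1

def embed(image_vec):
    """Map a flattened image to a feature vector via the shared branch."""
    return np.tanh(image_vec @ W_embed)

# Similarity head: logistic output on the absolute feature difference,
# interpreted as P(the pair belongs to the same category).
w_head = rng.standard_normal(16) * 0.1
b_head = 0.0

def same_category_prob(photo, projection):
    diff = np.abs(embed(photo) - embed(projection))
    return 1.0 / (1.0 + np.exp(-(diff @ w_head + b_head)))

def classify(photo, db_projections):
    """Label a natural photo by its most similar 3D-model projection."""
    probs = [same_category_prob(photo, p) for p in db_projections]
    return int(np.argmax(probs)), probs

# Toy database: one projection per category (5 hypothetical categories).
db = rng.standard_normal((5, 64))
photo = rng.standard_normal(64)
label, probs = classify(photo, db)
```

Because the head emits a similarity score directly, classification only requires one forward pass per database entry, with no separate pairwise feature-distance computation — the efficiency point made in the abstract.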