Harnessing GAN with Metric Learning for One-Shot Generation on a Fine-Grained Category

Yusuke Ohtsubo, Tetsu Matsukawa, Einoshin Suzuki
{"title":"Harnessing GAN with Metric Learning for One-Shot Generation on a Fine-Grained Category","authors":"Yusuke Ohtsubo, Tetsu Matsukawa, Einoshin Suzuki","doi":"10.1109/ICTAI.2019.00130","DOIUrl":null,"url":null,"abstract":"We propose a GAN-based one-shot generation method on a fine-grained category, which represents a subclass of a category, typically with diverse examples. One-shot generation refers to a task of taking an image which belongs to a class not used in the training phase and then generating a set of new images belonging to the same class. Generative Adversarial Network (GAN), which represents a type of deep neural networks with competing generator and discriminator, has proven to be useful in generating realistic images. Especially DAGAN, which maps the input image to a low-dimensional space via an encoder and then back to the example space via a decoder, has been quite effective with datasets such as handwritten character datasets. However, when the class corresponds to a fine-grained category, DAGAN occasionally generates images which are regarded as belonging to other classes due to the rich variety of the examples in the class and the low dissimilarities of the examples among the classes. For example, it accidentally generates facial images of different persons when the class corresponds to a specific person. To circumvent this problem, we introduce a metric learning with a triplet loss to the bottleneck layer of DAGAN to penalize such a generation. We also extend the optimization algorithm of DAGAN to an alternating procedure for two types of loss functions. Our proposed method outperforms DAGAN in the GAN-test task for VGG-Face dataset and CompCars dataset by 5.6% and 4.8% in accuracy, respectively. We also conducted experiments for the data augmentation task and observed 4.5% higher accuracy for our proposed method over DAGAN for VGG-Face dataset.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTAI.2019.00130","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

We propose a GAN-based one-shot generation method for a fine-grained category, i.e., a subclass of a category that typically contains diverse examples. One-shot generation refers to the task of taking an image that belongs to a class not used in the training phase and then generating a set of new images belonging to the same class. A Generative Adversarial Network (GAN), a type of deep neural network with a competing generator and discriminator, has proven useful for generating realistic images. In particular, DAGAN, which maps the input image to a low-dimensional space via an encoder and then back to the example space via a decoder, has been quite effective on datasets such as handwritten character datasets. However, when the class corresponds to a fine-grained category, DAGAN occasionally generates images that are regarded as belonging to other classes, due to the rich variety of examples within a class and the low dissimilarity of examples across classes. For example, it accidentally generates facial images of different persons when the class corresponds to a specific person. To circumvent this problem, we introduce metric learning with a triplet loss at the bottleneck layer of DAGAN to penalize such generations. We also extend the optimization algorithm of DAGAN to an alternating procedure over the two types of loss functions. Our proposed method outperforms DAGAN on the GAN-test task by 5.6% and 4.8% in accuracy on the VGG-Face and CompCars datasets, respectively. We also conducted experiments on the data augmentation task and observed 4.5% higher accuracy for our proposed method over DAGAN on the VGG-Face dataset.
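The abstract only outlines the method, so the following is a minimal sketch (not the authors' released code) of the two ideas it names: a triplet loss applied to the bottleneck embeddings of a DAGAN-style encoder-decoder generator, and an alternating update between that metric-learning loss and the usual adversarial loss. The class and function names (DAGANStyleGenerator, metric_learning_step, adversarial_step), the margin value, and the loss formulations are illustrative assumptions, not details taken from the paper.

# Sketch of triplet-loss metric learning at the bottleneck of a DAGAN-style
# generator, with alternating updates for the two loss types. All names and
# hyperparameters here are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DAGANStyleGenerator(nn.Module):
    """Encoder maps an image to a bottleneck code z; the decoder maps
    (z, noise) back to image space, as in a DAGAN-style architecture."""
    def __init__(self, encoder: nn.Module, decoder: nn.Module):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, x, noise):
        z = self.encoder(x)             # bottleneck embedding
        x_gen = self.decoder(z, noise)  # generated image intended for the same class
        return x_gen, z

triplet_loss = nn.TripletMarginLoss(margin=1.0)  # margin chosen arbitrarily here

def metric_learning_step(generator, anchor, positive, negative, opt_enc):
    """Update the encoder with a triplet loss on bottleneck embeddings:
    pull same-class pairs (anchor, positive) together and push a
    different-class example (negative) away, penalizing generations
    that drift toward other classes."""
    z_a = generator.encoder(anchor)
    z_p = generator.encoder(positive)
    z_n = generator.encoder(negative)
    loss = triplet_loss(z_a, z_p, z_n)
    opt_enc.zero_grad()
    loss.backward()
    opt_enc.step()
    return loss.item()

def adversarial_step(generator, discriminator, real, noise, opt_g):
    """Update the generator with a standard adversarial loss; in an
    alternating scheme this step is interleaved with metric_learning_step."""
    fake, _ = generator(real, noise)
    target = torch.ones(fake.size(0), 1, device=fake.device)
    g_loss = F.binary_cross_entropy_with_logits(discriminator(fake), target)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return g_loss.item()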