Harnessing GAN with Metric Learning for One-Shot Generation on a Fine-Grained Category

Yusuke Ohtsubo, Tetsu Matsukawa, Einoshin Suzuki
{"title":"Harnessing GAN with Metric Learning for One-Shot Generation on a Fine-Grained Category","authors":"Yusuke Ohtsubo, Tetsu Matsukawa, Einoshin Suzuki","doi":"10.1109/ICTAI.2019.00130","DOIUrl":null,"url":null,"abstract":"We propose a GAN-based one-shot generation method on a fine-grained category, which represents a subclass of a category, typically with diverse examples. One-shot generation refers to a task of taking an image which belongs to a class not used in the training phase and then generating a set of new images belonging to the same class. Generative Adversarial Network (GAN), which represents a type of deep neural networks with competing generator and discriminator, has proven to be useful in generating realistic images. Especially DAGAN, which maps the input image to a low-dimensional space via an encoder and then back to the example space via a decoder, has been quite effective with datasets such as handwritten character datasets. However, when the class corresponds to a fine-grained category, DAGAN occasionally generates images which are regarded as belonging to other classes due to the rich variety of the examples in the class and the low dissimilarities of the examples among the classes. For example, it accidentally generates facial images of different persons when the class corresponds to a specific person. To circumvent this problem, we introduce a metric learning with a triplet loss to the bottleneck layer of DAGAN to penalize such a generation. We also extend the optimization algorithm of DAGAN to an alternating procedure for two types of loss functions. Our proposed method outperforms DAGAN in the GAN-test task for VGG-Face dataset and CompCars dataset by 5.6% and 4.8% in accuracy, respectively. We also conducted experiments for the data augmentation task and observed 4.5% higher accuracy for our proposed method over DAGAN for VGG-Face dataset.","PeriodicalId":346657,"journal":{"name":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTAI.2019.00130","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

We propose a GAN-based one-shot generation method for a fine-grained category, i.e., a subclass of a category that typically contains diverse examples. One-shot generation refers to the task of taking an image that belongs to a class not used in the training phase and then generating a set of new images belonging to the same class. A Generative Adversarial Network (GAN), a type of deep neural network with a competing generator and discriminator, has proven useful for generating realistic images. In particular, DAGAN, which maps the input image to a low-dimensional space via an encoder and then back to the example space via a decoder, has been quite effective on datasets such as handwritten character datasets. However, when the class corresponds to a fine-grained category, DAGAN occasionally generates images that are regarded as belonging to other classes, due to the rich variety of examples within a class and the low dissimilarity of examples across classes. For example, it accidentally generates facial images of different persons when the class corresponds to a specific person. To circumvent this problem, we introduce metric learning with a triplet loss at the bottleneck layer of DAGAN to penalize such generations. We also extend the optimization algorithm of DAGAN to an alternating procedure over the two types of loss functions. Our proposed method outperforms DAGAN on the GAN-test task by 5.6% and 4.8% in accuracy on the VGG-Face and CompCars datasets, respectively. We also conducted experiments on the data augmentation task and observed 4.5% higher accuracy for our proposed method over DAGAN on the VGG-Face dataset.
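The abstract only outlines the method, so the following is a minimal sketch (not the authors' released code) of the two ideas it names: a triplet loss applied to the bottleneck embeddings of a DAGAN-style encoder-decoder generator, and an alternating update between that metric-learning loss and the usual adversarial loss. The class and function names (DAGANStyleGenerator, metric_learning_step, adversarial_step), the margin value, and the loss formulations are illustrative assumptions, not details taken from the paper.

# Sketch of triplet-loss metric learning at the bottleneck of a DAGAN-style
# generator, with alternating updates for the two loss types. All names and
# hyperparameters here are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DAGANStyleGenerator(nn.Module):
    """Encoder maps an image to a bottleneck code z; the decoder maps
    (z, noise) back to image space, as in a DAGAN-style architecture."""
    def __init__(self, encoder: nn.Module, decoder: nn.Module):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, x, noise):
        z = self.encoder(x)             # bottleneck embedding
        x_gen = self.decoder(z, noise)  # generated image intended for the same class
        return x_gen, z

triplet_loss = nn.TripletMarginLoss(margin=1.0)  # margin chosen arbitrarily here

def metric_learning_step(generator, anchor, positive, negative, opt_enc):
    """Update the encoder with a triplet loss on bottleneck embeddings:
    pull same-class pairs (anchor, positive) together and push a
    different-class example (negative) away, penalizing generations
    that drift toward other classes."""
    z_a = generator.encoder(anchor)
    z_p = generator.encoder(positive)
    z_n = generator.encoder(negative)
    loss = triplet_loss(z_a, z_p, z_n)
    opt_enc.zero_grad()
    loss.backward()
    opt_enc.step()
    return loss.item()

def adversarial_step(generator, discriminator, real, noise, opt_g):
    """Update the generator with a standard adversarial loss; in an
    alternating scheme this step is interleaved with metric_learning_step."""
    fake, _ = generator(real, noise)
    target = torch.ones(fake.size(0), 1, device=fake.device)
    g_loss = F.binary_cross_entropy_with_logits(discriminator(fake), target)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return g_loss.item()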