Conditional synthetic food image generation

Wenjin Fu, Yue Han, Jiangpeng He, Sriram Baireddy, Mridul Gupta, Fengqing Zhu
{"title":"Conditional synthetic food image generation","authors":"Wenjin Fu, Yue Han, Jiangpeng He, Sriram Baireddy, Mridul Gupta, Fengqing Zhu","doi":"10.2352/ei.2023.35.7.image-268","DOIUrl":null,"url":null,"abstract":"Generative Adversarial Networks (GAN) have been widely investigated for image synthesis based on their powerful representation learning ability. In this work, we explore the StyleGAN and its application of synthetic food image generation. Despite the impressive performance of GAN for natural image generation, food images suffer from high intra-class diversity and inter-class similarity, resulting in overfitting and visual artifacts for synthetic images. Therefore, we aim to explore the capability and improve the performance of GAN methods for food image generation. Specifically, we first choose StyleGAN3 as the baseline method to generate synthetic food images and analyze the performance. Then, we identify two issues that can cause performance degradation on food images during the training phase: (1) inter-class feature entanglement during multi-food classes training and (2) loss of high-resolution detail during image downsampling. To address both issues, we propose to train one food category at a time to avoid feature entanglement and leverage image patches cropped from high-resolution datasets to retain fine details. We evaluate our method on the Food-101 dataset and show improved quality of generated synthetic food images compared with the baseline. Finally, we demonstrate the great potential of improving the performance of downstream tasks, such as food image classification by including high-quality synthetic training samples in the data augmentation.","PeriodicalId":73514,"journal":{"name":"IS&T International Symposium on Electronic Imaging","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IS&T International Symposium on Electronic Imaging","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2352/ei.2023.35.7.image-268","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Generative Adversarial Networks (GANs) have been widely investigated for image synthesis because of their powerful representation learning ability. In this work, we explore StyleGAN and its application to synthetic food image generation. Despite the impressive performance of GANs for natural image generation, food images suffer from high intra-class diversity and inter-class similarity, resulting in overfitting and visual artifacts in synthetic images. Therefore, we aim to explore the capability of GAN methods for food image generation and improve their performance. Specifically, we first choose StyleGAN3 as the baseline method to generate synthetic food images and analyze its performance. Then, we identify two issues that can cause performance degradation on food images during the training phase: (1) inter-class feature entanglement when training on multiple food classes and (2) loss of high-resolution detail during image downsampling. To address both issues, we propose to train on one food category at a time to avoid feature entanglement and to leverage image patches cropped from high-resolution datasets to retain fine details. We evaluate our method on the Food-101 dataset and show improved quality of the generated synthetic food images compared with the baseline. Finally, we demonstrate the great potential for improving the performance of downstream tasks, such as food image classification, by including high-quality synthetic training samples in the data augmentation.
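
As a rough illustration of the per-class, patch-based data preparation described in the abstract, the Python sketch below randomly crops fixed-size patches from high-resolution images of a single food category, so that each category can be used to train its own StyleGAN3 model. This is a minimal sketch under stated assumptions, not the authors' implementation: the `crop_patches` helper, the directory layout, the patch size, and the number of patches per image are all illustrative choices.

```python
# Minimal sketch (assumed, not the authors' code): crop square patches from
# high-resolution images of ONE food category, reflecting the paper's two
# ideas of (1) training one class at a time and (2) using high-resolution
# patches to retain fine detail.
import random
from pathlib import Path

from PIL import Image


def crop_patches(src_dir, dst_dir, patch_size=256, patches_per_image=4, seed=0):
    """Randomly crop square patches from every JPEG image in src_dir."""
    random.seed(seed)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for img_path in sorted(Path(src_dir).glob("*.jpg")):
        img = Image.open(img_path).convert("RGB")
        w, h = img.size
        if w < patch_size or h < patch_size:
            continue  # skip images smaller than the patch size
        for k in range(patches_per_image):
            left = random.randint(0, w - patch_size)
            top = random.randint(0, h - patch_size)
            patch = img.crop((left, top, left + patch_size, top + patch_size))
            patch.save(dst / f"{img_path.stem}_patch{k}.png")


# Hypothetical usage: build a patch dataset for a single category (e.g. "pizza");
# each category's patch folder would then feed a separate GAN training run.
crop_patches("food-101/images/pizza", "patches/pizza")
```

The resulting per-category patch folders keep classes separated, which is how the abstract's single-class training avoids inter-class feature entanglement, while cropping (rather than downsampling) preserves high-frequency texture detail.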