Counterfactual image generation by disentangling data attributes with deep generative models

Communications for Statistical Applications and Methods | IF 0.5 | JCR Q4 (Statistics & Probability) | Pub Date: 2023-11-30 | DOI: 10.29220/csam.2023.30.6.589

Jieon Lim, Weonyoung Joo


Citations: 0

Abstract

Deep generative models aim to infer the underlying true data distribution, and this has led to great success in generating fake-but-realistic data. From this perspective, data attributes can be a crucial factor in the data generation process, since non-existent counterfactual samples can be generated by altering certain attributes. For example, we can generate new portrait images by flipping the gender attribute or altering the hair color attribute. This paper proposes the counterfactual disentangled variational autoencoder generative adversarial network (CDVAE-GAN), specialized for attribute-level counterfactual data generation. The proposed CDVAE-GAN combines a variational autoencoder with a generative adversarial network. Specifically, we adopt a Gaussian variational autoencoder to extract low-dimensional disentangled data features, and auxiliary Bernoulli latent variables to model the data attributes separately. We also use a generative adversarial network to generate data with high fidelity. By combining the benefits of the variational autoencoder with the additional Bernoulli latent variables and the generative adversarial network, the proposed CDVAE-GAN can control the data attributes, which enables it to produce counterfactual data. Our experimental results on the CelebA dataset show qualitatively that the samples generated by CDVAE-GAN are realistic. The quantitative results further support that the proposed model can produce data with altered attributes that deceive other machine learning classifiers.
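The attribute-flipping idea in the abstract can be illustrated with a toy sketch. It is not the authors' CDVAE-GAN: the class `ToyCDVAE`, the helper `counterfactual`, the linear encoder and decoder, the dimensions, and the random untrained weights are all assumptions made for illustration, and the adversarial discriminator is omitted entirely. The sketch only shows the data flow: encode an input into a Gaussian latent `z` and Bernoulli attribute bits `a`, flip one bit, and decode both versions from the same `z`.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, x):
    # Plain matrix-vector product on nested lists.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def init(rows, cols, rng):
    # Small random weights; a real model would learn these.
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyCDVAE:
    """Hypothetical toy model: Gaussian latent z plus Bernoulli attributes a."""

    def __init__(self, x_dim, z_dim, n_attrs, seed=0):
        self.rng = random.Random(seed)
        self.W_mu = init(z_dim, x_dim, self.rng)
        self.W_logvar = init(z_dim, x_dim, self.rng)
        self.W_attr = init(n_attrs, x_dim, self.rng)
        self.W_dec = init(x_dim, z_dim + n_attrs, self.rng)

    def encode(self, x):
        mu = matvec(self.W_mu, x)
        logvar = matvec(self.W_logvar, x)
        # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1).
        z = [m + math.exp(0.5 * lv) * self.rng.gauss(0.0, 1.0)
             for m, lv in zip(mu, logvar)]
        # One Bernoulli variable per attribute, sampled from its probability.
        probs = [sigmoid(v) for v in matvec(self.W_attr, x)]
        a = [1 if self.rng.random() < p else 0 for p in probs]
        return z, a

    def decode(self, z, a):
        # The decoder sees the features and the attribute bits together.
        return [sigmoid(v) for v in matvec(self.W_dec, z + a)]


def counterfactual(model, x, attr_index):
    """Decode x twice: with inferred attributes and with one bit flipped."""
    z, a = model.encode(x)
    a_cf = list(a)
    a_cf[attr_index] = 1 - a_cf[attr_index]
    return model.decode(z, a), model.decode(z, a_cf), a, a_cf


if __name__ == "__main__":
    model = ToyCDVAE(x_dim=8, z_dim=3, n_attrs=2, seed=0)
    recon, cf, a, a_cf = counterfactual(model, [0.5] * 8, attr_index=1)
    print("attributes:", a, "->", a_cf)
    print("max output change:", max(abs(r - c) for r, c in zip(recon, cf)))
```

Because `z` is held fixed while a single attribute bit changes, any difference between the two decoded outputs is attributable to that attribute alone, which is the sense in which the output is counterfactual. In the paper's setting, a trained decoder paired with an adversarial discriminator would take the place of this random linear map.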


Source journal

Communications for Statistical Applications and Methods (Statistics & Probability)

CiteScore: 0.90

Self-citation rate: 0.00%

About the journal: Communications for Statistical Applications and Methods (Commun. Stat. Appl. Methods, CSAM) is an official journal of the Korean Statistical Society and Korean International Statistical Society. It is an international, Open Access journal dedicated to publishing peer-reviewed, high-quality, and innovative statistical research. CSAM publishes articles on applied and methodological research in statistics and probability. It features rapid publication and broad coverage of statistical applications and methods. It welcomes papers on novel applications of statistical methodology in areas including medicine (pharmaceutical, biotechnology, medical device), business, management, economics, ecology, education, computing, engineering, operational research, biology, sociology, and earth science; papers from other areas are also considered.