SD-GAN: Semantic Decomposition for Face Image Synthesis with Discrete Attribute

Kangneng Zhou, Xiaobin Zhu, Daiheng Gao, Kai Lee, Xinjie Li, Xu-Cheng Yin
{"title":"SD-GAN: Semantic Decomposition for Face Image Synthesis with Discrete Attribute","authors":"Kangneng Zhou, Xiaobin Zhu, Daiheng Gao, Kai Lee, Xinjie Li, Xu-Cheng Yin","doi":"10.1145/3503161.3547791","DOIUrl":null,"url":null,"abstract":"Manipulating latent code in generative adversarial networks (GANs) for facial image synthesis mainly focuses on continuous attribute synthesis (e.g., age, pose and emotion), while discrete attribute synthesis (like face mask and eyeglasses) receives less attention. Directly applying existing works to facial discrete attributes may cause inaccurate results. In this work, we propose an innovative framework to tackle challenging facial discrete attribute synthesis via semantic decomposing, dubbed SD-GAN. To be concrete, we explicitly decompose the discrete attribute representation into two components, i.e. the semantic prior basis and offset latent representation. The semantic prior basis shows an initializing direction for manipulating face representation in the latent space. The offset latent presentation obtained by 3D-aware semantic fusion network is proposed to adjust prior basis. In addition, the fusion network integrates 3D embedding for better identity preservation and discrete attribute synthesis. The combination of prior basis and offset latent representation enable our method to synthesize photo-realistic face images with discrete attributes. Notably, we construct a large and valuable dataset MEGN (Face Mask and Eyeglasses images crawled from Google and Naver) for completing the lack of discrete attributes in the existing dataset. Extensive qualitative and quantitative experiments demonstrate the state-of-the-art performance of our method. Our code is available at an anonymous website: https://github.com/MontaEllis/SD-GAN.","PeriodicalId":412792,"journal":{"name":"Proceedings of the 30th ACM International Conference on Multimedia","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM International Conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3503161.3547791","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Manipulating latent code in generative adversarial networks (GANs) for facial image synthesis mainly focuses on continuous attribute synthesis (e.g., age, pose and emotion), while discrete attribute synthesis (like face mask and eyeglasses) receives less attention. Directly applying existing works to facial discrete attributes may cause inaccurate results. In this work, we propose an innovative framework to tackle challenging facial discrete attribute synthesis via semantic decomposing, dubbed SD-GAN. To be concrete, we explicitly decompose the discrete attribute representation into two components, i.e. the semantic prior basis and offset latent representation. The semantic prior basis shows an initializing direction for manipulating face representation in the latent space. The offset latent presentation obtained by 3D-aware semantic fusion network is proposed to adjust prior basis. In addition, the fusion network integrates 3D embedding for better identity preservation and discrete attribute synthesis. The combination of prior basis and offset latent representation enable our method to synthesize photo-realistic face images with discrete attributes. Notably, we construct a large and valuable dataset MEGN (Face Mask and Eyeglasses images crawled from Google and Naver) for completing the lack of discrete attributes in the existing dataset. Extensive qualitative and quantitative experiments demonstrate the state-of-the-art performance of our method. Our code is available at an anonymous website: https://github.com/MontaEllis/SD-GAN.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
SD-GAN:面向离散属性人脸图像合成的语义分解
在生成对抗网络(GANs)中,处理潜在代码用于人脸图像合成主要集中在连续属性合成(如年龄、姿势和情绪)上,而离散属性合成(如面具和眼镜)较少受到关注。直接将现有作品应用于面部离散属性可能会导致不准确的结果。在这项工作中,我们提出了一个创新的框架,通过语义分解来解决具有挑战性的面部离散属性合成,称为SD-GAN。具体来说,我们明确地将离散属性表示分解为语义先验基和偏移潜在表示两部分。语义先验基显示了在潜在空间中操纵人脸表征的初始化方向。利用三维感知语义融合网络得到的偏移潜表示来调整先验基础。此外,融合网络还集成了三维嵌入,以更好地保持身份和离散属性合成。将先验基与偏移潜表示相结合,使我们的方法能够合成具有离散属性的逼真人脸图像。值得注意的是,我们构建了一个大型且有价值的数据集MEGN(从Google和Naver抓取的Face Mask和Eyeglasses图像),以完成现有数据集中离散属性的缺失。广泛的定性和定量实验证明了我们的方法的最先进的性能。我们的代码可以在一个匿名网站上找到:https://github.com/MontaEllis/SD-GAN。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Adaptive Anti-Bottleneck Multi-Modal Graph Learning Network for Personalized Micro-video Recommendation Composite Photograph Harmonization with Complete Background Cues Domain-Specific Conditional Jigsaw Adaptation for Enhancing transferability and Discriminability Enabling Effective Low-Light Perception using Ubiquitous Low-Cost Visible-Light Cameras Restoration of Analog Videos Using Swin-UNet
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1