Towards High-Quality and Disentangled Face Editing in a 3D GAN

Kaiwen Jiang;Shu-Yu Chen;Feng-Lin Liu;Hongbo Fu;Lin Gao
{"title":"Towards High-Quality and Disentangled Face Editing in a 3D GAN","authors":"Kaiwen Jiang;Shu-Yu Chen;Feng-Lin Liu;Hongbo Fu;Lin Gao","doi":"10.1109/TPAMI.2024.3523422","DOIUrl":null,"url":null,"abstract":"Recent methods for synthesizing 3D-aware face images have achieved rapid development thanks to neural radiance fields, allowing for high quality and fast inference speed. However, existing solutions for editing facial geometry and appearance independently usually require retraining and are not optimized for the recent work of generation, thus tending to lag behind the generation process. To address these issues, we introduce NeRFFaceEditing, which enables editing and decoupling geometry and appearance in the pretrained tri-plane-based neural radiance field while retaining its high quality and fast inference speed. Our key idea for disentanglement is to use the statistics of the tri-plane to represent the high-level appearance of its corresponding facial volume. Moreover, we leverage a generated 3D-continuous semantic mask as an intermediary for geometry editing. We devise a geometry decoder (whose output is unchanged when the appearance changes) and an appearance decoder. The geometry decoder aligns the original facial volume with the semantic mask volume. We also enhance the disentanglement by explicitly regularizing rendered images with the same appearance but different geometry to be similar in terms of color distribution for each facial component separately. Our method allows users to edit via semantic masks with decoupled control of geometry and appearance. Both qualitative and quantitative evaluations show the superior geometry and appearance control abilities of our method compared to existing and alternative solutions.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 4","pages":"2533-2544"},"PeriodicalIF":18.6000,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10829803/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Recent methods for synthesizing 3D-aware face images have achieved rapid development thanks to neural radiance fields, allowing for high quality and fast inference speed. However, existing solutions for editing facial geometry and appearance independently usually require retraining and are not optimized for the recent work of generation, thus tending to lag behind the generation process. To address these issues, we introduce NeRFFaceEditing, which enables editing and decoupling geometry and appearance in the pretrained tri-plane-based neural radiance field while retaining its high quality and fast inference speed. Our key idea for disentanglement is to use the statistics of the tri-plane to represent the high-level appearance of its corresponding facial volume. Moreover, we leverage a generated 3D-continuous semantic mask as an intermediary for geometry editing. We devise a geometry decoder (whose output is unchanged when the appearance changes) and an appearance decoder. The geometry decoder aligns the original facial volume with the semantic mask volume. We also enhance the disentanglement by explicitly regularizing rendered images with the same appearance but different geometry to be similar in terms of color distribution for each facial component separately. Our method allows users to edit via semantic masks with decoupled control of geometry and appearance. Both qualitative and quantitative evaluations show the superior geometry and appearance control abilities of our method compared to existing and alternative solutions.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
面向高质量和无纠缠的三维GAN人脸编辑
由于神经辐射场的存在,近年来合成3d感知人脸图像的方法得到了快速发展,从而实现了高质量和快速的推理速度。然而,现有的独立编辑面部几何和外观的解决方案通常需要重新训练,并且没有针对最近的生成工作进行优化,因此往往落后于生成过程。为了解决这些问题,我们引入了NeRFFaceEditing,它可以在预训练的基于三平面的神经辐射场中编辑和解耦几何和外观,同时保持其高质量和快速的推理速度。我们解开纠缠的关键思想是使用三平面的统计量来表示其相应面部体积的高级外观。此外,我们利用生成的3d连续语义掩码作为几何编辑的中介。我们设计了一个几何解码器(当外观改变时其输出不变)和一个外观解码器。几何解码器将原始面部体积与语义掩码体积对齐。我们还通过显式正则化具有相同外观但不同几何形状的渲染图像,使其在每个面部成分的颜色分布方面相似来增强解纠缠。我们的方法允许用户通过语义掩码进行编辑,并对几何和外观进行解耦控制。定性和定量评估表明,与现有和替代解决方案相比,我们的方法具有优越的几何形状和外观控制能力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Unsupervised Gaze Representation Learning by Switching Features. H2OT: Hierarchical Hourglass Tokenizer for Efficient Video Pose Transformers. MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection. Parse Trees Guided LLM Prompt Compression. Fast Multi-View Discrete Clustering Via Spectral Embedding Fusion.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1