VD-NeRF: Visibility-Aware Decoupled Neural Radiance Fields for View-Consistent Editing and High-Frequency Relighting

Tong Wu, Jia-Mu Sun, Yu-Kun Lai, Lin Gao
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 5, pp. 3344-3357
DOI: 10.1109/TPAMI.2025.3531417
Published: 2025-01-20
https://ieeexplore.ieee.org/document/10847302/

Abstract

Neural Radiance Fields (NeRFs) have shown promising results in novel view synthesis. While achieving state-of-the-art rendering quality, NeRF typically encodes all properties related to the geometry and appearance of a scene together in several multi-layer perceptron (MLP) networks, which hinders downstream manipulation of geometry, appearance, and illumination. Recently, researchers have attempted to edit geometry, appearance, and lighting for NeRF. However, these methods fail to render view-consistent results after the appearance of the input scene is edited. Moreover, many approaches model lighting with Spherical Gaussian (SG) or Spherical Harmonic (SH) functions, or with low-resolution environment maps; such representations struggle with high-frequency environmental relighting. While some approaches use high-resolution environment maps, jointly optimizing geometry, material, and lighting introduces additional ambiguity. To address these problems, we propose VD-NeRF, a visibility-aware approach that decouples view-independent and view-dependent appearance in the scene with a hybrid lighting representation. Specifically, we first train a signed distance function to reconstruct an explicit mesh for the input scene. A decoupled NeRF then learns to attach view-independent appearance to the reconstructed mesh by defining learnable disentangled features, representing geometry and view-independent appearance, on its vertices. We approximate lighting with an explicit learnable environment map and an implicit lighting network to support both low-frequency and high-frequency relighting. By modifying the view-independent appearance, rendered results remain consistent across viewpoints. Our method also supports high-frequency environmental relighting: the explicit environment map is replaced with a novel one, and the implicit lighting network is fitted to the novel map.
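The hybrid lighting representation described above can be sketched as follows. This is an illustrative toy, not the authors' implementation: the map resolution, the nearest-texel lookup, and the tiny two-layer network are all assumptions; in the paper both components are learnable and trained with the rest of the model.

```python
import numpy as np

class HybridLighting:
    """Toy hybrid lighting: an explicit equirectangular environment map
    (high-frequency detail) plus a small implicit network over the ray
    direction (smooth component that can be refit to a new map)."""

    def __init__(self, height=16, width=32, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        # Explicit environment map: H x W x 3 radiance texels.
        self.env_map = rng.random((height, width, 3))
        # Implicit lighting network: direction (3) -> hidden -> RGB (3).
        self.w1 = rng.standard_normal((3, hidden)) * 0.1
        self.w2 = rng.standard_normal((hidden, 3)) * 0.1

    def explicit_radiance(self, d):
        """Nearest-texel lookup for a unit direction d."""
        d = d / np.linalg.norm(d)
        theta = np.arccos(np.clip(d[2], -1.0, 1.0))   # polar angle
        phi = np.arctan2(d[1], d[0]) % (2 * np.pi)    # azimuth
        h, w, _ = self.env_map.shape
        row = min(int(theta / np.pi * h), h - 1)
        col = min(int(phi / (2 * np.pi) * w), w - 1)
        return self.env_map[row, col]

    def implicit_radiance(self, d):
        """One hidden ReLU layer, standing in for the lighting MLP."""
        hidden = np.maximum(self.w1.T @ d, 0.0)
        return self.w2.T @ hidden

    def radiance(self, d):
        # Hybrid query: sum of the explicit and implicit components.
        return self.explicit_radiance(d) + self.implicit_radiance(d)

light = HybridLighting()
rgb = light.radiance(np.array([0.0, 0.0, 1.0]))  # shape (3,)
```

Relighting in this sketch would amount to swapping `env_map` for a novel map and refitting `w1`/`w2` to it, mirroring the procedure the abstract describes.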
We further take visibility into account when rendering and decoupling the input 3D scene, which improves the quality of the decomposition and relighting results and enables further downstream applications such as scene composition, where occlusions between scenes are common. Extensive experiments show that our method achieves better editing and relighting performance, both quantitatively and qualitatively, than previous methods.
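At its core, the visibility term is an occlusion test against the reconstructed geometry. A minimal sketch, assuming a sphere-traced signed distance function stands in for the reconstructed scene (the analytic sphere SDF, the step counts, and the thresholds are illustrative, not the paper's setup):

```python
import numpy as np

def sphere_sdf(p, center=np.zeros(3), radius=1.0):
    # Analytic SDF of a unit sphere, standing in for a learned SDF.
    return np.linalg.norm(p - center) - radius

def visibility(origin, direction, sdf, t_max=10.0, eps=1e-3, max_steps=64):
    """Sphere-trace from a surface point along a direction; return 0.0
    if the ray hits geometry (occluded), else 1.0 (visible)."""
    t = 10 * eps  # small offset so the ray escapes its own surface
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < eps:        # ray entered the surface: occluded
            return 0.0
        t += d             # safe step: sphere tracing
        if t > t_max:
            break
    return 1.0

p = np.array([0.0, 0.0, 1.0])  # a point on the unit sphere
print(visibility(p, np.array([0.0, 0.0, 1.0]), sphere_sdf))   # 1.0 (away from sphere)
print(visibility(p, np.array([0.0, 0.0, -1.0]), sphere_sdf))  # 0.0 (through sphere)
```

In a relighting context such a test would weight each incident lighting direction, so that occluded directions contribute no radiance; the paper applies visibility during both rendering and decomposition.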