VD-NeRF: Visibility-Aware Decoupled Neural Radiance Fields for View-Consistent Editing and High-Frequency Relighting

Tong Wu, Jia-Mu Sun, Yu-Kun Lai, Lin Gao
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 5, pp. 3344-3357
DOI: 10.1109/TPAMI.2025.3531417
Published: 2025-01-20
https://ieeexplore.ieee.org/document/10847302/

Abstract

Neural Radiance Fields (NeRFs) have shown promising results in novel view synthesis. While achieving state-of-the-art rendering quality, NeRF typically encodes all properties related to the geometry and appearance of a scene together in several multi-layer perceptron (MLP) networks, which hinders downstream manipulation of geometry, appearance, and illumination. Recently, researchers have attempted to edit geometry, appearance, and lighting for NeRF. However, these methods fail to render view-consistent results after the appearance of the input scene is edited. Moreover, many approaches model lighting with Spherical Gaussian (SG) or Spherical Harmonic (SH) functions, or with low-resolution environment maps; such representations struggle with high-frequency environmental relighting. While some approaches use high-resolution environment maps, jointly optimizing geometry, material, and lighting introduces additional ambiguity. To address these problems, we propose VD-NeRF, a visibility-aware approach that decouples view-independent and view-dependent appearance in the scene with a hybrid lighting representation. Specifically, we first train a signed distance function to reconstruct an explicit mesh for the input scene. A decoupled NeRF then learns to attach view-independent appearance to the reconstructed mesh by defining learnable disentangled features, representing geometry and view-independent appearance, on its vertices. We approximate lighting with an explicit learnable environment map and an implicit lighting network to support both low-frequency and high-frequency relighting. By modifying the view-independent appearance, rendered results remain consistent across viewpoints. Our method also supports high-frequency environmental relighting: the explicit environment map is replaced with a novel one, and the implicit lighting network is fitted to the novel map.
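The hybrid lighting representation described above can be sketched as follows. This is an illustrative toy, not the authors' implementation: the map resolution, the nearest-texel lookup, and the tiny two-layer network are all assumptions; in the paper both components are learnable and trained with the rest of the model.

```python
import numpy as np

class HybridLighting:
    """Toy hybrid lighting: an explicit equirectangular environment map
    (high-frequency detail) plus a small implicit network over the ray
    direction (smooth component that can be refit to a new map)."""

    def __init__(self, height=16, width=32, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        # Explicit environment map: H x W x 3 radiance texels.
        self.env_map = rng.random((height, width, 3))
        # Implicit lighting network: direction (3) -> hidden -> RGB (3).
        self.w1 = rng.standard_normal((3, hidden)) * 0.1
        self.w2 = rng.standard_normal((hidden, 3)) * 0.1

    def explicit_radiance(self, d):
        """Nearest-texel lookup for a unit direction d."""
        d = d / np.linalg.norm(d)
        theta = np.arccos(np.clip(d[2], -1.0, 1.0))   # polar angle
        phi = np.arctan2(d[1], d[0]) % (2 * np.pi)    # azimuth
        h, w, _ = self.env_map.shape
        row = min(int(theta / np.pi * h), h - 1)
        col = min(int(phi / (2 * np.pi) * w), w - 1)
        return self.env_map[row, col]

    def implicit_radiance(self, d):
        """One hidden ReLU layer, standing in for the lighting MLP."""
        hidden = np.maximum(self.w1.T @ d, 0.0)
        return self.w2.T @ hidden

    def radiance(self, d):
        # Hybrid query: sum of the explicit and implicit components.
        return self.explicit_radiance(d) + self.implicit_radiance(d)

light = HybridLighting()
rgb = light.radiance(np.array([0.0, 0.0, 1.0]))  # shape (3,)
```

Relighting in this sketch would amount to swapping `env_map` for a novel map and refitting `w1`/`w2` to it, mirroring the procedure the abstract describes.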
We further take visibility into account when rendering and decoupling the input 3D scene, which improves the quality of the decomposition and relighting results and enables further downstream applications such as scene composition, where occlusions between scenes are common. Extensive experiments show that our method achieves better editing and relighting performance, both quantitatively and qualitatively, than previous methods.
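At its core, the visibility term is an occlusion test against the reconstructed geometry. A minimal sketch, assuming a sphere-traced signed distance function stands in for the reconstructed scene (the analytic sphere SDF, the step counts, and the thresholds are illustrative, not the paper's setup):

```python
import numpy as np

def sphere_sdf(p, center=np.zeros(3), radius=1.0):
    # Analytic SDF of a unit sphere, standing in for a learned SDF.
    return np.linalg.norm(p - center) - radius

def visibility(origin, direction, sdf, t_max=10.0, eps=1e-3, max_steps=64):
    """Sphere-trace from a surface point along a direction; return 0.0
    if the ray hits geometry (occluded), else 1.0 (visible)."""
    t = 10 * eps  # small offset so the ray escapes its own surface
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < eps:        # ray entered the surface: occluded
            return 0.0
        t += d             # safe step: sphere tracing
        if t > t_max:
            break
    return 1.0

p = np.array([0.0, 0.0, 1.0])  # a point on the unit sphere
print(visibility(p, np.array([0.0, 0.0, 1.0]), sphere_sdf))   # 1.0 (away from sphere)
print(visibility(p, np.array([0.0, 0.0, -1.0]), sphere_sdf))  # 0.0 (through sphere)
```

In a relighting context such a test would weight each incident lighting direction, so that occluded directions contribute no radiance; the paper applies visibility during both rendering and decomposition.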