MM-NeRF: Multimodal-Guided 3D Multi-Style Transfer of Neural Radiance Field.

Zijiang Yang, Zhongwei Qiu, Chang Xu, Dongmei Fu
{"title":"MM-NeRF: Multimodal-Guided 3D Multi-Style Transfer of Neural Radiance Field.","authors":"Zijiang Yang, Zhongwei Qiu, Chang Xu, Dongmei Fu","doi":"10.1109/TVCG.2024.3476331","DOIUrl":null,"url":null,"abstract":"<p><p>3D style transfer aims to generate stylized views of 3D scenes with specified styles, which requires high-quality generating and keeping multi-view consistency. Existing methods still suffer the challenges of high-quality stylization with texture details and stylization with multimodal guidance. In this paper, we reveal that the common training method of stylization with NeRF, which generates stylized multi-view supervision by 2D style transfer models, causes the same object in supervision to show various states (color tone, details, etc.) in different views, leading NeRF to tend to smooth the texture details, further resulting in low-quality rendering for 3D multi-style transfer. To tackle these problems, we propose a novel Multimodal-guided 3D Multi-style transfer of NeRF, termed MM-NeRF. First, MM-NeRF projects multimodal guidance into a unified space to keep the multimodal styles consistency and extracts multimodal features to guide the 3D stylization. Second, a novel multi-head learning scheme is proposed to relieve the difficulty of learning multi-style transfer, and a multi-view style consistent loss is proposed to track the inconsistency of multi-view supervision data. Finally, a novel incremental learning mechanism is proposed to generalize MM-NeRF to any new style with small costs. Extensive experiments on several real-world datasets show that MM-NeRF achieves high-quality 3D multi-style stylization with multimodal guidance, and keeps multi-view consistency and style consistency between multimodal guidance.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on visualization and computer graphics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TVCG.2024.3476331","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

3D style transfer aims to generate stylized views of 3D scenes in specified styles, which requires high-quality generation while maintaining multi-view consistency. Existing methods still struggle to produce high-quality stylization with texture details and to support stylization under multimodal guidance. In this paper, we reveal that the common training scheme for stylizing NeRF, which generates stylized multi-view supervision with 2D style transfer models, causes the same object in the supervision to appear in different states (color tone, details, etc.) across views; NeRF therefore tends to smooth texture details, which in turn yields low-quality renderings for 3D multi-style transfer. To tackle these problems, we propose a novel multimodal-guided 3D multi-style transfer of NeRF, termed MM-NeRF. First, MM-NeRF projects multimodal guidance into a unified space to keep the multimodal styles consistent, and extracts multimodal features to guide the 3D stylization. Second, a novel multi-head learning scheme is proposed to ease the difficulty of learning multi-style transfer, and a multi-view style-consistency loss is proposed to address the inconsistency of the multi-view supervision data. Finally, a novel incremental learning mechanism is proposed to generalize MM-NeRF to any new style at low cost. Extensive experiments on several real-world datasets show that MM-NeRF achieves high-quality 3D multi-style stylization with multimodal guidance, while preserving multi-view consistency and style consistency across multimodal guidance.
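
The abstract does not specify how text and image styles are projected into a unified space. Below is a minimal sketch of one plausible realization that uses CLIP's joint text/image embedding as a stand-in encoder; CLIP itself, the embed_style helper, and the style_ref.jpg path are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: embed text or image style guidance into one
# shared space, so either modality can condition stylization the same
# way. CLIP is used here purely as a stand-in joint embedder.
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def embed_style(style, modality):
    """Map a style prompt (str) or a reference image (PIL.Image) to a
    unit-norm vector in the shared guidance space."""
    with torch.no_grad():
        if modality == "text":
            tokens = clip.tokenize([style]).to(device)
            feat = model.encode_text(tokens)
        else:  # modality == "image"
            image = preprocess(style).unsqueeze(0).to(device)
            feat = model.encode_image(image)
    # Normalize so text and image embeddings are directly comparable.
    return feat / feat.norm(dim=-1, keepdim=True)

# Both calls land in the same 512-dim space, so a downstream style
# head can be conditioned on either modality interchangeably.
z_text = embed_style("a watercolor painting", "text")
z_image = embed_style(Image.open("style_ref.jpg"), "image")
```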
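Likewise, the multi-view style-consistency loss is only named in the abstract, not defined. The sketch below assumes one plausible form: per-view style is summarized by AdaIN-style channel statistics of backbone features, and each view is pulled toward the cross-view mean so the 2D-stylized supervision agrees across views. The function name and feature shapes are hypothetical.

```python
# Hypothetical sketch of a multi-view style-consistency penalty: the
# abstract notes that 2D-stylized supervision renders the same object
# with different tones across views, so one plausible loss penalizes
# each view's deviation from shared style statistics.
import torch

def multi_view_style_consistency(view_feats: torch.Tensor) -> torch.Tensor:
    """view_feats: (V, C, H, W) features of V stylized views of the
    same scene under one style; style is summarized by per-channel
    mean and std (AdaIN-style statistics)."""
    mu = view_feats.mean(dim=(2, 3))       # (V, C) per-view channel means
    sigma = view_feats.std(dim=(2, 3))     # (V, C) per-view channel stds
    mu_bar = mu.mean(dim=0, keepdim=True)       # cross-view mean statistics
    sigma_bar = sigma.mean(dim=0, keepdim=True)
    # Penalize each view's deviation from the shared statistics.
    return ((mu - mu_bar) ** 2).mean() + ((sigma - sigma_bar) ** 2).mean()

loss = multi_view_style_consistency(torch.randn(8, 256, 32, 32))
```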
