MVGaussian: High-Fidelity Text-to-3D Content Generation with Multi-View Guidance and Surface Densification

Phu Pham, Aradhya N. Mathur, Ojaswa Sharma, Aniket Bera
arXiv - CS - Graphics · Published 2024-09-10 · DOI: arxiv-2409.06620
Citations: 0

Abstract

The field of text-to-3D content generation has made significant progress in generating realistic 3D objects, with existing methodologies like Score Distillation Sampling (SDS) offering promising guidance. However, these methods often encounter the "Janus" problem: multi-face ambiguities due to imprecise guidance. Additionally, while recent advancements in 3D Gaussian splatting have shown its efficacy in representing 3D volumes, optimization of this representation remains largely unexplored. This paper introduces a unified framework for text-to-3D content generation that addresses these critical gaps. Our approach utilizes multi-view guidance to iteratively form the structure of the 3D model, progressively enhancing detail and accuracy. We also introduce a novel densification algorithm that aligns Gaussians close to the surface, optimizing the structural integrity and fidelity of the generated models. Extensive experiments validate our approach, demonstrating that it produces high-quality visual outputs with minimal time cost. Notably, our method achieves high-quality results within half an hour of training, offering a substantial efficiency gain over most existing methods, which require hours of training time to achieve comparable results.
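The abstract's core geometric idea, pulling Gaussian splats toward the object surface so detail concentrates where the geometry lives, can be illustrated with a minimal sketch. This is not the paper's algorithm (the abstract gives no details); it is a hypothetical toy in which the surface is given as a signed distance function (SDF) and each Gaussian center within a narrow band is projected along the SDF gradient onto the zero level set:

```python
import numpy as np

def densify_toward_surface(centers, sdf, sdf_grad, step=1.0, band=0.1):
    """Illustrative sketch (not the paper's method): pull Gaussian centers
    lying within a narrow band of the surface onto the zero level set of a
    signed distance function, so splats concentrate near the geometry.

    centers  : (N, 3) array of Gaussian means
    sdf      : callable mapping (N, 3) points -> (N,) signed distances
    sdf_grad : callable mapping (N, 3) points -> (N, 3) SDF gradients
    """
    d = sdf(centers)                               # signed distance per center
    n = sdf_grad(centers)
    n = n / np.linalg.norm(n, axis=1, keepdims=True)  # unit surface normals
    near = np.abs(d) < band                        # only adjust near-surface centers
    moved = centers.copy()
    # Walk each near-surface center along its normal by its signed distance.
    moved[near] -= step * d[near, None] * n[near]
    return moved

# Toy surface: the unit sphere, with sdf(x) = |x| - 1 and grad = x / |x|.
sphere_sdf = lambda x: np.linalg.norm(x, axis=1) - 1.0
sphere_grad = lambda x: x / np.linalg.norm(x, axis=1, keepdims=True)

pts = np.array([[1.05, 0.0, 0.0],   # just outside the sphere -> snapped on
                [0.0, 0.95, 0.0],   # just inside the sphere  -> snapped on
                [3.0, 0.0, 0.0]])   # far outside the band    -> untouched
out = densify_toward_surface(pts, sphere_sdf, sphere_grad)
```

In a real splatting pipeline the surface is not known in advance, so any such step would have to use a surface estimate (e.g. from rendered depth); the sketch only conveys the geometric intent of surface-aligned densification.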