高斯对象利用高斯拼接技术从四个视角重建高质量三维物体

IF 7.8 1区 计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING ACM Transactions on Graphics Pub Date : 2024-11-19 DOI:10.1145/3687759
Chen Yang, Sikuang Li, Jiemin Fang, Ruofan Liang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian
{"title":"高斯对象利用高斯拼接技术从四个视角重建高质量三维物体","authors":"Chen Yang, Sikuang Li, Jiemin Fang, Ruofan Liang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian","doi":"10.1145/3687759","DOIUrl":null,"url":null,"abstract":"Reconstructing and rendering 3D objects from highly sparse views is of critical importance for promoting applications of 3D vision techniques and improving user experience. However, images from sparse views only contain very limited 3D information, leading to two significant challenges: 1) Difficulty in building multi-view consistency as images for matching are too few; 2) Partially omitted or highly compressed object information as view coverage is insufficient. To tackle these challenges, we propose GaussianObject, a framework to represent and render the 3D object with Gaussian splatting that achieves high rendering quality with only 4 input images. We first introduce techniques of visual hull and floater elimination, which explicitly inject structure priors into the initial optimization process to help build multi-view consistency, yielding a coarse 3D Gaussian representation. Then we construct a Gaussian repair model based on diffusion models to supplement the omitted object information, where Gaussians are further refined. We design a self-generating strategy to obtain image pairs for training the repair model. We further design a COLMAP-free variant, where pre-given accurate camera poses are not required, which achieves competitive quality and facilitates wider applications. GaussianObject is evaluated on several challenging datasets, including MipNeRF360, OmniObject3D, OpenIllumination, and our-collected unposed images, achieving superior performance from only four views and significantly outperforming previous SOTA methods.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"33 1","pages":""},"PeriodicalIF":7.8000,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"GaussianObject: High-Quality 3D Object Reconstruction from Four Views with Gaussian Splatting\",\"authors\":\"Chen Yang, Sikuang Li, Jiemin Fang, Ruofan Liang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian\",\"doi\":\"10.1145/3687759\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Reconstructing and rendering 3D objects from highly sparse views is of critical importance for promoting applications of 3D vision techniques and improving user experience. However, images from sparse views only contain very limited 3D information, leading to two significant challenges: 1) Difficulty in building multi-view consistency as images for matching are too few; 2) Partially omitted or highly compressed object information as view coverage is insufficient. To tackle these challenges, we propose GaussianObject, a framework to represent and render the 3D object with Gaussian splatting that achieves high rendering quality with only 4 input images. We first introduce techniques of visual hull and floater elimination, which explicitly inject structure priors into the initial optimization process to help build multi-view consistency, yielding a coarse 3D Gaussian representation. Then we construct a Gaussian repair model based on diffusion models to supplement the omitted object information, where Gaussians are further refined. We design a self-generating strategy to obtain image pairs for training the repair model. We further design a COLMAP-free variant, where pre-given accurate camera poses are not required, which achieves competitive quality and facilitates wider applications. GaussianObject is evaluated on several challenging datasets, including MipNeRF360, OmniObject3D, OpenIllumination, and our-collected unposed images, achieving superior performance from only four views and significantly outperforming previous SOTA methods.\",\"PeriodicalId\":50913,\"journal\":{\"name\":\"ACM Transactions on Graphics\",\"volume\":\"33 1\",\"pages\":\"\"},\"PeriodicalIF\":7.8000,\"publicationDate\":\"2024-11-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Graphics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/3687759\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Graphics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3687759","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0

摘要

从高度稀疏的视图中重建和渲染三维物体对于促进三维视觉技术的应用和改善用户体验至关重要。然而,来自稀疏视图的图像只包含非常有限的三维信息,这导致了两个重大挑战:1) 由于用于匹配的图像太少,难以建立多视图一致性;2) 由于视图覆盖范围不足,部分遗漏或高度压缩了物体信息。为了应对这些挑战,我们提出了高斯对象(GaussianObject),这是一种用高斯拼接来表示和渲染三维物体的框架,只需 4 幅输入图像就能达到很高的渲染质量。我们首先引入了视觉船体和漂浮物消除技术,将结构先验明确注入初始优化过程,以帮助建立多视图一致性,从而获得粗略的三维高斯表示。然后,我们构建一个基于扩散模型的高斯修复模型,以补充遗漏的对象信息,并在此基础上进一步完善高斯。我们设计了一种自生成策略,以获得用于训练修复模型的图像对。我们还设计了一种无 COLMAP 的变体,在这种变体中,不需要预先给定精确的相机姿势,从而获得了具有竞争力的质量,并促进了更广泛的应用。我们在几个具有挑战性的数据集上对 GaussianObject 进行了评估,其中包括 MipNeRF360、OmniObject3D、OpenIllumination 和我们收集的未摆放图像,结果显示,仅从四个视角就能获得卓越的性能,明显优于以前的 SOTA 方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
GaussianObject: High-Quality 3D Object Reconstruction from Four Views with Gaussian Splatting
Reconstructing and rendering 3D objects from highly sparse views is of critical importance for promoting applications of 3D vision techniques and improving user experience. However, images from sparse views only contain very limited 3D information, leading to two significant challenges: 1) Difficulty in building multi-view consistency as images for matching are too few; 2) Partially omitted or highly compressed object information as view coverage is insufficient. To tackle these challenges, we propose GaussianObject, a framework to represent and render the 3D object with Gaussian splatting that achieves high rendering quality with only 4 input images. We first introduce techniques of visual hull and floater elimination, which explicitly inject structure priors into the initial optimization process to help build multi-view consistency, yielding a coarse 3D Gaussian representation. Then we construct a Gaussian repair model based on diffusion models to supplement the omitted object information, where Gaussians are further refined. We design a self-generating strategy to obtain image pairs for training the repair model. We further design a COLMAP-free variant, where pre-given accurate camera poses are not required, which achieves competitive quality and facilitates wider applications. GaussianObject is evaluated on several challenging datasets, including MipNeRF360, OmniObject3D, OpenIllumination, and our-collected unposed images, achieving superior performance from only four views and significantly outperforming previous SOTA methods.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
ACM Transactions on Graphics
ACM Transactions on Graphics 工程技术-计算机:软件工程
CiteScore
14.30
自引率
25.80%
发文量
193
审稿时长
12 months
期刊介绍: ACM Transactions on Graphics (TOG) is a peer-reviewed scientific journal that aims to disseminate the latest findings of note in the field of computer graphics. It has been published since 1982 by the Association for Computing Machinery. Starting in 2003, all papers accepted for presentation at the annual SIGGRAPH conference are printed in a special summer issue of the journal.
期刊最新文献
Direct Manipulation of Procedural Implicit Surfaces 3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting Quark: Real-time, High-resolution, and General Neural View Synthesis Differentiable Owen Scrambling ELMO: Enhanced Real-time LiDAR Motion Capture through Upsampling
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1