GS-Net: Generalizable Plug-and-Play 3D Gaussian Splatting Module

Yichen Zhang, Zihan Wang, Jiali Han, Peilin Li, Jiaxun Zhang, Jianqiang Wang, Lei He, Keqiang Li
arXiv - CS - Computer Vision and Pattern Recognition · Published 2024-09-17 · DOI: arxiv-2409.11307 · Citations: 0

Abstract

3D Gaussian Splatting (3DGS) integrates the strengths of primitive-based representations and volumetric rendering techniques, enabling real-time, high-quality rendering. However, 3DGS models typically overfit to single-scene training and are highly sensitive to the initialization of Gaussian ellipsoids, heuristically derived from Structure from Motion (SfM) point clouds, which limits both generalization and practicality. To address these limitations, we propose GS-Net, a generalizable, plug-and-play 3DGS module that densifies Gaussian ellipsoids from sparse SfM point clouds, enhancing geometric structure representation. To the best of our knowledge, GS-Net is the first plug-and-play 3DGS module with cross-scene generalization capabilities. Additionally, we introduce the CARLA-NVS dataset, which incorporates additional camera viewpoints to thoroughly evaluate reconstruction and rendering quality. Extensive experiments demonstrate that applying GS-Net to 3DGS yields a PSNR improvement of 2.08 dB for conventional viewpoints and 1.86 dB for novel viewpoints, confirming the method's effectiveness and robustness.
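The reported gains are stated in PSNR. As a reminder of what a 2.08 dB improvement implies about reconstruction error, here is the standard PSNR definition in a minimal sketch (this is generic metric code, not code from the paper):

```python
import numpy as np

def psnr(ref, img, max_val=1.0):
    """Peak signal-to-noise ratio (dB) between two images in [0, max_val]."""
    mse = np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Since PSNR = 10 * log10(max_val^2 / MSE), a +2.08 dB gain means the MSE
# shrinks by a factor of 10 ** (2.08 / 10) ≈ 1.61, i.e. roughly 38% less
# squared reconstruction error at the same dynamic range.
```

Because PSNR is logarithmic in MSE, the 2.08 dB (conventional viewpoints) and 1.86 dB (novel viewpoints) improvements both correspond to substantial relative reductions in pixel-wise error.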