Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation

Slava Elizarov, Ciara Rowles, Simon Donné
arXiv - CS - Graphics, published 2024-09-05. DOI: arxiv-2409.03718
Citations: 0

Abstract

Generating high-quality 3D objects from textual descriptions remains a challenging problem due to computational cost, the scarcity of 3D data, and complex 3D representations. We introduce Geometry Image Diffusion (GIMDiffusion), a novel Text-to-3D model that utilizes geometry images to efficiently represent 3D shapes using 2D images, thereby avoiding the need for complex 3D-aware architectures. By integrating a Collaborative Control mechanism, we exploit the rich 2D priors of existing Text-to-Image models such as Stable Diffusion. This enables strong generalization even with limited 3D training data (allowing us to use only high-quality training data) as well as retaining compatibility with guidance techniques such as IPAdapter. In short, GIMDiffusion enables the generation of 3D assets at speeds comparable to current Text-to-Image models. The generated objects consist of semantically meaningful, separate parts and include internal structures, enhancing both usability and versatility.
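The abstract's central idea is that a 3D surface can be stored as an ordinary 2D image, a geometry image, where each pixel holds an (x, y, z) surface position, so a standard image diffusion model can generate it. A minimal sketch of why this works (not the paper's code; the function name and array layout are illustrative assumptions): recovering a triangle mesh from such an image only requires connecting neighboring pixels.

```python
import numpy as np

def geometry_image_to_mesh(gim):
    """Convert a geometry image (an H x W x 3 array of surface
    positions) into a triangle mesh by triangulating the regular
    pixel grid. Returns (vertices, faces)."""
    h, w, _ = gim.shape
    vertices = gim.reshape(-1, 3)          # one vertex per pixel
    idx = np.arange(h * w).reshape(h, w)   # pixel -> vertex index

    # Each 2x2 block of adjacent pixels becomes two triangles.
    tl = idx[:-1, :-1].ravel()  # top-left corners of all quads
    tr = idx[:-1, 1:].ravel()   # top-right
    bl = idx[1:, :-1].ravel()   # bottom-left
    br = idx[1:, 1:].ravel()    # bottom-right
    faces = np.concatenate([
        np.stack([tl, bl, tr], axis=1),
        np.stack([tr, bl, br], axis=1),
    ])
    return vertices, faces

# Usage: a flat 3x3 geometry image on the z=0 plane yields a
# 9-vertex, 8-triangle patch.
ys, xs = np.meshgrid(np.linspace(0, 1, 3), np.linspace(0, 1, 3),
                     indexing="ij")
gim = np.stack([xs, ys, np.zeros_like(xs)], axis=-1)
verts, faces = geometry_image_to_mesh(gim)
```

Because the mesh connectivity is implicit in the pixel grid, the generative model only needs to produce a 2D image, which is what lets GIMDiffusion reuse 2D diffusion architectures directly.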