An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion
Xingguang Yan, Han-Hung Lee, Ziyu Wan, Angel X. Chang
arXiv - CS - Graphics, 2024-08-06, arxiv-2408.03178
We introduce a new approach for generating realistic 3D models with UV maps
through a representation termed "Object Images." This approach encapsulates
surface geometry, appearance, and patch structures within a 64x64 pixel image,
effectively converting complex 3D shapes into a more manageable 2D format. By
doing so, we address the challenges of both geometric and semantic irregularity
inherent in polygonal meshes. This method allows us to use image generation
models, such as Diffusion Transformers, directly for 3D shape generation.
Evaluated on the ABO dataset, our generated shapes with patch structures
achieve point cloud FID comparable to recent 3D generative models, while
naturally supporting PBR material generation.
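
To make the idea of storing a shape in an image concrete, the following is a minimal sketch of how a small image holding surface positions could be turned back into a triangle mesh with UV coordinates. It is not the authors' implementation: the function name `object_image_to_mesh`, the channel layout (an XYZ position per pixel plus a separate validity mask), and the rule of connecting neighboring valid pixels into triangles are all assumptions chosen for illustration.

```python
import numpy as np

def object_image_to_mesh(xyz, mask):
    """Convert a hypothetical object image into a triangle mesh.

    xyz:  (H, W, 3) float array, a 3D surface position per pixel (assumed layout).
    mask: (H, W) bool array, True where a pixel belongs to a surface patch.
    Returns (vertices, uvs, faces).
    """
    h, w = mask.shape

    # Give every valid pixel a vertex index in row-major order; -1 marks empty pixels.
    index = -np.ones((h, w), dtype=np.int64)
    index[mask] = np.arange(mask.sum())
    vertices = xyz[mask]

    # The pixel grid doubles as the UV map: use normalized pixel centers.
    rows, cols = np.nonzero(mask)
    uvs = np.stack([(cols + 0.5) / w, (rows + 0.5) / h], axis=1)

    # Emit two triangles for every 2x2 block whose four pixels are all valid.
    faces = []
    for r in range(h - 1):
        for c in range(w - 1):
            a, b = index[r, c], index[r, c + 1]
            d, e = index[r + 1, c], index[r + 1, c + 1]
            if min(a, b, d, e) >= 0:
                faces.append((a, b, d))
                faces.append((b, e, d))
    faces = np.array(faces, dtype=np.int64).reshape(-1, 3)
    return vertices, uvs, faces


# Toy input: a 64x64 image containing one flat 8x8 patch (made-up data).
mask = np.zeros((64, 64), dtype=bool)
mask[4:12, 4:12] = True
grid_r, grid_c = np.meshgrid(np.arange(64), np.arange(64), indexing="ij")
xyz = np.stack([grid_c / 63.0, grid_r / 63.0, np.zeros((64, 64))], axis=-1)

vertices, uvs, faces = object_image_to_mesh(xyz, mask)
print(vertices.shape, uvs.shape, faces.shape)
```

Because each patch occupies a contiguous block of pixels in this sketch, the mask alone separates patches from empty space, and appearance channels such as material maps could be sampled at the same pixel locations. A generative model would then only need to produce the image; the mesh follows from a fixed decoding step like the one above.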