Hou In Ivan Tam, Hou In Derek Pun, Austin T. Wang, Angel X. Chang, Manolis Savva
SceneMotifCoder: Example-driven Visual Program Learning for Generating 3D Object Arrangements
arXiv - CS - Graphics, published 2024-08-05. DOI: arxiv-2408.02211 (https://doi.org/arxiv-2408.02211)
Abstract
Despite advances in text-to-3D generation methods, generation of multi-object
arrangements remains challenging. Current methods can fail to
generate physically plausible arrangements that respect the provided text
description. We present SceneMotifCoder (SMC), an example-driven framework for
generating 3D object arrangements through visual program learning. SMC
leverages large language models (LLMs) and program synthesis to overcome these
challenges by learning visual programs from example arrangements. These
programs are generalized into compact, editable meta-programs. When combined
with 3D object retrieval and geometry-aware optimization, they can be used to
create object arrangements varying in arrangement structure and contained
objects. Our experiments show that SMC generates high-quality arrangements
using meta-programs learned from few examples. Evaluation results demonstrate
that object arrangements generated by SMC better conform to user-specified text
descriptions and are more physically plausible when compared with
state-of-the-art text-to-3D generation and layout methods.
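
To make the idea of a compact, editable meta-program concrete, the following is a minimal illustrative sketch in Python. It is not SMC's actual program format, which the abstract does not specify. The names `Placement`, `stack`, `bounds`, and `overlaps` are hypothetical, and the bounding-box check is only a simple stand-in for the geometry-aware optimization mentioned above.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Placement:
    """One object in an arrangement (hypothetical structure, not from the paper)."""
    label: str       # object category, usable later as a retrieval query
    position: tuple  # (x, y, z) of the center of the object's base; y points up
    size: tuple      # (width, height, depth) of the object's bounding box


def stack(label, count, size):
    """Hypothetical meta-program: a vertical stack of `count` identical objects.

    The arguments are the editable parameters; changing them yields a
    different arrangement with the same structure.
    """
    _, height, _ = size
    return [Placement(label, (0.0, i * height, 0.0), size) for i in range(count)]


def bounds(p):
    """Axis-aligned (low, high) interval of a placement along x, y, and z."""
    (x, y, z), (w, h, d) = p.position, p.size
    return ((x - w / 2, x + w / 2), (y, y + h), (z - d / 2, z + d / 2))


def overlaps(a, b, eps=1e-9):
    """True if two bounding boxes interpenetrate; touching faces are allowed."""
    return all(
        min(a_hi, b_hi) - max(a_lo, b_lo) > eps
        for (a_lo, a_hi), (b_lo, b_hi) in zip(bounds(a), bounds(b))
    )


# Instantiate the same meta-program with different parameters.
plates = stack("plate", count=4, size=(0.25, 0.02, 0.25))
books = stack("book", count=3, size=(0.15, 0.03, 0.22))

# A crude plausibility check: no two objects in a stack interpenetrate.
plausible = not any(overlaps(a, b) for a, b in combinations(plates, 2))
```

In this sketch, a single parameterized function covers arrangements that differ in both object type and count, which is the sense in which a meta-program generalizes over the examples it was learned from.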