Cross fields play a critical role in various geometry processing tasks, especially quad mesh generation. Existing methods for cross field generation often struggle to balance computational efficiency with generation quality, typically relying on slow per-shape optimization. We introduce CrossGen, a novel framework that supports both feed-forward prediction and latent generative modeling of cross fields for quad meshing by unifying geometry and cross field representations within a joint latent space. Our method enables extremely fast computation of high-quality cross fields for general input shapes, typically within one second and without per-shape optimization. Our method assumes a point-sampled surface, also called a point-cloud surface, as input, so we can accommodate various surface representations through a straightforward point sampling process. Using an auto-encoder network architecture, we encode input point-cloud surfaces into a sparse voxel grid with fine-grained latent spaces, which are decoded into both SDF-based surface geometry and cross fields (see the teaser figure). We also contribute a dataset of models with both high-quality signed distance field (SDF) representations and their corresponding cross fields, and use it to train our network. Once trained, the network is capable of computing a cross field of an input surface in a feed-forward manner, ensuring high geometric fidelity, noise resilience, and rapid inference. Furthermore, leveraging the same unified latent representation, we incorporate a diffusion model for computing cross fields of new shapes generated from partial input, such as sketches. To demonstrate its practical applications, we validate CrossGen on the quad mesh generation task for a large variety of surface shapes. Experimental results demonstrate that CrossGen generalizes well across diverse shapes and consistently yields high-fidelity cross fields, thus facilitating the generation of high-quality quad meshes.
CrossGen: Learning and Generating Cross Fields for Quad Meshing. Qiujie Dong, Jiepeng Wang, Rui Xu, Cheng Lin, Yuan Liu, Shiqing Xin, Zichun Zhong, Xin Li, Changhe Tu, Taku Komura, Leif Kobbelt, Scott Schaefer, Wenping Wang. ACM Transactions on Graphics, 2025-12-04. https://doi.org/10.1145/3763299
Otman Benchekroun, Eitan Grinspun, Maurizio Chiaramonte, Philip Allen Etter
Designing subspaces for Reduced Order Modeling (ROM) is crucial for accelerating finite element simulations in graphics and engineering. Unfortunately, it is not always clear which subspace is optimal for an arbitrary dynamic simulation. We propose to construct simulation subspaces from force distributions, allowing us to tailor such subspaces to common scene interactions involving constraint penalties, handle-based control, contact, and musculoskeletal actuation. To achieve this, we adopt a statistical perspective on Reduced Order Modeling, which allows us to push such user-designed force distributions through a linearized simulation to obtain a dual distribution on displacements. To construct our subspace, we then fit a low-rank Gaussian model to this displacement distribution, which we show generalizes Linear Modal Analysis subspaces for uncorrelated unit-variance force distributions, as well as Green's Function subspaces for low-rank force distributions. We show that our framework allows for the construction of subspaces that are optimal with respect to both physical material properties and arbitrary force distributions, as observed in handle-based, contact, and musculoskeletal scene interactions.
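The statistical view in the abstract above can be illustrated with a short worked derivation. This is only a sketch under assumptions that are not stated in the abstract: a static linearization with a symmetric positive definite stiffness matrix K, zero-mean Gaussian forces f with covariance Σ_f, and an identity mass matrix. The symbols K, Σ_f, Σ_u, F, and r are introduced here for illustration, and the paper's actual formulation may include mass or time-stepping terms.

```latex
\begin{aligned}
u &= K^{-1} f, \qquad f \sim \mathcal{N}(0, \Sigma_f)
  && \text{(assumed static linearization)} \\
\Sigma_u &= K^{-1}\, \Sigma_f\, K^{-\top}
  && \text{(displacement covariance)} \\
\Sigma_f = I \;&\Rightarrow\; \Sigma_u = K^{-2}
  && \text{(top-$r$ eigenvectors of $\Sigma_u$ are the $r$ lowest-eigenvalue modes of $K$)} \\
\Sigma_f = F F^{\top},\; F \in \mathbb{R}^{n \times r} \;&\Rightarrow\; \Sigma_u = (K^{-1} F)(K^{-1} F)^{\top}
  && \text{(range of $\Sigma_u$ is $\operatorname{span}(K^{-1} F)$)}
\end{aligned}
```

Under these assumptions, a rank-r Gaussian fit keeps the r dominant eigenvectors of Σ_u. The identity-covariance case then recovers the softest modes of K, and the low-rank case recovers the static responses to the columns of F, which is consistent with the two special cases named in the abstract.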
Force-Dual Modes: Subspace Design from Stochastic Forces. Otman Benchekroun, Eitan Grinspun, Maurizio Chiaramonte, Philip Allen Etter. ACM Transactions on Graphics, 2025-12-04. https://doi.org/10.1145/3763310
Sanjeev Muralikrishnan, Niladri Shekhar Dutt, Niloy J. Mitra
Animation retargeting applies a sparse motion description (e.g., keypoint sequences) to a character mesh to produce a semantically plausible and temporally coherent full-body mesh sequence. Existing approaches come with restrictions: they require access to template-based shape priors or artist-designed deformation rigs, suffer from limited generalization to unseen motion and/or shapes, or exhibit motion jitter. We propose Self-supervised Motion Fields (SMF), a self-supervised framework that is trained with only sparse motion representations, without requiring dataset-specific annotations, templates, or rigs. At the heart of our method are Kinetic Codes, a novel autoencoder-based sparse motion encoding that exposes a semantically rich latent space, simplifying large-scale training. Our architecture comprises dedicated spatial and temporal gradient predictors, which are jointly trained in an end-to-end fashion. The combined network, regularized by the Kinetic Codes' latent space, generalizes well across both unseen shapes and new motions. We evaluated our method on unseen motion sampled from AMASS, D4D, Mixamo, and raw monocular video for animation transfer on various characters with varying shapes and topology. We report a new state of the art on the AMASS dataset in the context of generalization to unseen motion.
SMF: Template-free and Rig-free Animation Transfer using Kinetic Codes. Sanjeev Muralikrishnan, Niladri Shekhar Dutt, Niloy J. Mitra. ACM Transactions on Graphics, 2025-12-04. https://doi.org/10.1145/3763309
Recent advances in text-to-image models have enabled a new era of creative and controllable image generation. However, generating compositional scenes with multiple subjects and attributes remains a significant challenge. To enhance user control over subject placement, several layout-guided methods have been proposed. However, these methods face numerous challenges, particularly in compositional scenes. Unintended subjects often appear outside the layouts, generated images can be out-of-distribution and contain unnatural artifacts, or attributes bleed across subjects, leading to incorrect visual outputs. In this work, we propose MALeR, a method that addresses each of these challenges. Given a text prompt and corresponding layouts, our method prevents subjects from appearing outside the given layouts while keeping the generated images in-distribution. Additionally, we propose a masked, attribute-aware binding mechanism that prevents attribute leakage, enabling accurate rendering of subjects with multiple attributes, even in complex compositional scenes. Qualitative and quantitative evaluations demonstrate that our method achieves superior performance in compositional accuracy, generation consistency, and attribute binding compared to previous work. MALeR is particularly adept at generating images of scenes with multiple subjects and multiple attributes per subject.
MALeR: Improving Compositional Fidelity in Layout-Guided Generation. Shivank Saxena, Dhruv Srivastava, Makarand Tapaswi. ACM Transactions on Graphics, 2025-12-04. https://doi.org/10.1145/3763341
Christian Stippel, Felix Mujkanovic, Thomas Leimkühler, Pedro Hermosilla
Accurate surface geometry representation is crucial in 3D visual computing. Explicit representations, such as polygonal meshes, and implicit representations, like signed distance functions, each have distinct advantages, making efficient conversions between them increasingly important. Conventional surface extraction methods for implicit representations, such as the widely used Marching Cubes algorithm, rely on spatial decomposition and sampling, leading to inaccuracies due to fixed and limited resolution. We introduce a novel approach for analytically extracting surfaces from neural implicit functions. Our method operates natively in parallel and can navigate large neural architectures. By leveraging the fact that each neuron partitions the domain, we develop a depth-first traversal strategy to efficiently track the encoded surface. The resulting meshes faithfully capture the full geometric information from the network without ad-hoc spatial discretization, achieving unprecedented accuracy across diverse shapes and network architectures while maintaining competitive speed.
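The observation that each neuron partitions the domain can be illustrated with a toy example. The sketch below is not the paper's algorithm (it has no depth-first traversal and no parallelism). It assumes ReLU activations, which the abstract does not state, and uses a hypothetical two-neuron network whose weights `W1`, `b1`, `w2`, `b2` are made up for illustration. It only shows that inside a region with a fixed activation pattern the network is affine, so a zero crossing on a segment within that region can be located exactly rather than by grid sampling.

```python
# Toy ReLU network f(x, y) = relu(x) + relu(y) - 1 (hypothetical weights).
W1 = [[1.0, 0.0], [0.0, 1.0]]  # hidden-layer weights, one row per neuron
b1 = [0.0, 0.0]                # hidden-layer biases
w2 = [1.0, 1.0]                # output weights
b2 = -1.0                      # output bias


def pre_activations(x):
    """Hidden-layer values before the ReLU."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]


def network(x):
    """Evaluate the scalar implicit function at point x."""
    return sum(w * max(0.0, z) for w, z in zip(w2, pre_activations(x))) + b2


def activation_pattern(x):
    """Which neurons are active at x; each neuron splits the domain in two."""
    return tuple(z > 0.0 for z in pre_activations(x))


def local_affine(pattern):
    """Gradient g and offset c with network(x) == g.x + c inside this region."""
    active = [j for j in range(len(w2)) if pattern[j]]
    g = [sum(w2[j] * W1[j][i] for j in active) for i in range(len(W1[0]))]
    c = sum(w2[j] * b1[j] for j in active) + b2
    return g, c


def zero_on_segment(p, q):
    """Exact zero crossing on segment p->q, valid only within one region."""
    if activation_pattern(p) != activation_pattern(q):
        raise ValueError("segment crosses a neuron boundary; split it first")
    fp, fq = network(p), network(q)
    if fp * fq > 0.0:
        return None  # no sign change, so no crossing on this segment
    if fp == fq:
        return list(p)  # both endpoints lie on the surface
    t = fp / (fp - fq)  # exact because the function is affine on the segment
    return [pi + t * (qi - pi) for pi, qi in zip(p, q)]
```

For example, `zero_on_segment([0.2, 0.2], [1.0, 0.6])` returns a point close to `[0.6, 0.4]`, where the toy network evaluates to zero up to floating-point rounding. A region with a fixed activation pattern is an intersection of half-spaces and therefore convex, which is why checking only the two endpoints is enough in this sketch.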
Marching Neurons: Accurate Surface Extraction for Neural Implicit Shapes. Christian Stippel, Felix Mujkanovic, Thomas Leimkühler, Pedro Hermosilla. ACM Transactions on Graphics, 2025-12-04. https://doi.org/10.1145/3763328
A core operation in Monte Carlo volume rendering is transmittance estimation: Given a segment along a ray, the goal is to estimate the fraction of light that will pass through this segment without encountering absorption or out-scattering. A naive approach is to estimate optical depth τ using unbiased ray marching and to then use exp(-τ) as the transmittance estimate. However, this strategy systematically overestimates transmittance due to Jensen's inequality. On the other hand, existing unbiased transmittance estimators either suffer from high variance or have a cost governed by random decisions, which makes them less suitable for SIMD architectures. We propose a biased transmittance estimator with significantly reduced bias compared to the naive approach and a deterministic, low cost. We observe that ray marching with stratified jittered sampling results in estimates of optical depth that are nearly normally distributed. We then apply the unique minimum variance unbiased (UMVU) estimator of exp(-τ) based on two such estimates (using two different sets of random numbers). Bias only arises from violations of the assumption of normally distributed inputs. We further reduce bias and variance using a variance-aware importance sampling scheme. The underlying theory can be used to estimate any analytic function of optical depth. We use this generalization to estimate multiple importance sampling (MIS) weights and introduce two integrators: unbiased MIS with biased MIS weights, and a more efficient but biased combination of MIS and transmittance estimation.
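The Jensen bias and its correction can be checked numerically. The sketch below assumes the two optical depth estimates are independent and exactly normal with the same mean τ and the same unknown variance. Under that assumption, exp(-mean) · cos(half-difference) is an unbiased estimate of exp(-τ), because the mean and the difference of two such samples are independent and E[cos(d)] = exp(-Var(d)/2) for a zero-mean normal d. This closed form is derived here for illustration and is not quoted from the paper, so the paper's exact estimator may be written differently. All function names and the default parameter values are hypothetical.

```python
import math
import random


def naive_estimate(tau_a, tau_b):
    """exp(-mean) of two optical depth estimates; biased upward by Jensen."""
    return math.exp(-0.5 * (tau_a + tau_b))


def two_sample_estimate(tau_a, tau_b):
    """Unbiased for exp(-tau) if both inputs are i.i.d. normal (assumption)."""
    mean = 0.5 * (tau_a + tau_b)
    half_diff = 0.5 * (tau_a - tau_b)
    # cos(half_diff) has expectation exp(-sigma^2 / 4), which cancels the
    # exp(+sigma^2 / 4) inflation of E[exp(-mean)].
    return math.exp(-mean) * math.cos(half_diff)


def simulate(tau=1.0, sigma=0.5, trials=200_000, seed=0):
    """Average both estimators over synthetic normal optical depth samples."""
    rng = random.Random(seed)
    naive_sum = 0.0
    corrected_sum = 0.0
    for _ in range(trials):
        a = rng.gauss(tau, sigma)
        b = rng.gauss(tau, sigma)
        naive_sum += naive_estimate(a, b)
        corrected_sum += two_sample_estimate(a, b)
    return naive_sum / trials, corrected_sum / trials, math.exp(-tau)


if __name__ == "__main__":
    naive, corrected, truth = simulate()
    print(f"naive={naive:.4f} corrected={corrected:.4f} truth={truth:.4f}")
```

With these synthetic inputs, the naive average settles noticeably above exp(-τ) while the two-sample average stays close to it. Both estimators cost two optical depth evaluations and involve no random control flow, which matches the deterministic-cost motivation in the abstract. Note that the corrected estimate can become negative when the two inputs differ by more than π.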
Jackknife Transmittance and MIS Weight Estimation. Christoph Peters. ACM Transactions on Graphics, 2025-12-04. https://doi.org/10.1145/3763273
Antoine Guédon, Diego Gomez, Nissim Maruani, Bingchen Gong, George Drettakis, Maks Ovsjanikov
While recent advances in Gaussian Splatting have enabled fast reconstruction of high-quality 3D scenes from images, extracting accurate surface meshes remains a challenge. Current approaches extract the surface through costly post-processing steps, resulting in the loss of fine geometric details or requiring significant time and leading to very dense meshes with millions of vertices. More fundamentally, the a posteriori conversion from a volumetric to a surface representation limits the ability of the final mesh to preserve all geometric structures captured during training. We present MILo, a novel Gaussian Splatting framework that bridges the gap between volumetric and surface representations by differentiably extracting a mesh from the 3D Gaussians. We design a fully differentiable procedure that constructs the mesh—including both vertex locations and connectivity—at every iteration directly from the parameters of the Gaussians, which are the only quantities optimized during training. Our method introduces three key technical contributions: (1) a bidirectional consistency framework ensuring both representations—Gaussians and the extracted mesh—capture the same underlying geometry during training; (2) an adaptive mesh extraction process performed at each training iteration, which uses Gaussians as differentiable pivots for Delaunay triangulation; (3) a novel method for computing signed distance values from the 3D Gaussians that enables precise surface extraction while avoiding geometric erosion. Our approach can reconstruct complete scenes, including backgrounds, with state-of-the-art quality while requiring an order of magnitude fewer mesh vertices than previous methods. Due to their light weight and empty interior, our meshes are well suited for downstream applications such as physics simulations and animation. The code for our approach and an online gallery are available at https://anttwo.github.io/milo/.
MILo: Mesh-In-the-Loop Gaussian Splatting for Detailed and Efficient Surface Reconstruction. Antoine Guédon, Diego Gomez, Nissim Maruani, Bingchen Gong, George Drettakis, Maks Ovsjanikov. ACM Transactions on Graphics, 2025-12-04. https://doi.org/10.1145/3763339
Generating artistic and coherent 3D scene layouts is crucial in digital content creation. Traditional optimization-based methods are often constrained by cumbersome manual rules, while deep generative models face challenges in producing content with richness and diversity. Furthermore, approaches that utilize large language models frequently lack robustness and fail to accurately capture complex spatial relationships. To address these challenges, this paper presents a novel vision-guided 3D layout generation system. We first construct a high-quality asset library containing 2,037 scene assets and 147 3D scene layouts. Subsequently, we employ an image generation model to expand prompt representations into images, fine-tuning it to align with our asset library. We then develop a robust image parsing module to recover the 3D layout of scenes based on visual semantics and geometric information. Finally, we optimize the scene layout using scene graphs and overall visual semantics to ensure logical coherence and alignment with the images. Extensive user testing demonstrates that our algorithm significantly outperforms existing methods in terms of layout richness and quality. The code and dataset will be available at https://github.com/HiHiAllen/Imaginarium.
Imaginarium: Vision-guided High-Quality 3D Scene Layout Generation. Xiaoming Zhu, Xu Huang, Qinghongbing Xie, Zhi Deng, Junsheng Yu, Yirui Guan, Zhongyuan Liu, Lin Zhu, Qijun Zhao, Ligang Liu, Long Zeng. ACM Transactions on Graphics, 2025-12-04. https://doi.org/10.1145/3763353
Aditya Ganeshan, Kurt Fleischer, Wenzel Jakob, Ariel Shamir, Daniel Ritchie, Takeo Igarashi, Maria Larsson
Traditional integral wood joints, despite their strength, durability, and elegance, remain rare in modern workflows due to the cost and difficulty of manual fabrication. CNC milling offers a scalable alternative, but directly milling traditional joints often fails to produce functional results because milling induces geometric deviations, such as rounded inner corners, that alter the target geometries of the parts. Since joints rely on tightly fitting surfaces, such deviations introduce gaps or overlaps that undermine fit or block assembly. We propose to overcome this problem by (1) designing a language that represents millable geometry, and (2) co-optimizing part geometries to restore coupling. We introduce Millable Extrusion Geometry (MXG), a language for representing geometry as the outcome of milling operations performed with flat-end drill bits. MXG represents each operation as a subtractive extrusion volume defined by a tool direction and drill radius. This parameterization enables the modeling of artifact-free geometry under an idealized zero-radius drill bit, matching traditional joint designs. Increasing the radius then reveals milling-induced deviations, which compromise the integrity of the joint. To restore coupling, we formalize tight coupling in terms of both surface proximity and proximity constraints on the mill-bit paths associated with mating surfaces. We then derive two tractable, differentiable losses that enable efficient optimization of joint geometry. We evaluate our method on 30 traditional joint designs, demonstrating that it produces CNC-compatible, tightly fitting joints that approximate the original geometry. By reinterpreting traditional joints for CNC workflows, we continue the evolution of this heritage craft and help ensure its relevance in future making practices.
{"title":"MiGumi: Making Tightly Coupled Integral Joints Millable","authors":"Aditya Ganeshan, Kurt Fleischer, Wenzel Jakob, Ariel Shamir, Daniel Ritchie, Takeo Igarashi, Maria Larsson","doi":"10.1145/3763304","DOIUrl":"https://doi.org/10.1145/3763304","url":null,"abstract":"Traditional integral wood joints, despite their strength, durability, and elegance, remain rare in modern workflows due to the cost and difficulty of manual fabrication. CNC milling offers a scalable alternative, but directly milling traditional joints often fails to produce functional results because milling induces geometric deviations, such as rounded inner corners, that alter the target geometries of the parts. Since joints rely on tightly fitting surfaces, such deviations introduce gaps or overlaps that undermine fit or block assembly. We propose to overcome this problem by (1) designing a language that represents millable geometry, and (2) co-optimizing part geometries to restore coupling. We introduce Millable Extrusion Geometry (MXG), a language for representing geometry as the outcome of milling operations performed with flat-end drill bits. MXG represents each operation as a subtractive extrusion volume defined by a tool direction and drill radius. This parameterization enables the modeling of artifact-free geometry under an idealized zero-radius drill bit, matching traditional joint designs. Increasing the radius then reveals milling-induced deviations, which compromise the integrity of the joint. To restore coupling, we formalize tight coupling in terms of both surface proximity and proximity constraints on the mill-bit paths associated with mating surfaces. We then derive two tractable, differentiable losses that enable efficient optimization of joint geometry. We evaluate our method on 30 traditional joint designs, demonstrating that it produces CNC-compatible, tightly fitting joints that approximate the original geometry. 
By reinterpreting traditional joints for CNC workflows, we continue the evolution of this heritage craft and help ensure its relevance in future making practices.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"1 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
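The MiGumi abstract notes that a zero-radius bit reproduces the target geometry while a larger radius leaves rounded inner corners. A minimal 2D sketch of that effect follows. It is not the paper's MXG language: it assumes a grid-sampled square pocket and models the reachable area as the morphological opening of the pocket by the tool disk. The names `disk_offsets`, `milled_region`, and `pocket` are hypothetical.

```python
from itertools import product


def disk_offsets(radius):
    """Integer grid offsets covered by a disk of the given radius (in cells)."""
    return [(dx, dy)
            for dx, dy in product(range(-radius, radius + 1), repeat=2)
            if dx * dx + dy * dy <= radius * radius]


def milled_region(pocket, radius):
    """Cells a flat-end bit of this radius can clear without leaving the pocket.

    Assumption: this is the morphological opening of the pocket by the tool
    disk. Erode to find admissible tool centers, then dilate by the same disk.
    """
    offsets = disk_offsets(radius)
    # Erosion: tool centers whose whole disk stays inside the pocket.
    centers = {(x, y) for (x, y) in pocket
               if all((x + dx, y + dy) in pocket for dx, dy in offsets)}
    # Dilation: every cell swept by the disk at an admissible center.
    return {(x + dx, y + dy) for (x, y) in centers for dx, dy in offsets}


# Target pocket: a 20 x 20 square with sharp inner corners.
pocket = set(product(range(20), repeat=2))

for radius in (0, 2, 4):
    residue = pocket - milled_region(pocket, radius)
    print(f"radius={radius}: {len(residue)} uncut cells near the corners")
```

With radius 0 the disk is a single cell, so the milled region equals the pocket. Any positive radius leaves uncut material in the corners, which is the kind of deviation the abstract says must be compensated on the mating part.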
We introduce a general, scalable computational framework for multi-axis 3D printing based on implicit neural fields (INFs) that unifies all stages of tool-path generation and global collision-free motion planning. In our pipeline, input models are represented as signed distance fields, with fabrication objectives—such as support-free printing, surface finish quality, and extrusion control—directly encoded in the optimization of an implicit guidance field. This unified approach enables toolpath optimization across both surface and interior domains, allowing shell and infill paths to be generated via implicit field interpolation. The printing sequence and multi-axis motion are then jointly optimized over a continuous quaternion field. Our continuous formulation constructs the evolving printing object as a time-varying SDF, supporting differentiable global collision handling throughout INF-based motion planning. Compared to explicit-representation-based methods, INF-3DP achieves up to two orders of magnitude speedup and significantly reduces waypoint-to-surface error. We validate our framework on diverse, complex models and demonstrate its efficiency with physical fabrication experiments using a robot-assisted multi-axis system.
{"title":"INF-3DP: Implicit Neural Fields for Collision-Free Multi-Axis 3D Printing","authors":"Jiasheng Qu, Zhuo Huang, Dezhao Guo, Hailin Sun, Aoran Lyu, Chengkai Dai, Yeung Yam, Guoxin Fang","doi":"10.1145/3763354","DOIUrl":"https://doi.org/10.1145/3763354","url":null,"abstract":"We introduce a general, scalable computational framework for multi-axis 3D printing based on implicit neural fields (INFs) that unifies all stages of tool-path generation and global collision-free motion planning. In our pipeline, input models are represented as signed distance fields, with fabrication objectives—such as support-free printing, surface finish quality, and extrusion control—directly encoded in the optimization of an implicit guidance field. This unified approach enables toolpath optimization across both surface and interior domains, allowing shell and infill paths to be generated via implicit field interpolation. The printing sequence and multi-axis motion are then jointly optimized over a continuous quaternion field. Our continuous formulation constructs the evolving printing object as a time-varying SDF, supporting differentiable global collision handling throughout INF-based motion planning. Compared to explicit-representation-based methods, INF-3DP achieves up to two orders of magnitude speedup and significantly reduces waypoint-to-surface error. 
We validate our framework on diverse, complex models and demonstrate its efficiency with physical fabrication experiments using a robot-assisted multi-axis system.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"20 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
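The INF-3DP abstract describes the evolving printed object as a time-varying SDF. One way to picture that idea, offered here as an illustrative assumption rather than the paper's formulation, is to intersect the model's SDF with a print-order field thresholded at the current time. The sphere model, the height-based `sequence_field`, and the function `printed_so_far` are all hypothetical stand-ins for the learned neural fields.

```python
import math


def sphere_sdf(p, radius=1.0):
    """Signed distance from point p to a sphere centered at the origin."""
    x, y, z = p
    return math.sqrt(x * x + y * y + z * z) - radius


def sequence_field(p, radius=1.0):
    """Hypothetical print-order field in [0, 1].

    Assumption: planar layers, so the order is just normalized height.
    A multi-axis planner would use a curved, optimized field instead.
    """
    return (p[2] + radius) / (2.0 * radius)


def printed_so_far(p, t):
    """Implicit value that is <= 0 only where material exists at time t.

    max() intersects the model interior with the region already printed
    (sequence <= t). The sign is correct, but the value is generally not
    an exact distance.
    """
    return max(sphere_sdf(p), sequence_field(p) - t)


# Halfway through the print, the lower half exists and the upper half does not.
for point in [(0.0, 0.0, -0.5), (0.0, 0.0, 0.5)]:
    state = "printed" if printed_so_far(point, t=0.5) <= 0 else "empty"
    print(point, state)
```

A collision check against the partially printed part then reduces to evaluating this function at sampled points on the nozzle, under the same assumptions.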