Bringing Diversity from Diffusion Models to Semantic-Guided Face Asset Generation
Yunxuan Cai, Sitao Xiang, Zongjian Li, Haiwei Chen, Yajie Zhao
ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3793859. Published 2026-01-31.

High-quality 3D face asset creation remains costly due to reliance on controlled capture setups and manual processing, limiting scalability and diversity. We introduce a fully automated, semantically controllable framework for generating PBR-ready 3D facial assets without requiring dedicated scans. Our pipeline begins with a diffusion-based data synthesis stage, where 2D portrait samples from a pre-trained diffusion model are converted into 44K textured 3D face reconstructions via our proposed geometry recovery and texture normalization algorithm, which aligns arbitrarily shaded outputs into clean albedo space. Using this dataset, we train a disentangled adversarial generator that maps semantic attributes (age, gender, ethnicity) to UV-space geometry and albedo, enabling both direct sampling and continuous latent editing while preserving identity. A refinement stage further produces PBR materials and secondary assets (eyeballs, teeth, gums). The resulting system supports controllable face generation and post-editing in real time and exports directly to standard rendering and animation pipelines. We evaluate each component extensively and provide a web-based interactive interface to showcase practical deployment.

FreeShell: A Context-Free 4D Printing Technique for Fabricating Complex 3D Triangle Mesh Shells
Chao Yuan, Shengqi Dang, Xuejiao Ma, Nan Cao
ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3778349. Published 2026-01-24.

Freeform thin-shell surfaces are critical in various fields, but their fabrication is complex and costly. Traditional methods are wasteful and require custom molds, while 3D printing needs extensive support structures and post-processing. Thermal-shrinkage-actuated 4D printing is an effective method for fabricating 3D shells. However, existing research faces issues related to precise deformation and limited robustness. Addressing these issues is challenging due to three key factors: (1) Difficulty in finding a universal method to control deformation across different materials; (2) Variability in deformation influenced by factors such as printing speed, layer thickness, and heating temperature; (3) Environmental factors affecting the deformation process. To overcome these challenges, we introduce FreeShell, a robust 4D printing technique that uses thermal shrinkage to create precise 3D shells. This method prints triangular tiles connected by shrinkable connectors using a single material. Upon heating, the connectors shrink and pull the tiles into the desired 3D shape, simplifying fabrication and reducing dependence on material and environment. An optimized mesh layout algorithm computes printing structures that satisfy the defined structural objectives. FreeShell demonstrates its effectiveness through various examples and experiments, showcasing precision, robustness, and strength, and representing an advance in fabricating complex freeform surfaces.

PAColorHolo: A Perceptually-Aware Color Management Framework for Holographic Displays
Chun Chen, Minseok Chae, Seung-Woo Nam, Myeong-Ho Choi, Minseong Kim, Eunbi Lee, Yoonchan Jeong, Jae-Hyeung Park
ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3789511. Published 2026-01-23.

Holographic displays offer significant potential for augmented and virtual reality applications by reconstructing wavefronts that enable continuous depth cues and natural parallax without vergence–accommodation conflict. However, despite advances in pixel-level image quality, current systems struggle to achieve perceptually accurate color reproduction—an essential component of visual realism. These challenges arise from complex system-level distortions caused by coherent laser illumination, spatial light modulator imperfections, chromatic aberrations, and camera-induced color biases. In this work, we propose a perceptually-aware color management framework for holographic displays that jointly addresses input–output color inconsistencies through color space transformation, adaptive illumination control, and neural network–based perceptual modeling of the camera's color response. We validate the effectiveness of our approach through numerical simulations, optical experiments, and a controlled user study. The results demonstrate substantial improvements in perceptual color fidelity, laying the groundwork for perceptually driven holographic rendering in future systems.

RaDe-GS: Rasterizing Depth in Gaussian Splatting
Baowen Zhang, Chuan Fang, Rakesh Shrestha, Yixun Liang, Xiao-Xiao Long, Ping Tan
ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3789201. Published 2026-01-16.

Gaussian Splatting (GS) has proven to be highly effective in novel view synthesis, achieving high-quality and real-time rendering. However, its potential for reconstructing detailed 3D shapes has not been fully explored. Existing methods often suffer from limited shape accuracy due to the discrete and unstructured nature of Gaussian primitives, which complicates shape extraction. While recent techniques like 2D GS have attempted to improve shape reconstruction, they often reformulate the Gaussian primitives in ways that reduce both rendering quality and computational efficiency. To address these problems, our work introduces a rasterized approach to render the depth maps and surface normal maps of general 3D Gaussian primitives. Our method not only significantly enhances shape reconstruction accuracy but also maintains the computational efficiency intrinsic to Gaussian Splatting. It achieves a Chamfer distance error comparable to Neuralangelo [33] on the DTU dataset while maintaining computational efficiency similar to that of the original 3D GS methods. Our method is a significant advancement in Gaussian Splatting and can be directly integrated into existing Gaussian Splatting-based methods.

A practical partitioner for distributed simulations on sparse dynamic domains using optimal transport
Joel Wretborn, Marcus Schoo, Noh-hoon Lee, Christopher Batty, Alexey Stomakhin
ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3787521. Published 2026-01-14.

This work addresses the challenges of distributing large physics-based simulations often encountered in the visual effects industry. These simulations, based on partial differential equations, model complex phenomena such as free surface liquids, flames, and explosions, and are characterized by domains whose shapes and topologies evolve rapidly. In this context, we propose a novel partitioning algorithm employing optimal transport—which produces a power diagram—and designed to handle a vast variety of simulation domain shapes undergoing rapid changes over time. Our Power partitioner ensures an even distribution of computational tasks, reduces inter-node data exchange, and maintains temporal consistency, all while being intuitive and artist-friendly. To quantify partitioning quality we introduce two metrics, the surface index and the temporal consistency index, which we leverage in a range of comparisons on real-world film production data, showing that our method outperforms the state of the art in a majority of cases.

Learning Sparse Singularities for Cross Field Design
Xiaohu Zhang, Hujun Bao, Jin Huang
ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3787520. Published 2026-01-12.

Designing a quad mesh that meets aesthetic, anatomical, and numerical requirements often requires meticulous manual effort in conventional methods, making quadrilateral remeshing an “art of design”. Neural networks hold significant promise for automating this process. However, current approaches that directly predict cross fields cannot properly handle the discontinuous behavior of smooth cross fields: minor shape variations can lead to substantial changes in the cross field, even when singularities remain largely unchanged. Therefore, such methods often result in non-smooth outputs when combining multiple singularity instances. To avoid such discontinuity, we propose to learn the sparse singularities, including their locations and indices, and then let a conventional, non-neural method smoothly connect them. The imbalanced ratio of singular to regular vertices poses a significant challenge for learning; to address it, we convert the singularities into a geodesic distance field and an over-sampled index field. This carefully designed two-stage strategy satisfies several key requirements, such as coordinate invariance and tessellation insensitivity, while enabling the generation of smooth cross fields with varying topologies. By shifting the focus from directly learning the cross field to learning singularities, we also simplify the dataset preparation process by requiring only sparse annotations.

Example-Based Feature Painting on Textures
Andrei-Timotei Ardelean, Tim Weyrich
ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3763301. Published 2025-12-04.

In this work, we propose a system that covers the complete workflow for achieving controlled authoring and editing of textures that present distinctive local characteristics. These include various effects that change the surface appearance of materials, such as stains, tears, holes, abrasions, discoloration, and more. Such alterations are ubiquitous in nature, and including them in the synthesis process is crucial for generating realistic textures. We introduce a novel approach for creating textures with such blemishes, adopting a learning-based approach that leverages unlabeled examples. Our approach does not require manual annotations by the user; instead, it detects the appearance-altering features through unsupervised anomaly detection. The various textural features are then automatically clustered into semantically coherent groups, which are used to guide the conditional generation of images. Our pipeline as a whole goes from a small image collection to a versatile generative model that enables the user to interactively create and paint features on textures of arbitrary size. Notably, the algorithms we introduce for diffusion-based editing and infinite stationary texture generation are generic and should prove useful in other contexts as well. Project page: reality.tf.fau.de/pub/ardelean2025examplebased.html

Lifted Surfacing of Generalized Sweep Volumes
Yiwen Ju, Qingnan Zhou, Xingyi Du, Nathan Carr, Tao Ju
ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3763360. Published 2025-12-04.

Computing the boundary surface of the 3D volume swept by a rigid or deforming solid remains a challenging problem in geometric modeling. Existing approaches are often limited to sweeping rigid shapes, cannot guarantee a watertight surface, or struggle with modeling the intricate geometric features (e.g., sharp creases and narrow gaps) and topological features (e.g., interior voids). We make the observation that the sweep boundary is a subset of the projection of the intersection of two implicit surfaces in a higher dimension, and we derive a characterization of the subset using winding numbers. These insights lead to a general algorithm for any sweep represented as a smooth time-varying implicit function satisfying a genericity assumption, and it produces a watertight and intersection-free surface that better approximates the geometric and topological features than existing methods.

Aerial Path Planning for Urban Geometry and Texture Co-Capture
Weidan Xiong, Bochuan Zeng, Ziyu Hu, Jianwei Guo, Ke Xie, Hui Huang
ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3763292. Published 2025-12-04.

Recent advances in image acquisition and scene reconstruction have enabled the generation of high-quality structural urban scene geometry, given sufficient site information. However, current capture techniques often overlook the crucial importance of texture quality, resulting in noticeable visual artifacts in the textured models. In this work, we introduce the urban geometry and texture co-capture problem under limited prior knowledge before a site visit. The only inputs are a 2D building contour map of the target area and a safe flying altitude above the buildings. We propose an innovative aerial path planning framework designed to co-capture images for reconstructing both structured geometry and high-fidelity textures. To evaluate and guide view planning, we introduce a comprehensive texture quality assessment system, including two novel metrics tailored for building facades. Our method first generates high-quality vertical dipping views and horizontal planar views to effectively capture both geometric and textural details. A multi-objective optimization strategy is then proposed to jointly maximize texture fidelity, improve geometric accuracy, and minimize the cost associated with aerial views. Furthermore, we present a sequential path planning algorithm that accounts for texture consistency during image capture. Extensive experiments on large-scale synthetic and real-world urban datasets demonstrate that our approach effectively produces image sets suitable for concurrent geometric and texture reconstruction, enabling the creation of realistic, textured scene proxies at low operational cost.

Detail Enhanced Gaussian Splatting for Large-Scale Volumetric Capture
Julien Philip, Li Ma, Pascal Clausen, Wenqi Xian, Ahmet Levent Taşel, Mingming He, Xueming Yu, David M. George, Ning Yu, Oliver Pilarski, Paul Debevec
ACM Transactions on Graphics. DOI: https://doi.org/10.1145/3763336. Published 2025-12-04.

We present a unique system for large-scale, multi-performer, high-resolution 4D volumetric capture providing realistic free-viewpoint video up to and including 4K-resolution facial closeups. To achieve this, we employ a novel volumetric capture, reconstruction, and rendering pipeline based on Dynamic Gaussian Splatting and Diffusion-based Detail Enhancement. We design our pipeline specifically to meet the demands of high-end media production. We employ two capture rigs: the Scene Rig, which captures multi-actor performances at a resolution that falls short of 4K production quality, and the Face Rig, which records high-fidelity single-actor facial detail to serve as a reference for detail enhancement. We first reconstruct dynamic performances from the Scene Rig using 4D Gaussian Splatting, incorporating new model designs and training strategies to improve reconstruction, dynamic range, and rendering quality. Then, to render high-quality images for facial closeups, we introduce a diffusion-based detail enhancement model. This model is fine-tuned with high-fidelity data from the same actors recorded in the Face Rig. We train on paired data generated from low- and high-quality Gaussian Splatting (GS) models, using the low-quality input to match the quality of the Scene Rig, with the high-quality GS as ground truth. Our results demonstrate the effectiveness of this pipeline in bridging the gap between the scalable performance capture of a large-scale rig and the high-resolution standards required for film and media production.