Lifted Surfacing of Generalized Sweep Volumes
Yiwen Ju, Qingnan Zhou, Xingyi Du, Nathan Carr, Tao Ju
Computing the boundary surface of the 3D volume swept by a rigid or deforming solid remains a challenging problem in geometric modeling. Existing approaches are often limited to sweeping rigid shapes, cannot guarantee a watertight surface, or struggle to model intricate geometric features (e.g., sharp creases and narrow gaps) and topological features (e.g., interior voids). We observe that the sweep boundary is a subset of the projection of the intersection of two implicit surfaces in a higher dimension, and we derive a characterization of this subset using winding numbers. These insights lead to a general algorithm for any sweep represented as a smooth time-varying implicit function satisfying a genericity assumption; the algorithm produces a watertight and intersection-free surface that approximates geometric and topological features more faithfully than existing methods.
{"title":"Lifted Surfacing of Generalized Sweep Volumes","authors":"Yiwen Ju, Qingnan Zhou, Xingyi Du, Nathan Carr, Tao Ju","doi":"10.1145/3763360","DOIUrl":"https://doi.org/10.1145/3763360","url":null,"abstract":"Computing the boundary surface of the 3D volume swept by a rigid or deforming solid remains a challenging problem in geometric modeling. Existing approaches are often limited to sweeping rigid shapes, cannot guarantee a watertight surface, or struggle with modeling the intricate geometric features (e.g., sharp creases and narrow gaps) and topological features (e.g., interior voids). We make the observation that the sweep boundary is a subset of the projection of the intersection of two implicit surfaces in a higher dimension, and we derive a characterization of the subset using winding numbers. These insights lead to a general algorithm for any sweep represented as a smooth time-varying implicit function satisfying a genericity assumption, and it produces a watertight and intersection-free surface that better approximates the geometric and topological features than existing methods.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"33 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aerial Path Planning for Urban Geometry and Texture Co-Capture
Weidan Xiong, Bochuan Zeng, Ziyu Hu, Jianwei Guo, Ke Xie, Hui Huang
Recent advances in image acquisition and scene reconstruction have enabled the generation of high-quality structural urban scene geometry, given sufficient site information. However, current capture techniques often overlook the crucial importance of texture quality, resulting in noticeable visual artifacts in the textured models. In this work, we introduce the urban geometry and texture co-capture problem under limited prior knowledge before a site visit. The only inputs are a 2D building contour map of the target area and a safe flying altitude above the buildings. We propose an aerial path planning framework designed to co-capture images for reconstructing both structured geometry and high-fidelity textures. To evaluate and guide view planning, we introduce a comprehensive texture quality assessment system, including two novel metrics tailored for building facades. Our method first generates high-quality vertical dipping views and horizontal planar views to effectively capture both geometric and textural details. A multi-objective optimization strategy is then proposed to jointly maximize texture fidelity, improve geometric accuracy, and minimize the cost associated with aerial views. Furthermore, we present a sequential path planning algorithm that accounts for texture consistency during image capture. Extensive experiments on large-scale synthetic and real-world urban datasets demonstrate that our approach produces image sets suitable for concurrent geometric and texture reconstruction, enabling the creation of realistic, textured scene proxies at low operational cost.
{"title":"Aerial Path Planning for Urban Geometry and Texture Co-Capture","authors":"Weidan Xiong, Bochuan Zeng, Ziyu Hu, Jianwei Guo, Ke Xie, Hui Huang","doi":"10.1145/3763292","DOIUrl":"https://doi.org/10.1145/3763292","url":null,"abstract":"Recent advances in image acquisition and scene reconstruction have enabled the generation of high-quality structural urban scene geometry, given sufficient site information. However, current capture techniques often overlook the crucial importance of texture quality, resulting in noticeable visual artifacts in the textured models. In this work, we introduce the urban <jats:italic toggle=\"yes\">geometry and texture co-capture</jats:italic> problem under limited prior knowledge before a site visit. The only inputs are a 2D building contour map of the target area and a safe flying altitude above the buildings. We propose an innovative aerial path planning framework designed to co-capture images for reconstructing both structured geometry and high-fidelity textures. To evaluate and guide view planning, we introduce a comprehensive texture quality assessment system, including two novel metrics tailored for building facades. Firstly, our method generates high-quality vertical dipping views and horizontal planar views to effectively capture both geometric and textural details. A multi-objective optimization strategy is then proposed to jointly maximize texture fidelity, improve geometric accuracy, and minimize the cost associated with aerial views. Furthermore, we present a sequential path planning algorithm that accounts for texture consistency during image capture. Extensive experiments on large-scale synthetic and real-world urban datasets demonstrate that our approach effectively produces image sets suitable for concurrent geometric and texture reconstruction, enabling the creation of realistic, textured scene proxies at low operational cost.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"1 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detail Enhanced Gaussian Splatting for Large-Scale Volumetric Capture
Julien Philip, Li Ma, Pascal Clausen, Wenqi Xian, Ahmet Levent Taşel, Mingming He, Xueming Yu, David M. George, Ning Yu, Oliver Pilarski, Paul Debevec
We present a unique system for large-scale, multi-performer, high-resolution 4D volumetric capture providing realistic free-viewpoint video up to and including 4K-resolution facial closeups. To achieve this, we employ a novel volumetric capture, reconstruction, and rendering pipeline based on Dynamic Gaussian Splatting and Diffusion-based Detail Enhancement. We design our pipeline specifically to meet the demands of high-end media production. We employ two capture rigs: the Scene Rig, which captures multi-actor performances at a resolution that falls short of 4K production quality, and the Face Rig, which records high-fidelity single-actor facial detail to serve as a reference for detail enhancement. We first reconstruct dynamic performances from the Scene Rig using 4D Gaussian Splatting, incorporating new model designs and training strategies to improve reconstruction, dynamic range, and rendering quality. Then, to render high-quality images for facial closeups, we introduce a diffusion-based detail enhancement model. This model is fine-tuned with high-fidelity data from the same actors recorded in the Face Rig. We train on paired data generated from low- and high-quality Gaussian Splatting (GS) models, using the low-quality input to match the quality of the Scene Rig, with the high-quality GS as ground truth. Our results demonstrate the effectiveness of this pipeline in bridging the gap between the scalable performance capture of a large-scale rig and the high-resolution standards required for film and media production.
{"title":"Detail Enhanced Gaussian Splatting for Large-Scale Volumetric Capture","authors":"Julien Philip, Li Ma, Pascal Clausen, Wenqi Xian, Ahmet Levent Taşel, Mingming He, Xueming Yu, David M. George, Ning Yu, Oliver Pilarski, Paul Debevec","doi":"10.1145/3763336","DOIUrl":"https://doi.org/10.1145/3763336","url":null,"abstract":"We present a unique system for large-scale, multi-performer, high resolution 4D volumetric capture providing realistic free-viewpoint video up to and including 4K resolution facial closeups. To achieve this, we employ a novel volumetric capture, reconstruction and rendering pipeline based on Dynamic Gaussian Splatting and Diffusion-based Detail Enhancement. We design our pipeline specifically to meet the demands of high-end media production. We employ two capture rigs: the <jats:italic toggle=\"yes\">Scene Rig</jats:italic> , which captures multi-actor performances at a resolution which falls short of 4K production quality, and the <jats:italic toggle=\"yes\">Face Rig</jats:italic> , which records high-fidelity single-actor facial detail to serve as a reference for detail enhancement. We first reconstruct dynamic performances from the <jats:italic toggle=\"yes\">Scene Rig</jats:italic> using 4D Gaussian Splatting, incorporating new model designs and training strategies to improve reconstruction, dynamic range, and rendering quality. Then to render high-quality images for facial closeups, we introduce a diffusion-based detail enhancement model. This model is fine-tuned with high-fidelity data from the same actors recorded in the <jats:italic toggle=\"yes\">Face Rig.</jats:italic> We train on paired data generated from low- and high-quality Gaussian Splatting (GS) models, using the low-quality input to match the quality of the <jats:italic toggle=\"yes\">Scene Rig</jats:italic> , with the high-quality GS as ground truth. Our results demonstrate the effectiveness of this pipeline in bridging the gap between the scalable performance capture of a large-scale rig and the high-resolution standards required for film and media production.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"168 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BSP-OT: Sparse transport plans between discrete measures in loglinear time
Baptiste Genest, Nicolas Bonneel, Vincent Nivoliers, David Coeurjolly
To solve the optimal transport problem between two uniform discrete measures of the same size, one seeks a bijective assignment that minimizes some matching cost. For this task, exact algorithms are intractable for large problems, while approximate ones may lose the bijectivity of the assignment. We address this issue, as well as the more general case of non-uniform discrete measures with different total masses, where partial transport may be desirable. The core of our algorithm is a variant of the Quicksort algorithm that provides an efficient strategy for randomly exploring many relevant and easy-to-compute couplings by matching BSP trees in loglinear time. The couplings we obtain are as sparse as possible, in the sense that they provide bijections, injective partial matchings, or sparse couplings depending on the nature of the matched measures. To improve the transport cost, we propose efficient strategies for merging k sparse couplings into a higher-quality one. For k = 64, we obtain transport plans with typically less than 1% relative error in a matter of seconds between hundreds of thousands of points in 3D on the CPU. We demonstrate how these high-quality approximations can drastically speed up common pipelines involving optimal transport, such as shape interpolation, intrinsic manifold sampling, color transfer, topological data analysis, rigid partial registration of point clouds, and image stippling.
{"title":"BSP-OT: Sparse transport plans between discrete measures in loglinear time","authors":"Baptiste Genest, Nicolas Bonneel, Vincent Nivoliers, David Coeurjolly","doi":"10.1145/3763281","DOIUrl":"https://doi.org/10.1145/3763281","url":null,"abstract":"To solve the optimal transport problem between two uniform discrete measures of the same size, one seeks a bijective assignment that minimizes some matching cost. For this task, exact algorithms are intractable for large problems, while approximate ones may lose the bijectivity of the assignment. We address this issue and the more general cases of non-uniform discrete measures with different total masses, where partial transport may be desirable. The core of our algorithm is a variant of the Quicksort algorithm that provides an efficient strategy to randomly explore many relevant and easy-to-compute couplings, by matching BSP trees in loglinear time. The couplings we obtain are as sparse as possible, in the sense that they provide bijections, injective partial matchings or sparse couplings depending on the nature of the matched measures. To improve the transport cost, we propose efficient strategies to merge <jats:italic toggle=\"yes\">k</jats:italic> sparse couplings into a higher quality one. For <jats:italic toggle=\"yes\">k =</jats:italic> 64, we obtain transport plans with typically less than 1% of relative error in a matter of seconds between hundreds of thousands of points in 3D on the CPU. We demonstrate how these high-quality approximations can drastically speed-up usual pipelines involving optimal transport, such as shape interpolation, intrinsic manifold sampling, color transfer, topological data analysis, rigid partial registration of point clouds and image stippling.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"30 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jump Restore Light Transport
Sascha Holl, Gurprit Singh, Hans-Peter Seidel
Markov chain Monte Carlo (MCMC) algorithms are indispensable when sampling from a complex, high-dimensional distribution by a conventional method is intractable. Even though MCMC is a powerful tool, it is hard to control and tune in practice. Simultaneously achieving both rapid local exploration of the state space and efficient global discovery of the target distribution is a challenging task. In this work, we introduce a novel continuous-time MCMC formulation to the computer science community. Generalizing existing work from the statistics community, we propose a framework for adjusting an arbitrary family of Markov processes, used for local exploration of the state space only, to an overall process that is invariant with respect to a target distribution. To demonstrate the potential of our framework, we focus on a simple yet insightful application in light transport simulation. As a by-product, we introduce continuous-time MCMC sampling to the computer graphics community. We show how any existing MCMC-based light transport algorithm can be seamlessly integrated into our framework, and we demonstrate both empirically and theoretically that the integrated version is superior to the ordinary algorithm. In fact, our approach converts any existing algorithm into a highly parallelizable variant with shorter running time, smaller error, and less variance.
{"title":"Jump Restore Light Transport","authors":"Sascha Holl, Gurprit Singh, Hans-Peter Seidel","doi":"10.1145/3763286","DOIUrl":"https://doi.org/10.1145/3763286","url":null,"abstract":"Markov chain Monte Carlo (MCMC) algorithms are indispensable when sampling from a complex, high-dimensional distribution by a conventional method is intractable. Even though MCMC is a powerful tool, it is also hard to control and tune in practice. Simultaneously achieving both rapid <jats:italic toggle=\"yes\">local exploration</jats:italic> of the state space and efficient <jats:italic toggle=\"yes\">global discovery</jats:italic> of the target distribution is a challenging task. In this work, we introduce a novel continuous-time MCMC formulation to the computer science community. Generalizing existing work from the statistics community, we propose a novel framework for <jats:italic toggle=\"yes\">adjusting</jats:italic> an arbitrary family of Markov processes - used for local exploration of the state space only - to an overall process which is invariant with respect to a target distribution. To demonstrate the potential of our framework, we focus on a simple, but yet insightful, application in light transport simulation. As a by-product, we introduce continuous-time MCMC sampling to the computer graphics community. We show how any existing MCMC-based light transport algorithm can be seamlessly integrated into our framework. We prove empirically and theoretically that the integrated version is superior to the ordinary algorithm. In fact, our approach will convert any existing algorithm into a highly <jats:italic toggle=\"yes\">parallelizable</jats:italic> variant with shorter running time, smaller error and less variance.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"12 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Consecutive Frame Extrapolation with Predictive Sparse Shading
Zhizhen Wu, Zhe Cao, Yazhen Yuan, Zhilong Yuan, Rui Wang, Yuchi Huo
The demand for high-frame-rate rendering keeps increasing with modern displays. Existing frame generation and super-resolution techniques accelerate rendering by reducing rendering samples across space or time. However, they rely on a uniform sampling-reduction strategy, which undersamples areas with complex details or dynamic shading. To address this, we propose to sparsely shade critical areas while reusing generated pixels in low-variation areas for neural extrapolation. Specifically, we introduce the Predictive Error-Flow-eXtrapolation Network (EFXNet), an architecture that predicts extrapolation errors, estimates flows, and extrapolates frames at once. First, EFXNet leverages temporal coherence to predict extrapolation error and guide the sparse shading of dynamic areas. In addition, EFXNet employs a target-grid correlation module to estimate robust optical flows from pixel correlations rather than pixel values. Finally, EFXNet uses dedicated motion representations for the historical geometric and lighting components, respectively, to extrapolate temporally stable frames. Extensive experimental results show that, compared with state-of-the-art methods, our frame extrapolation method exhibits superior visual quality and temporal stability under a low rendering budget.
{"title":"Consecutive Frame Extrapolation with Predictive Sparse Shading","authors":"Zhizhen Wu, Zhe Cao, Yazhen Yuan, Zhilong Yuan, Rui Wang, Yuchi Huo","doi":"10.1145/3763363","DOIUrl":"https://doi.org/10.1145/3763363","url":null,"abstract":"The demand for high-frame-rate rendering keeps increasing in modern displays. Existing frame generation and super-resolution techniques accelerate rendering by reducing rendering samples across space or time. However, they rely on a uniform sampling reduction strategy, which undersamples areas with complex details or dynamic shading. To address this, we propose to sparsely shade critical areas while reusing generated pixels in low-variation areas for neural extrapolation. Specifically, we introduce the Predictive Error-Flow-eXtrapolation Network (EFXNet)-an architecture that predicts extrapolation errors, estimates flows, and extrapolates frames at once. Firstly, EFXNet leverages temporal coherence to predict extrapolation error and guide the sparse shading of dynamic areas. In addition, EFXNet employs a target-grid correlation module to estimate robust optical flows from pixel correlations rather than pixel values. Finally, EFXNet uses dedicated motion representations for the historical geometric and lighting components, respectively, to extrapolate temporally stable frames. Extensive experimental results show that, compared with state-of-the-art methods, our frame extrapolation method exhibits superior visual quality and temporal stability under a low rendering budget.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"21 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic Sampling for Discontinuities in Differentiable Shaders
Yash Belhe, Ishit Mehta, Wesley Chang, Iliyan Georgiev, Michael Gharbi, Ravi Ramamoorthi, Tzu-Mao Li
We present a novel method to differentiate integrals of discontinuous functions, which are common in inverse graphics, computer vision, and machine learning applications. Previous methods either require specialized routines to sample the discontinuous boundaries of predetermined primitives, or use reparameterization techniques that suffer from high variance. In contrast, our method handles general discontinuous functions, expressed as shader programs, without requiring manually specified boundary sampling routines. We achieve this through a program transformation that converts discontinuous functions into piecewise constant ones, enabling efficient boundary sampling through a novel segment snapping technique, and accurate derivatives at the boundary by simply comparing values on both sides of the discontinuity. Our method handles both explicit boundaries (polygons, ellipses, Bézier curves) and implicit ones (neural networks, noise-based functions, swept surfaces). We demonstrate that our system supports a wide range of applications, including painterly rendering, raster image fitting, constructive solid geometry, swept surfaces, mosaicing, and ray marching.
{"title":"Automatic Sampling for Discontinuities in Differentiable Shaders","authors":"Yash Belhe, Ishit Mehta, Wesley Chang, Iliyan Georgiev, Michael Gharbi, Ravi Ramamoorthi, Tzu-Mao Li","doi":"10.1145/3763291","DOIUrl":"https://doi.org/10.1145/3763291","url":null,"abstract":"We present a novel method to differentiate integrals of discontinuous functions, which are common in inverse graphics, computer vision, and machine learning applications. Previous methods either require specialized routines to sample the discontinuous boundaries of predetermined primitives, or use reparameterization techniques that suffer from high variance. In contrast, our method handles general discontinuous functions, expressed as shader programs, without requiring manually specified boundary sampling routines. We achieve this through a program transformation that converts discontinuous functions into piecewise constant ones, enabling efficient boundary sampling through a novel segment snapping technique, and accurate derivatives at the boundary by simply comparing values on both sides of the discontinuity. Our method handles both explicit boundaries (polygons, ellipses, Bézier curves) and implicit ones (neural networks, noise-based functions, swept surfaces). We demonstrate that our system supports a wide range of applications, including painterly rendering, raster image fitting, constructive solid geometry, swept surfaces, mosaicing, and ray marching.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"125 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust Derivative Estimation with Walk on Stars
Zihan Yu, Rohan Sawhney, Bailey Miller, Lifan Wu, Shuang Zhao
Monte Carlo methods based on the walk on spheres (WoS) algorithm offer a parallel, progressive, and output-sensitive approach for solving partial differential equations (PDEs) in complex geometric domains. Building on this foundation, the walk on stars (WoSt) method generalizes WoS to support mixed Dirichlet, Neumann, and Robin boundary conditions. However, accurately computing spatial derivatives of PDE solutions remains a major challenge: existing methods exhibit high variance and bias near the domain boundary, especially in Neumann-dominated problems. We address this limitation with a new extension of WoSt specifically designed for derivative estimation. Our method reformulates the boundary integral equation (BIE) for Poisson PDEs by directly leveraging the harmonicity of spatial derivatives. Combining this reformulation with a tailored random-walk sampling scheme and an unbiased early-termination strategy, we achieve significantly improved accuracy in derivative estimates near the Neumann boundary. We further demonstrate the effectiveness of our approach across various tasks, including recovering the non-unique solution to a pure Neumann problem with reduced bias and variance, constructing divergence-free vector fields, and optimizing parametrically defined boundaries under PDE constraints.
{"title":"Robust Derivative Estimation with Walk on Stars","authors":"Zihan Yu, Rohan Sawhney, Bailey Miller, Lifan Wu, Shuang Zhao","doi":"10.1145/3763333","DOIUrl":"https://doi.org/10.1145/3763333","url":null,"abstract":"Monte Carlo methods based on the walk on spheres (WoS) algorithm offer a parallel, progressive, and output-sensitive approach for solving partial differential equations (PDEs) in complex geometric domains. Building on this foundation, the walk on stars (WoSt) method generalizes WoS to support mixed Dirichlet, Neumann, and Robin boundary conditions. However, accurately computing spatial derivatives of PDE solutions remains a major challenge: existing methods exhibit high variance and bias near the domain boundary, especially in Neumann-dominated problems. We address this limitation with a new extension of WoSt specifically designed for derivative estimation. Our method reformulates the boundary integral equation (BIE) for Poisson PDEs by directly leveraging the harmonicity of spatial derivatives. Combined with a tailored random-walk sampling scheme and an unbiased early termination strategy, we achieve significantly improved accuracy in derivative estimates near the Neumann boundary. We further demonstrate the effectiveness of our approach across various tasks, including recovering the non-unique solution to a pure Neumann problem with reduced bias and variance, constructing divergence-free vector fields, and optimizing parametrically defined boundaries under PDE constraints.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"33 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating and Sampling Glinty NDFs in Constant Time
Pauli Kemppinen, Loïs Paulin, Théo Thonat, Jean-Marc Thiery, Jaakko Lehtinen, Tamy Boubekeur
Geometric features between the micro and macro scales produce an expressive family of visual effects grouped under the term "glints". Efficiently rendering these effects amounts to finding the highlights caused by the geometry under each pixel. To allow for fast rendering, we represent our faceted geometry as a 4D point process on an implicit multiscale grid, designed to efficiently find the facets most likely to cause a highlight. The facets' normals are generated to match a given microfacet normal distribution such as Trowbridge-Reitz (GGX) or Beckmann, to which our model converges under increasing surface area. Our method is simple to implement, requires no memory or precomputation, allows for importance sampling, and covers a wide range of appearances, including anisotropic as well as individually colored particles. We provide a base implementation as a standalone fragment shader.
{"title":"Evaluating and Sampling Glinty NDFs in Constant Time","authors":"Pauli Kemppinen, Loïs Paulin, Théo Thonat, Jean-Marc Thiery, Jaakko Lehtinen, Tamy Boubekeur","doi":"10.1145/3763282","DOIUrl":"https://doi.org/10.1145/3763282","url":null,"abstract":"Geometric features between the micro and macro scales produce an expressive family of visual effects grouped under the term \"glints\". Efficiently rendering these effects amounts to finding the highlights caused by the geometry under each pixel. To allow for fast rendering, we represent our faceted geometry as a 4D point process on an implicit multiscale grid, designed to efficiently find the facets most likely to cause a highlight. The facets' normals are generated to match a given micro-facet normal distribution such as Trowbridge-Reitz (GGX) or Beckmann, to which our model converges under increasing surface area. Our method is simple to implement, memory-and-precomputation-free, allows for importance sampling and covers a wide range of different appearances such as anisotropic as well as individually colored particles. We provide a base implementation as a standalone fragment shader.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"1 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ConsiStyle: Style Diversity in Training-Free Consistent T2I Generation
Yohai Mazuz, Janna Bruner, Lior Wolf
In text-to-image models, consistent character generation is the task of achieving text alignment while maintaining the subject's appearance across different prompts. However, since style and appearance are often entangled, existing methods struggle to preserve consistent subject characteristics while adhering to varying style prompts. Current approaches for consistent text-to-image generation typically rely on large-scale fine-tuning on curated image sets or on per-subject optimization, which either fail to generalize across prompts or do not align well with textual descriptions. Meanwhile, training-free methods often fail to maintain subject consistency across different styles. In this work, we introduce a training-free method that, for the first time, jointly achieves style preservation and subject consistency across varied styles. The attention matrices are manipulated such that the Queries and Keys are obtained from the anchor image(s) used to define the subject, while the Values are imported from a parallel copy that is not subject-anchored. Additionally, cross-image components are added to the self-attention mechanism by expanding the Key and Value matrices. To do so without shifting away from the target style, we align the statistics of the Value matrices. As demonstrated in a comprehensive battery of qualitative and quantitative experiments, our method effectively decouples style from subject appearance and enables faithful generation of text-aligned images with consistent characters across diverse styles. Code will be available at our project page: jbruner23.github.io/consistyle.
{"title":"ConsiStyle: Style Diversity in Training-Free Consistent T2I Generation","authors":"Yohai Mazuz, Janna Bruner, Lior Wolf","doi":"10.1145/3763303","DOIUrl":"https://doi.org/10.1145/3763303","url":null,"abstract":"In text-to-image models, consistent character generation is the task of achieving text alignment while maintaining the subject's appearance across different prompts. However, since style and appearance are often entangled, the existing methods struggle to preserve consistent subject characteristics while adhering to varying style prompts. Current approaches for consistent text-to-image generation typically rely on large-scale fine-tuning on curated image sets or per-subject optimization, which either fail to generalize across prompts or do not align well with textual descriptions. Meanwhile, training-free methods often fail to maintain subject consistency across different styles. In this work, we introduce a training-free method that, for the first time, jointly achieves style preservation and subject consistency across varied styles. The attention matrices are manipulated such that Queries and Keys are obtained from the anchor image(s) that are used to define the subject, while the Values are imported from a parallel copy that is not subject-anchored. Additionally, cross-image components are added to the self-attention mechanism by expanding the Key and Value matrices. To do without shifting from the target style, we align the statistics of the Value matrices. As is demonstrated in a comprehensive battery of qualitative and quantitative experiments, our method effectively decouples style from subject appearance and enables faithful generation of text-aligned images with consistent characters across diverse styles. Code will be available at our project page: jbruner23.github.io/consistyle.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"4 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}