Hajin Choi, Seokpyo Hong, Inwoo Ha, Nahyup Kang, Bochang Moon
Generating a rendered image sequence through Monte Carlo ray tracing is an appealing option when one aims to accurately simulate various lighting effects. Unfortunately, interactive rendering scenarios limit the allowable sample size for such sampling-based light transport algorithms, resulting in an unbiased but noisy image sequence. Image denoising has been widely adopted as a post-sampling process to convert such noisy image sequences into biased but temporally stable ones. The state-of-the-art strategy for interactive image denoising involves devising a deep neural network and training this network via supervised learning, i.e., optimizing the network parameters using training datasets that include an extensive set of image pairs (noisy and ground truth images). This paper adopts the prevalent approach for interactive image denoising, which relies on a neural network. However, instead of supervised learning, we propose a different learning strategy that trains our network parameters on the fly, i.e., updating them online using runtime image sequences. To achieve our denoising objective with online learning, we tailor local regression to a cross-regression form that can guide robust training of our denoising neural network. We demonstrate that our denoising framework effectively reduces noise in input image sequences while robustly preserving both geometric and non-geometric edges, without requiring the manual effort involved in preparing an external dataset.
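To make the online-learning idea concrete, the following is a minimal, hypothetical sketch (PyTorch) of a self-supervised update driven by two independently sampled half buffers, where each half serves as the regression target for the denoised other half. The toy network, tensor names, and symmetric L2 loss are assumptions for illustration, not the paper's cross-regression formulation or architecture.

```python
import torch
import torch.nn as nn

# Toy denoiser: a small convolutional network (stand-in for the paper's architecture).
denoiser = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
optimizer = torch.optim.Adam(denoiser.parameters(), lr=1e-4)

def online_step(half_a, half_b):
    """One online update from two independent half-sample buffers of shape (N, 3, H, W).

    Each half is denoised and regressed against the *other* half, whose noise is
    independent, so the expected loss is minimized by the true radiance.
    """
    out_a = denoiser(half_a)
    out_b = denoiser(half_b)
    loss = ((out_a - half_b) ** 2).mean() + ((out_b - half_a) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # The displayed frame can be the average of the two denoised halves.
    return 0.5 * (out_a.detach() + out_b.detach())
```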
{"title":"Online Neural Denoising with Cross-Regression for Interactive Rendering","authors":"Hajin Choi, Seokpyo Hong, Inwoo Ha, Nahyup Kang, Bochang Moon","doi":"10.1145/3687938","DOIUrl":"https://doi.org/10.1145/3687938","url":null,"abstract":"Generating a rendered image sequence through Monte Carlo ray tracing is an appealing option when one aims to accurately simulate various lighting effects. Unfortunately, interactive rendering scenarios limit the allowable sample size for such sampling-based light transport algorithms, resulting in an unbiased but noisy image sequence. Image denoising has been widely adopted as a post-sampling process to convert such noisy image sequences into biased but temporally stable ones. The state-of-the-art strategy for interactive image denoising involves devising a deep neural network and training this network via supervised learning, i.e., optimizing the network parameters using training datasets that include an extensive set of image pairs (noisy and ground truth images). This paper adopts the prevalent approach for interactive image denoising, which relies on a neural network. However, instead of supervised learning, we propose a different learning strategy that trains our network parameters on the fly, i.e., updating them online using runtime image sequences. To achieve our denoising objective with online learning, we tailor local regression to a cross-regression form that can guide robust training of our denoising neural network. We demonstrate that our denoising framework effectively reduces noise in input image sequences while robustly preserving both geometric and non-geometric edges, without requiring the manual effort involved in preparing an external dataset.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"18 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we present an implicit surface reconstruction method with 3D Gaussian Splatting (3DGS), namely 3DGSR, that allows for accurate 3D reconstruction with intricate details while inheriting the high efficiency and rendering quality of 3DGS. The key insight is to incorporate an implicit signed distance field (SDF) within 3D Gaussians for surface modeling, and to enable the alignment and joint optimization of both SDF and 3D Gaussians. To achieve this, we design coupling strategies that align and associate the SDF with 3D Gaussians, allowing for unified optimization and enforcing surface constraints on the 3D Gaussians. With alignment, optimizing the 3D Gaussians provides supervisory signals for SDF learning, enabling the reconstruction of intricate details. However, this only offers sparse supervisory signals to the SDF at locations occupied by Gaussians, which is insufficient for learning a continuous SDF. To address this limitation, we incorporate volumetric rendering and align the rendered geometric attributes (depth, normal) with those derived from 3DGS. In sum, these two designs allow SDF and 3DGS to be aligned, jointly optimized, and mutually boosted. Our extensive experimental results demonstrate that our 3DGSR enables high-quality 3D surface reconstruction while preserving the efficiency and rendering quality of 3DGS. Moreover, our method competes favorably with leading surface reconstruction techniques while offering a more efficient learning process and much better rendering quality.
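As a rough illustration of the alignment idea, the sketch below (PyTorch, hypothetical tensor names and weights) combines a depth consistency term with a normal consistency term between the SDF's volume-rendered attributes and those rendered from the Gaussians; it is a simplified stand-in for the paper's coupling strategies.

```python
import torch
import torch.nn.functional as F

def alignment_loss(depth_sdf, depth_gs, normal_sdf, normal_gs, w_d=1.0, w_n=0.1):
    """Consistency between geometry rendered from the SDF (volume rendering)
    and from the 3D Gaussians. depth_*: (H, W); normal_*: (H, W, 3)."""
    depth_term = (depth_sdf - depth_gs).abs().mean()
    # 1 - cosine similarity penalizes misaligned surface orientations.
    normal_term = (1.0 - F.cosine_similarity(normal_sdf, normal_gs, dim=-1)).mean()
    return w_d * depth_term + w_n * normal_term
```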
{"title":"3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting","authors":"Xiaoyang Lyu, Yang-Tian Sun, Yi-Hua Huang, Xiuzhe Wu, Ziyi Yang, Yilun Chen, Jiangmiao Pang, Xiaojuan Qi","doi":"10.1145/3687952","DOIUrl":"https://doi.org/10.1145/3687952","url":null,"abstract":"In this paper, we present an implicit surface reconstruction method with 3D Gaussian Splatting (3DGS), namely 3DGSR, that allows for accurate 3D reconstruction with intricate details while inheriting the high efficiency and rendering quality of 3DGS. The key insight is to incorporate an implicit signed distance field (SDF) within 3D Gaussians for surface modeling, and to enable the alignment and joint optimization of both SDF and 3D Gaussians. To achieve this, we design coupling strategies that align and associate the SDF with 3D Gaussians, allowing for unified optimization and enforcing surface constraints on the 3D Gaussians. With alignment, optimizing the 3D Gaussians provides supervisory signals for SDF learning, enabling the reconstruction of intricate details. However, this only offers sparse supervisory signals to the SDF at locations occupied by Gaussians, which is insufficient for learning a continuous SDF. Then, to address this limitation, we incorporate volumetric rendering and align the rendered geometric attributes (depth, normal) with that derived from 3DGS. In sum, these two designs allow SDF and 3DGS to be aligned, jointly optimized, and mutually boosted. Our extensive experimental results demonstrate that our 3DGSR enables high-quality 3D surface reconstruction while preserving the efficiency and rendering quality of 3DGS. Besides, our method competes favorably with leading surface reconstruction techniques while offering a more efficient learning process and much better rendering qualities.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"176 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142672827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kehan Xu, Sebastian Herholz, Marco Manzi, Marios Papas, Markus Gross
Simulating the light transport of volumetric effects poses significant challenges and costs, especially in the presence of heterogeneous volumes. Generating stochastic paths for volume rendering involves multiple decisions, and previous works mainly focused on directional and distance sampling, where the volume scattering probability (VSP), i.e., the probability of scattering inside a volume, is indirectly determined as a byproduct of distance sampling. We demonstrate that direct control over the VSP can significantly improve efficiency and present an unbiased volume rendering algorithm based on an existing resampling framework for precise control over the VSP. Compared to previous state-of-the-art, which can only increase the VSP without guaranteeing to reach the desired value, our method also supports decreasing the VSP. We further present a data-driven guiding framework to efficiently learn and query an approximation of the optimal VSP everywhere in the scene without the need for user control. Our approach can easily be combined with existing path-guiding methods for directional sampling at minimal overhead and shows significant improvements over the state-of-the-art in various complex volumetric lighting scenarios.
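To illustrate what direct VSP control means, here is a minimal, hedged sketch for a single homogeneous segment: the scatter/no-scatter decision is made with a user-chosen probability q rather than the analytic one, and the sample is reweighted by the ratio of the two so the estimator remains unbiased. The homogeneous medium, the function names, and the absence of any guiding network are simplifying assumptions, not the paper's resampling framework.

```python
import math, random

def sample_segment(sigma_t, d, q):
    """Decide scattering inside [0, d] with chosen probability q and reweight.

    Returns (scattered, t, weight): the unbiased weight is the ratio between the
    analytic branch probability and the probability q actually used to choose it.
    """
    p_scatter = 1.0 - math.exp(-sigma_t * d)   # analytic volume scattering probability
    if random.random() < q:
        # Scatter: draw t from the exponential distribution truncated to [0, d].
        u = random.random()
        t = -math.log(1.0 - u * p_scatter) / sigma_t
        return True, t, p_scatter / q
    else:
        # No scattering: the path continues to the surface at distance d.
        return False, d, (1.0 - p_scatter) / (1.0 - q)
```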
{"title":"Volume Scattering Probability Guiding","authors":"Kehan Xu, Sebastian Herholz, Marco Manzi, Marios Papas, Markus Gross","doi":"10.1145/3687982","DOIUrl":"https://doi.org/10.1145/3687982","url":null,"abstract":"Simulating the light transport of volumetric effects poses significant challenges and costs, especially in the presence of heterogeneous volumes. Generating stochastic paths for volume rendering involves multiple decisions, and previous works mainly focused on directional and distance sampling, where the volume scattering probability (VSP), i.e., the probability of scattering inside a volume, is indirectly determined as a byproduct of distance sampling. We demonstrate that direct control over the VSP can significantly improve efficiency and present an unbiased volume rendering algorithm based on an existing resampling framework for precise control over the VSP. Compared to previous state-of-the-art, which can only increase the VSP without guaranteeing to reach the desired value, our method also supports decreasing the VSP. We further present a data-driven guiding framework to efficiently learn and query an approximation of the optimal VSP everywhere in the scene without the need for user control. Our approach can easily be combined with existing path-guiding methods for directional sampling at minimal overhead and shows significant improvements over the state-of-the-art in various complex volumetric lighting scenarios.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"14 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learned optics, which incorporate lightweight diffractive optics, coded-aperture modulation, and specialized image-processing neural networks, have recently garnered attention in the field of snapshot hyperspectral imaging (HSI). While conventional methods typically rely on a single lens element paired with an off-the-shelf color sensor, these setups, despite their widespread availability, present inherent limitations. First, the Bayer sensor's spectral response curves are not optimized for HSI applications, limiting the spectral fidelity of the reconstruction. Second, single-lens designs rely on a single diffractive optical element (DOE) to simultaneously encode spectral information and maintain spatial resolution across all wavelengths, which constrains spectral encoding capabilities. This work investigates a multi-channel lens array combined with aperture-wise color filters, all co-optimized alongside an image reconstruction network. This configuration enables independent spatial encoding and spectral response for each channel, improving optical encoding across both spatial and spectral dimensions. Specifically, we validate that the method achieves over a 5 dB improvement in PSNR for spectral reconstruction compared to existing single-diffractive-lens and coded-aperture techniques. Experimental validation further confirms that the method is capable of recovering up to 31 spectral bands within the 429--700 nm range in diverse indoor and outdoor environments.
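As a hedged illustration of the forward model behind such a design, the sketch below (NumPy) simulates one aperture channel: each spectral band of the scene is convolved with that channel's wavelength-dependent PSF and weighted by the channel's color-filter response before summation on the sensor. Circular (FFT) boundary handling and all names are illustrative assumptions; the actual optical model and co-optimization in the paper are more involved.

```python
import numpy as np
from numpy.fft import fft2, ifft2

def render_channel(hsi_cube, psfs, filter_response):
    """Simulate one aperture channel's monochrome measurement.

    hsi_cube:        (L, H, W) scene radiance per wavelength band
    psfs:            (L, H, W) per-wavelength PSF of this channel's DOE (centered)
    filter_response: (L,)      this channel's color-filter transmission
    """
    L, H, W = hsi_cube.shape
    sensor = np.zeros((H, W))
    for l in range(L):
        # Circular convolution via FFT; ifftshift moves the PSF center to the origin.
        blurred = np.real(ifft2(fft2(hsi_cube[l]) * fft2(np.fft.ifftshift(psfs[l]))))
        sensor += filter_response[l] * blurred
    return sensor
```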
{"title":"Learned Multi-aperture Color-coded Optics for Snapshot Hyperspectral Imaging","authors":"Zheng Shi, Xiong Dun, Haoyu Wei, Siyu Dong, Zhanshan Wang, Xinbin Cheng, Felix Heide, Yifan Peng","doi":"10.1145/3687976","DOIUrl":"https://doi.org/10.1145/3687976","url":null,"abstract":"Learned optics, which incorporate lightweight diffractive optics, coded-aperture modulation, and specialized image-processing neural networks, have recently garnered attention in the field of snapshot hyperspectral imaging (HSI). While conventional methods typically rely on a single lens element paired with an off-the-shelf color sensor, these setups, despite their widespread availability, present inherent limitations. First, the Bayer sensor's spectral response curves are not optimized for HSI applications, limiting spectral fidelity of the reconstruction. Second, single lens designs rely on a single diffractive optical element (DOE) to simultaneously encode spectral information and maintain spatial resolution across all wavelengths, which constrains spectral encoding capabilities. This work investigates a multi-channel lens array combined with aperture-wise color filters, all co-optimized alongside an image reconstruction network. This configuration enables independent spatial encoding and spectral response for each channel, improving optical encoding across both spatial and spectral dimensions. Specifically, we validate that the method achieves over a 5dB improvement in PSNR for spectral reconstruction compared to existing single-diffractive lens and coded-aperture techniques. Experimental validation further confirmed that the method is capable of recovering up to 31 spectral bands within the 429--700 nm range in diverse indoor and outdoor environments.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"25 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Steffen Hinderink, Hendrik Brückler, Marcel Campen
A method for the construction of bijective volumetric maps between 3D shapes is presented. Arbitrary shapes of ball-topology are supported, overcoming restrictions of previous methods to convex or star-shaped targets. In essence, the mapping problem is decomposed into a set of simpler mapping problems, each of which can be solved with previous methods for discrete star-shaped mapping problems. Addressing the key challenges in this endeavor, algorithms are described to reliably construct structurally compatible partitions of two shapes with constraints regarding star-shapedness and to compute a parsimonious common refinement of two triangulations.
{"title":"Bijective Volumetric Mapping via Star Decomposition","authors":"Steffen Hinderink, Hendrik Brückler, Marcel Campen","doi":"10.1145/3687950","DOIUrl":"https://doi.org/10.1145/3687950","url":null,"abstract":"A method for the construction of bijective volumetric maps between 3D shapes is presented. Arbitrary shapes of ball-topology are supported, overcoming restrictions of previous methods to convex or star-shaped targets. In essence, the mapping problem is decomposed into a set of simpler mapping problems, each of which can be solved with previous methods for discrete star-shaped mapping problems. Addressing the key challenges in this endeavor, algorithms are described to reliably construct structurally compatible partitions of two shapes with constraints regarding star-shapedness and to compute a parsimonious common refinement of two triangulations.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"25 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a method to reproduce dynamic appearance textures with space-stationary but time-varying visual statistics. While most previous work decomposes dynamic textures into static appearance and motion, we focus on dynamic appearance that results not from motion but from variations in fundamental properties, such as rusting, decaying, melting, and weathering. To this end, we adopt the neural ordinary differential equation (ODE) to learn the underlying dynamics of appearance from a target exemplar. We simulate the ODE in two phases. In the "warm-up" phase, the ODE diffuses a random noise to an initial state. We then constrain the further evolution of this ODE to replicate the evolution of visual feature statistics in the exemplar during the generation phase. The particular innovation of this work is the neural ODE achieving both denoising and evolution for dynamics synthesis, with a proposed temporal training scheme. We study both relightable (BRDF) and non-relightable (RGB) appearance models. For both, we introduce new pilot datasets that allow, for the first time, such phenomena to be studied: for RGB, we provide 22 dynamic textures acquired from free online sources; for BRDFs, we further acquire a dataset of 21 flash-lit videos of time-varying materials, enabled by a simple-to-construct setup. Our experiments show that our method consistently yields realistic and coherent results, whereas prior works falter under pronounced temporal appearance variations. A user study confirms our approach is preferred to previous work for such exemplars.
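The core mechanism, a learned time derivative of an appearance state integrated forward in time, can be sketched as follows (PyTorch, explicit Euler integration); the tiny MLP and step size are placeholders, and the statistics-matching training loss is omitted.

```python
import torch
import torch.nn as nn

class AppearanceODE(nn.Module):
    """dx/dt = f_theta(x): a learned derivative of the appearance state."""
    def __init__(self, dim=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, 128), nn.Tanh(), nn.Linear(128, dim))

    def forward(self, x0, t_steps, dt=0.05):
        """Integrate with explicit Euler and return the whole trajectory."""
        traj = [x0]
        x = x0
        for _ in range(t_steps):
            x = x + dt * self.f(x)
            traj.append(x)
        return torch.stack(traj, dim=0)

# Warm-up phase: start from random noise and let the ODE settle to an initial state;
# the generation phase would then match feature statistics of the exemplar frames.
ode = AppearanceODE(dim=64)
x0 = torch.randn(1, 64)
trajectory = ode(x0, t_steps=100)
```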
{"title":"Neural Differential Appearance Equations","authors":"Chen Liu, Tobias Ritschel","doi":"10.1145/3687900","DOIUrl":"https://doi.org/10.1145/3687900","url":null,"abstract":"We propose a method to reproduce dynamic appearance textures with space-stationary but time-varying visual statistics. While most previous work decomposes dynamic textures into static appearance and motion, we focus on dynamic appearance that results not from motion but variations of fundamental properties, such as rusting, decaying, melting, and weathering. To this end, we adopt the neural ordinary differential equation (ODE) to learn the underlying dynamics of appearance from a target exemplar. We simulate the ODE in two phases. At the \"warm-up\" phase, the ODE diffuses a random noise to an initial state. We then constrain the further evolution of this ODE to replicate the evolution of visual feature statistics in the exemplar during the generation phase. The particular innovation of this work is the neural ODE achieving both denoising and evolution for dynamics synthesis, with a proposed temporal training scheme. We study both relightable (BRDF) and non-relightable (RGB) appearance models. For both we introduce new pilot datasets, allowing, for the first time, to study such phenomena: For RGB we provide 22 dynamic textures acquired from free online sources; For BRDFs, we further acquire a dataset of 21 flash-lit videos of time-varying materials, enabled by a simple-to-construct setup. Our experiments show that our method consistently yields realistic and coherent results, whereas prior works falter under pronounced temporal appearance variations. A user study confirms our approach is preferred to previous work for such exemplars.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"53 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bailey Miller, Rohan Sawhney, Keenan Crane, Ioannis Gkioulekas
We introduce a Monte Carlo method for computing derivatives of the solution to a partial differential equation (PDE) with respect to problem parameters (such as domain geometry or boundary conditions). Derivatives can be evaluated at arbitrary points, without performing a global solve or constructing a volumetric grid or mesh. The method is hence well suited to inverse problems with complex geometry, such as PDE-constrained shape optimization. Like other walk on spheres (WoS) algorithms, our method is trivial to parallelize, and is agnostic to boundary representation (meshes, splines, implicit surfaces, etc.), supporting large topological changes. We focus in particular on screened Poisson equations, which model diverse problems from scientific and geometric computing. As in differentiable rendering, we jointly estimate derivatives with respect to all parameters---hence, cost does not grow significantly with parameter count. In practice, even noisy derivative estimates exhibit fast, stable convergence for stochastic gradient-based optimization, as we show through examples from thermal design, shape from diffusion, and computer graphics.
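For context, a minimal primal (non-differential) walk-on-spheres estimator for a Laplace problem with Dirichlet data might look like the sketch below; the differential method additionally propagates derivatives of these quantities with respect to scene parameters. The domain, boundary data, and names here are illustrative assumptions.

```python
import numpy as np

def walk_on_spheres(x0, sdf, boundary_value, eps=1e-3, max_steps=1000, rng=np.random):
    """Estimate u(x0) for Laplace's equation (zero source) with Dirichlet data.

    sdf(x): distance to the domain boundary; boundary_value(x): Dirichlet value.
    Each step jumps to a uniformly random point on the largest empty sphere.
    """
    x = np.array(x0, dtype=float)
    for _ in range(max_steps):
        r = sdf(x)
        if r < eps:
            return boundary_value(x)
        # Uniform direction on the unit sphere in R^3.
        d = rng.normal(size=3)
        d /= np.linalg.norm(d)
        x = x + r * d
    return boundary_value(x)  # fallback if the walk did not converge

# Example: unit ball with boundary value u = z; average many walks for the estimate.
sdf = lambda x: 1.0 - np.linalg.norm(x)
g = lambda x: x[2]
estimate = np.mean([walk_on_spheres([0.2, 0.1, 0.3], sdf, g) for _ in range(2000)])
```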
{"title":"Differential Walk on Spheres","authors":"Bailey Miller, Rohan Sawhney, Keenan Crane, Ioannis Gkioulekas","doi":"10.1145/3687913","DOIUrl":"https://doi.org/10.1145/3687913","url":null,"abstract":"We introduce a Monte Carlo method for computing derivatives of the solution to a partial differential equation (PDE) with respect to problem parameters (such as domain geometry or boundary conditions). Derivatives can be evaluated at arbitrary points, without performing a global solve or constructing a volumetric grid or mesh. The method is hence well suited to inverse problems with complex geometry, such as PDE-constrained shape optimization. Like other <jats:italic>walk on spheres (WoS)</jats:italic> algorithms, our method is trivial to parallelize, and is agnostic to boundary representation (meshes, splines, implicit surfaces, <jats:italic>etc.</jats:italic> ), supporting large topological changes. We focus in particular on screened Poisson equations, which model diverse problems from scientific and geometric computing. As in differentiable rendering, we jointly estimate derivatives with respect to all parameters---hence, cost does not grow significantly with parameter count. In practice, even noisy derivative estimates exhibit fast, stable convergence for stochastic gradient-based optimization, as we show through examples from thermal design, shape from diffusion, and computer graphics.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"39 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142672833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Understanding the part composition and structure of 3D shapes is crucial for a wide range of 3D applications, including 3D part assembly and 3D assembly completion. Compared to 3D part assembly, 3D assembly completion is more complicated: it involves repairing broken or incomplete furniture that is missing several parts, using a toolkit. The primary challenge lies in how to reveal the potential part relations, infer the absent parts from multiple indistinguishable candidates with similar geometries, and produce well-connected, structurally stable, and aesthetically pleasing assemblies. This task necessitates not only specialized knowledge of part composition but, more importantly, an awareness of physical constraints, i.e., connectivity, stability, and symmetry. Neglecting these constraints often results in assemblies that, although visually plausible, are impractical. To address this challenge, we propose PhysFiT, a physical-aware 3D shape understanding framework. The framework is built upon attention-based part relation modeling and incorporates connection modeling, simulation-free stability optimization, and symmetric transformation consistency. We evaluate its efficacy on 3D part assembly and on 3D assembly completion, a novel assembly task presented in this work. Extensive experiments demonstrate the effectiveness of PhysFiT in constructing geometrically sound and physically compliant assemblies.
{"title":"PhysFiT: Physical-aware 3D Shape Understanding for Finishing Incomplete Assembly","authors":"Weihao Wang, Mingyu You, Hongjun Zhou, Bin He","doi":"10.1145/3702226","DOIUrl":"https://doi.org/10.1145/3702226","url":null,"abstract":"Understanding the part composition and structure of 3D shapes is crucial for a wide range of 3D applications, including 3D part assembly and 3D assembly completion. Compared to 3D part assembly, 3D assembly completion is more complicated which involves repairing broken or incomplete furniture that miss several parts with a toolkit. The primary challenge persists in how to reveal the potential part relations to infer the absent parts from multiple indistinguishable candidates with similar geometries, and complete for well-connected, structurally stable and aesthetically pleasing assemblies. This task necessitates not only specialized knowledge of part composition but, more importantly, an awareness of physical constraints, <jats:italic>i.e.</jats:italic> , connectivity, stability, and symmetry. Neglecting these constraints often results in assemblies that, although visually plausible, are impractical. To address this challenge, we propose PhysFiT, a physical-aware 3D shape understanding framework. This framework is built upon attention-based part relation modeling and incorporates connection modeling, simulation-free stability optimization and symmetric transformation consistency. We evaluate its efficacy on 3D part assembly and 3D assembly completion, a novel assembly task presented in this work. Extensive experiments demonstrate the effectiveness of PhysFiT in constructing geometrically sound and physically compliant assemblies.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"17 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142541706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Implicit volumes are known for their ability to represent smooth shapes of arbitrary topology thanks to hierarchical combinations of primitives using a structure called a blobtree. We present a new tile-based rendering pipeline well suited for modeling scenarios, i.e., no preprocessing is required when primitive parameters are updated. When using approximate signed distance fields (fields with a Lipschitz bound close to 1), we rely on compact, smooth CSG operators, extended from standard bounded operators, to compute a tight augmented bounding volume for all primitives of the blobtree. The pipeline relies on a low-resolution A-buffer storing the primitives of interest for a given screen tile. The A-buffer is then used during ray processing to synchronize threads within a subfrustum, allowing coherent field evaluation within workgroups. We use a sparse bottom-up tree traversal to prune the blobtree on the fly, which allows us to decorrelate field evaluation complexity from the full blobtree size. The ray processing itself is done using the sphere tracing algorithm. The pipeline scales well to volumes consisting of thousands of primitives.
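The ray processing step is standard sphere tracing against the (approximate) signed distance field; below is a minimal CPU-side sketch, assuming a Lipschitz bound close to 1 and illustrative names rather than the paper's GPU pipeline.

```python
import numpy as np

def sphere_trace(origin, direction, field, eps=1e-4, t_max=100.0, max_steps=256):
    """March along the ray by the field value; valid when the field is a
    (near-)conservative distance bound, i.e., Lipschitz bound close to 1."""
    t = 0.0
    d = direction / np.linalg.norm(direction)
    for _ in range(max_steps):
        p = origin + t * d
        dist = field(p)
        if dist < eps:
            return t, p          # hit: surface reached within tolerance
        t += dist                # safe step: no surface closer than dist
        if t > t_max:
            break
    return None, None            # miss

# Example: a sphere of radius 1 centered at the origin.
field = lambda p: np.linalg.norm(p) - 1.0
t_hit, p_hit = sphere_trace(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]), field)
```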
{"title":"Synchronized tracing of primitive-based implicit volumes","authors":"Cédric Zanni","doi":"10.1145/3702227","DOIUrl":"https://doi.org/10.1145/3702227","url":null,"abstract":"Implicit volumes are known for their ability to represent smooth shapes of arbitrary topology thanks to hierarchical combinations of primitives using a structure called a blobtree. We present a new tile-based rendering pipeline well suited for modeling scenarios, i.e., no preprocessing is required when primitive parameters are updated. When using approximate signed distance fields (fields with Lipschitz bound close to 1), we rely on compact, smooth CSG operators - extended from standard bounded operators - to compute a tight augmented bounding volume for all primitives of the blobtree. The pipeline relies on a low-resolution A-buffer storing the primitives of interest of a given screen tile. The A-buffer is then used during ray processing to synchronize threads within a subfrustum. This allows coherent field evaluation within workgroups. We use a sparse bottom-up tree traversal to prune the blobtree on-the-fly which allows us to decorrelate field evaluation complexity from the full blobtree size. The ray processing itself is done using the sphere tracing algorithm. The pipeline scales well to volumes consisting of thousands of primitives.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"6 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142536810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Heming Zhu, Fangneng Zhan, Christian Theobalt, Marc Habermann
Creating controllable, photorealistic, and geometrically detailed digital doubles of real humans solely from video data is a key challenge in Computer Graphics and Vision, especially when real-time performance is required. Recent methods attach a neural radiance field (NeRF) to an articulated structure, e.g., a body model or a skeleton, to map points into a pose-canonical space while conditioning the NeRF on the skeletal pose. These approaches typically parameterize the neural field with a multi-layer perceptron (MLP), leading to a slow runtime. To address this drawback, we propose TriHuman, a novel human-tailored, deformable, and efficient tri-plane representation, which achieves real-time performance, state-of-the-art pose-controllable geometry synthesis, as well as photorealistic rendering quality. At the core, we non-rigidly warp global ray samples into our undeformed tri-plane texture space, which effectively addresses the problem of global points being mapped to the same tri-plane locations. We then show how such a tri-plane feature representation can be conditioned on the skeletal motion to account for dynamic appearance and geometry changes. Our results demonstrate a clear step towards higher quality in terms of the geometry and appearance modeling of humans, as well as runtime performance.
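A tri-plane lookup projects a 3D point onto three axis-aligned feature planes, bilinearly samples each, and fuses the results; the sketch below (PyTorch) shows this basic lookup with assumed dimensions and summation as the fusion, not the paper's deformable, motion-conditioned variant.

```python
import torch
import torch.nn.functional as F

def triplane_lookup(points, plane_xy, plane_xz, plane_yz):
    """points: (N, 3) in [-1, 1]^3; planes: (1, C, H, W) feature grids.
    Returns (N, C) features as the sum of the three bilinear samples."""
    def sample(plane, uv):
        # grid_sample takes a (1, 1, N, 2) grid of normalized coordinates here.
        grid = uv.view(1, 1, -1, 2)
        feat = F.grid_sample(plane, grid, mode='bilinear', align_corners=True)
        return feat.view(plane.shape[1], -1).t()          # (N, C)

    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    f_xy = sample(plane_xy, torch.stack([x, y], dim=-1))
    f_xz = sample(plane_xz, torch.stack([x, z], dim=-1))
    f_yz = sample(plane_yz, torch.stack([y, z], dim=-1))
    return f_xy + f_xz + f_yz

# Example: C=32 features on 128x128 planes.
planes = [torch.randn(1, 32, 128, 128) for _ in range(3)]
feats = triplane_lookup(torch.rand(1000, 3) * 2 - 1, *planes)   # (1000, 32)
```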
{"title":"TriHuman : A Real-time and Controllable Tri-plane Representation for Detailed Human Geometry and Appearance Synthesis","authors":"Heming Zhu, Fangneng Zhan, Christian Theobalt, Marc Habermann","doi":"10.1145/3697140","DOIUrl":"https://doi.org/10.1145/3697140","url":null,"abstract":"Creating controllable, photorealistic, and geometrically detailed digital doubles of real humans solely from video data is a key challenge in Computer Graphics and Vision, especially when real-time performance is required. Recent methods attach a neural radiance field (NeRF) to an articulated structure, e.g., a body model or a skeleton, to map points into a pose canonical space while conditioning the NeRF on the skeletal pose. These approaches typically parameterize the neural field with a multi-layer perceptron (MLP) leading to a slow runtime. To address this drawback, we propose <jats:italic>TriHuman</jats:italic> a novel human-tailored, deformable, and efficient tri-plane representation, which achieves real-time performance, state-of-the-art pose-controllable geometry synthesis as well as photorealistic rendering quality. At the core, we non-rigidly warp global ray samples into our undeformed tri-plane texture space, which effectively addresses the problem of global points being mapped to the same tri-plane locations. We then show how such a tri-plane feature representation can be conditioned on the skeletal motion to account for dynamic appearance and geometry changes. Our results demonstrate a clear step towards higher quality in terms of geometry and appearance modeling of humans as well as runtime performance.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"4 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142374643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}