This paper addresses the challenge of reconstructing long volumetric videos from multi-view RGB videos. Recent dynamic view synthesis methods leverage powerful 4D representations, such as feature grids or point cloud sequences, to achieve high-quality rendering results. However, they are typically limited to short (1-2 s) video clips and often suffer from large memory footprints when dealing with longer videos. To solve this issue, we propose a novel 4D representation, named Temporal Gaussian Hierarchy, to compactly model long volumetric videos. Our key observation is that dynamic scenes generally exhibit varying degrees of temporal redundancy, as they consist of areas that change at different speeds. Motivated by this, our approach builds a multi-level hierarchy of 4D Gaussian primitives, where each level separately describes scene regions with different degrees of content change and adaptively shares Gaussian primitives to represent unchanged scene content over different temporal segments, thus effectively reducing the number of Gaussian primitives. In addition, the tree-like structure of the Gaussian hierarchy allows us to efficiently represent the scene at a particular moment with a subset of Gaussian primitives, leading to nearly constant GPU memory usage during training and rendering, regardless of the video length. Moreover, we design a Compact Appearance Model that mixes diffuse and view-dependent Gaussians to further minimize the model size while maintaining rendering quality. We also develop a hardware-accelerated rasterization pipeline for Gaussian primitives to improve rendering speed. Extensive experimental results demonstrate the superiority of our method over alternative methods in terms of training cost, rendering speed, and storage usage. To our knowledge, this work is the first approach capable of efficiently handling hours of volumetric video data while maintaining state-of-the-art rendering quality.
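For intuition, the query pattern such a hierarchy enables can be sketched in a few lines of Python. The layout below (level L splitting the video into 2^L equal temporal segments, each segment owning the primitives that stay static over it) and all class names are illustrative assumptions, not the paper's actual data structures.

```python
# Minimal sketch: querying a temporal hierarchy of Gaussian primitives.
# At time t, one segment per level is active, so the working set is
# independent of the total video length.
from dataclasses import dataclass, field

@dataclass
class TemporalSegment:
    start: float                                       # segment start (s)
    end: float                                         # segment end (s)
    gaussian_ids: list = field(default_factory=list)   # primitives owned here

class TemporalGaussianHierarchy:
    def __init__(self, duration: float, num_levels: int):
        self.duration = duration
        self.levels = []
        for level in range(num_levels):
            n = 2 ** level                             # finer segments per level
            seg_len = duration / n
            self.levels.append(
                [TemporalSegment(i * seg_len, (i + 1) * seg_len) for i in range(n)])

    def active_gaussians(self, t: float) -> list:
        """Gather the primitives needed to render time t."""
        active = []
        for segments in self.levels:
            idx = min(int(t / self.duration * len(segments)), len(segments) - 1)
            active.extend(segments[idx].gaussian_ids)
        return active
```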
{"title":"Representing Long Volumetric Video with Temporal Gaussian Hierarchy","authors":"Zhen Xu, Yinghao Xu, Zhiyuan Yu, Sida Peng, Jiaming Sun, Hujun Bao, Xiaowei Zhou","doi":"10.1145/3687919","DOIUrl":"https://doi.org/10.1145/3687919","url":null,"abstract":"This paper aims to address the challenge of reconstructing long volumetric videos from multi-view RGB videos. Recent dynamic view synthesis methods leverage powerful 4D representations, like feature grids or point cloud sequences, to achieve high-quality rendering results. However, they are typically limited to short (1~2s) video clips and often suffer from large memory footprints when dealing with longer videos. To solve this issue, we propose a novel 4D representation, named Temporal Gaussian Hierarchy, to compactly model long volumetric videos. Our key observation is that there are generally various degrees of temporal redundancy in dynamic scenes, which consist of areas changing at different speeds. Motivated by this, our approach builds a multi-level hierarchy of 4D Gaussian primitives, where each level separately describes scene regions with different degrees of content change, and adaptively shares Gaussian primitives to represent unchanged scene content over different temporal segments, thus effectively reducing the number of Gaussian primitives. In addition, the tree-like structure of the Gaussian hierarchy allows us to efficiently represent the scene at a particular moment with a subset of Gaussian primitives, leading to nearly constant GPU memory usage during the training or rendering regardless of the video length. Moreover, we design a Compact Appearance Model that mixes diffuse and view-dependent Gaussians to further minimize the model size while maintaining the rendering quality. We also develop a rasterization pipeline of Gaussian primitives based on the hardware-accelerated technique to improve rendering speed. Extensive experimental results demonstrate the superiority of our method over alternative methods in terms of training cost, rendering speed, and storage usage. To our knowledge, this work is the first approach capable of efficiently handling hours of volumetric video data while maintaining state-of-the-art rendering quality.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"99 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142672839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Time-of-flight (ToF) devices have greatly propelled the advancement of various multi-modal perception applications. However, achieving accurate rendering of time-resolved information remains a challenge, particularly in scenes involving complex geometries, diverse materials, and participating media. Existing ToF rendering works have demonstrated notable results, yet they struggle with scenes involving scattering media and camera-warped settings. Other steady-state volumetric rendering methods exhibit significant bias or variance when directly applied to ToF rendering tasks. To address these challenges, we integrate transient diffusion theory into path construction and propose novel sampling methods for free-path distance and scattering direction via resampled importance sampling and offline tabulation. An elliptical sampling method is further adapted to provide controllable vertex connections satisfying any required photon traversal time. In contrast to the existing temporal uniform sampling strategy, our method is the first to consider the contribution of transient radiance when importance-sampling the full path, and thus enables improved temporal path construction under multiple scattering settings. The proposed method can be integrated into both path tracing and photon-based frameworks, delivering significant improvements in quality and efficiency, with at least a 5x MSE reduction versus SOTA methods at equal rendering time.
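The elliptical connection step has a simple geometric core: every vertex x with |x - a| + |x - b| = d, where d is the remaining photon traversal distance, lies on a prolate spheroid with foci a and b. The sketch below samples such a vertex with uniform angular sampling, which is an illustrative stand-in for the paper's importance-sampled version.

```python
# Minimal sketch: sample a connecting vertex on the prolate spheroid whose
# points satisfy |x - a| + |x - b| = d (constant total traversal distance).
import numpy as np

def sample_ellipsoid_vertex(a, b, d, rng=None):
    rng = rng or np.random.default_rng()
    a, b = np.asarray(a, float), np.asarray(b, float)
    f = np.linalg.norm(b - a) / 2.0              # focal half-distance
    if d <= 2.0 * f:
        raise ValueError("required path length is shorter than |a - b|")
    A = d / 2.0                                  # semi-major axis
    B = np.sqrt(A * A - f * f)                   # semi-minor axes
    # Build a local frame with the z-axis along the foci.
    z = (b - a) / (2.0 * f) if f > 1e-12 else np.array([0.0, 0.0, 1.0])
    tmp = np.array([1.0, 0.0, 0.0]) if abs(z[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    x = np.cross(z, tmp); x /= np.linalg.norm(x)
    y = np.cross(z, x)
    cos_t = 1.0 - 2.0 * rng.random()             # uniform in [-1, 1]
    sin_t = np.sqrt(1.0 - cos_t * cos_t)
    phi = 2.0 * np.pi * rng.random()
    local = (B * sin_t * np.cos(phi) * x + B * sin_t * np.sin(phi) * y
             + A * cos_t * z)
    return (a + b) / 2.0 + local                 # point on the spheroid
```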
{"title":"DARTS: Diffusion Approximated Residual Time Sampling for Time-of-flight Rendering in Homogeneous Scattering Media","authors":"Qianyue He, Dongyu Du, Haitian Jiang, Xin Jin","doi":"10.1145/3687930","DOIUrl":"https://doi.org/10.1145/3687930","url":null,"abstract":"Time-of-flight (ToF) devices have greatly propelled the advancement of various multi-modal perception applications. However, achieving accurate rendering of time-resolved information remains a challenge, particularly in scenes involving complex geometries, diverse materials and participating media. Existing ToF rendering works have demonstrated notable results, yet they struggle with scenes involving scattering media and camera-warped settings. Other steady-state volumetric rendering methods exhibit significant bias or variance when directly applied to ToF rendering tasks. To address these challenges, we integrate transient diffusion theory into path construction and propose novel sampling methods for free-path distance and scattering direction, via resampled importance sampling and offline tabulation. An elliptical sampling method is further adapted to provide controllable vertex connection satisfying any required photon traversal time. In contrast to the existing temporal uniform sampling strategy, our method is the first to consider the contribution of transient radiance to importance-sample the full path, and thus enables improved temporal path construction under multiple scattering settings. The proposed method can be integrated into both path tracing and photon-based frameworks, delivering significant improvements in quality and efficiency with at least a 5x MSE reduction versus SOTA methods in equal rendering time.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"22 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142672840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose the Medial Skeletal Diagram, a novel skeletal representation that tackles the prevailing issues around skeleton sparsity and reconstruction accuracy in existing skeletal representations. Our approach augments the continuous elements in the medial axis representation to effectively shift the complexity away from the discrete elements. To that end, we introduce generalized enveloping primitives, an enhancement over the standard primitives in the medial axis, which ensure efficient coverage of intricate local features of the input shape and substantially reduce the number of discrete elements required. Moreover, we present a computational framework for constructing a medial skeletal diagram from an arbitrary closed manifold mesh. Our optimization pipeline ensures that the resulting medial skeletal diagram comprehensively covers the input shape with the fewest primitives. Additionally, each optimized primitive undergoes a post-refinement process to guarantee an accurate match with the source mesh in both geometry and tessellation. We validate our approach on a comprehensive benchmark of 100 shapes, demonstrating the sparsity of the discrete elements and superior reconstruction accuracy across a variety of cases. Finally, we exemplify the versatility of our representation in downstream applications such as shape generation, mesh decomposition, shape optimization, mesh alignment, mesh compression, and user-interactive design.
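To make the coverage objective concrete, the sketch below tests how much of a sampled surface a set of classical medial spheres encloses; the paper's generalized enveloping primitives extend such spheres, and the function name and tolerance here are illustrative assumptions.

```python
# Minimal sketch: fraction of surface samples covered by a set of medial
# spheres (center c_j, radius r_j). An optimizer can drive this ratio to 1
# while keeping the number of primitives small.
import numpy as np

def coverage_ratio(surface_points, centers, radii, tol=1e-3):
    """surface_points: (N, 3); centers: (M, 3); radii: (M,)."""
    pts = np.asarray(surface_points, float)
    c = np.asarray(centers, float)
    r = np.asarray(radii, float)
    # Signed distance from every sample to every sphere boundary, shape (N, M).
    d = np.linalg.norm(pts[:, None, :] - c[None, :, :], axis=2) - r[None, :]
    covered = (d <= tol).any(axis=1)     # inside or on at least one sphere
    return float(covered.mean())
```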
{"title":"Medial Skeletal Diagram: A Generalized Medial Axis Approach for Compact 3D Shape Representation","authors":"Minghao Guo, Bohan Wang, Wojciech Matusik","doi":"10.1145/3687964","DOIUrl":"https://doi.org/10.1145/3687964","url":null,"abstract":"We propose the Medial Skeletal Diagram, a novel skeletal representation that tackles the prevailing issues around skeleton sparsity and reconstruction accuracy in existing skeletal representations. Our approach augments the continuous elements in the medial axis representation to effectively shift the complexity away from the discrete elements. To that end, we introduce generalized enveloping primitives, an enhancement over the standard primitives in the medial axis, which ensure efficient coverage of intricate local features of the input shape and substantially reduce the number of discrete elements required. Moreover, we present a computational framework for constructing a medial skeletal diagram from an arbitrary closed manifold mesh. Our optimization pipeline ensures that the resulting medial skeletal diagram comprehensively covers the input shape with the fewest primitives. Additionally, each optimized primitive undergoes a post-refinement process to guarantee an accurate match with the source mesh in both geometry and tessellation. We validate our approach on a comprehensive benchmark of 100 shapes, demonstrating the sparsity of the discrete elements and superior reconstruction accuracy across a variety of cases. Finally, we exemplify the versatility of our representation in downstream applications such as shape generation, mesh decomposition, shape optimization, mesh alignment, mesh compression, and user-interactive design.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"10 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a learning-based planner for computing optimized 3D printing toolpaths on prescribed graphs. The challenges include the varying graph structures of different models and the large numbers of nodes and edges in a graph. We adopt an on-the-fly strategy to tackle these challenges, formulating the planner as a Deep Q-Network (DQN) based optimizer that decides the next 'best' node to visit. We construct the state spaces from the Local Search Graph (LSG) centered at different nodes of a graph, encoded by a carefully designed algorithm so that LSGs with similar configurations can be identified and earlier-learned DQN priors re-used, accelerating the computation of toolpath planning. Our method can cover different 3D printing applications by defining their corresponding reward functions. Toolpath planning problems in wire-frame printing, continuous fiber printing, and metallic printing are selected to demonstrate its generality. The performance of our planner has been verified by testing the resultant toolpaths in physical experiments. Using our planner, wire-frame models with up to 4.2k struts can be successfully printed, up to 93.3% of sharp turns on continuous fiber toolpaths can be avoided, and thermal distortion in metallic printing can be reduced by 24.9%.
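A minimal sketch of the per-step decision such a planner makes: score each unvisited neighbor of the current node with a Q-network over a fixed-size encoding of its local search graph, then choose epsilon-greedily. The encoding stub and network shape are placeholders, not the paper's design.

```python
# Minimal sketch: epsilon-greedy next-node selection with a small Q-network.
import random
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, state_dim=32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 1))                    # scalar Q-value per candidate

    def forward(self, states):                   # states: (K, state_dim)
        return self.mlp(states).squeeze(-1)      # -> (K,)

def encode_lsg(graph, node, state_dim=32):
    # Placeholder for the paper's Local Search Graph encoding.
    g = torch.Generator().manual_seed(hash(node) % (2 ** 31))
    return torch.rand(state_dim, generator=g)

def next_node(qnet, graph, current, visited, eps=0.1):
    """graph: adjacency dict {node: [neighbors]}."""
    candidates = [n for n in graph[current] if n not in visited]
    if not candidates:
        return None                              # dead end; caller backtracks
    if random.random() < eps:                    # explore
        return random.choice(candidates)
    states = torch.stack([encode_lsg(graph, n) for n in candidates])
    with torch.no_grad():
        q = qnet(states)                         # exploit learned Q-values
    return candidates[int(q.argmax())]
```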
{"title":"Learning Based Toolpath Planner on Diverse Graphs for 3D Printing","authors":"Yuming Huang, Yuhu Guo, Renbo Su, Xingjian Han, Junhao Ding, Tianyu Zhang, Tao Liu, Weiming Wang, Guoxin Fang, Xu Song, Emily Whiting, Charlie Wang","doi":"10.1145/3687933","DOIUrl":"https://doi.org/10.1145/3687933","url":null,"abstract":"This paper presents a learning based planner for computing optimized 3D printing toolpaths on prescribed graphs, the challenges of which include the varying graph structures on different models and the large scale of nodes & edges on a graph. We adopt an on-the-fly strategy to tackle these challenges, formulating the planner as a <jats:italic>Deep Q-Network</jats:italic> (DQN) based optimizer to decide the next 'best' node to visit. We construct the state spaces by the <jats:italic>Local Search Graph</jats:italic> (LSG) centered at different nodes on a graph, which is encoded by a carefully designed algorithm so that LSGs in similar configurations can be identified to re-use the earlier learned DQN priors for accelerating the computation of toolpath planning. Our method can cover different 3D printing applications by defining their corresponding reward functions. Toolpath planning problems in wire-frame printing, continuous fiber printing, and metallic printing are selected to demonstrate its generality. The performance of our planner has been verified by testing the resultant toolpaths in physical experiments. By using our planner, wire-frame models with up to 4.2k struts can be successfully printed, up to 93.3% of sharp turns on continuous fiber toolpaths can be avoided, and the thermal distortion in metallic printing can be reduced by 24.9%.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"38 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents volumetric homogenization, a spatially varying homogenization scheme for knitwear simulation. We are motivated by the observation that macro-scale fabric dynamics is strongly correlated with the underlying knitting pattern. Therefore, homogenization towards a single material is less effective when the knitting is complex and non-repetitive. Our method tackles this challenge by homogenizing the yarn-level material locally at volumetric elements. Assigning a virtual volume to a knitting structure enables us to model bending and twisting effects via a simple volume-preserving penalty, which effectively alleviates the material nonlinearity. We employ an adjoint Gauss-Newton formulation [Zehnder et al. 2021] to battle the dimensionality challenge of such per-element material optimization. This intuitive material model makes the forward simulation GPU-friendly. To this end, our pipeline is also equipped with a novel domain-decomposed subspace solver crafted for GPU projective dynamics, which makes our simulator hundreds of times faster than the yarn-level simulator. Experiments validate the capability and effectiveness of volumetric homogenization. Our method produces realistic animations of knitwear matching the quality of full-scale yarn-level simulations, while being orders of magnitude faster than existing homogenization techniques in both the training and simulation stages.
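For reference, a volume-preserving penalty of the standard form used in elasticity and projective-dynamics solvers is shown below, with F_e the deformation gradient and V_e the rest volume of element e; the paper's exact energy may differ.

```latex
% Illustrative volume-preservation penalty (standard form, not necessarily
% the paper's exact energy):
E_{\mathrm{vol}} = \frac{\kappa}{2} \sum_{e} V_e \left( \det \mathbf{F}_e - 1 \right)^2
```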
{"title":"Volumetric Homogenization for Knitwear Simulation","authors":"Chun Yuan, Haoyang Shi, Lei Lan, Yuxing Qiu, Cem Yuksel, Huamin Wang, Chenfanfu Jiang, Kui Wu, Yin Yang","doi":"10.1145/3687911","DOIUrl":"https://doi.org/10.1145/3687911","url":null,"abstract":"This paper presents volumetric homogenization, a spatially varying homogenization scheme for knitwear simulation. We are motivated by the observation that macro-scale fabric dynamics is strongly correlated with its underlying knitting patterns. Therefore, homogenization towards a single material is less effective when the knitting is complex and non-repetitive. Our method tackles this challenge by homogenizing the yarn-level material locally at volumetric elements. Assigning a virtual volume of a knitting structure enables us to model bending and twisting effects via a simple volume-preserving penalty and thus effectively alleviates the material nonlinearity. We employ an adjoint Gauss-Newton formulation[Zehnder et al. 2021] to battle the dimensionality challenge of such per-element material optimization. This intuitive material model makes the forward simulation GPU-friendly. To this end, our pipeline also equips a novel domain-decomposed subspace solver crafted for GPU projective dynamics, which makes our simulator hundreds of times faster than the yarn-level simulator. Experiments validate the capability and effectiveness of volumetric homogenization. Our method produces realistic animations of knitwear matching the quality of full-scale yarn-level simulations. It is also orders of magnitude faster than existing homogenization techniques in both the training and simulation stages.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"22 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present a novel approach to generate developable strips along a space curve. The key idea of the new method is to use the rotation angle between the Frenet frame of the input space curve and the Darboux frame of that curve on the resulting developable strip as a free design parameter, thereby revolving the strip around the tangential axis of the input space curve. This angle is not restricted to be constant but can be any differentiable function defined on the curve, creating a large design space of developable strips that share a common directrix curve. The range of possibilities for choosing the rotation angle is diverse, encompassing constant angles, linearly varying angles, sinusoidal patterns, and even solutions derived from initial value problems involving ordinary differential equations. This gives the proposed method the potential to be used for a wide range of practical applications, spanning fields such as architectural design, industrial design, and papercraft modeling. In our computational and physical examples, we demonstrate the flexibility of the method by constructing, among others, toroidal and helical windmill blades for papercraft models, curved foldings, triply orthogonal structures, and developable strips featuring a log-aesthetic directrix curve.
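The underlying frame relation is elementary rotation: turning the Frenet normal N and binormal B about the tangent T by the design angle θ(s) yields the strip-adapted directions. The notation below is illustrative, with U playing the role of the strip's surface normal along the curve.

```latex
% Rotating the Frenet frame (T, N, B) about T by the design angle theta(s):
\mathbf{U}(s) = \cos\theta(s)\,\mathbf{N}(s) + \sin\theta(s)\,\mathbf{B}(s), \qquad
\mathbf{V}(s) = -\sin\theta(s)\,\mathbf{N}(s) + \cos\theta(s)\,\mathbf{B}(s)
```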
{"title":"All you need is rotation: Construction of developable strips","authors":"Takashi Maekawa, Felix Scholz","doi":"10.1145/3687947","DOIUrl":"https://doi.org/10.1145/3687947","url":null,"abstract":"We present a novel approach to generate developable strips along a space curve. The key idea of the new method is to use the rotation angle between the Frenet frame of the input space curve, and its Darboux frame of the curve on the resulting developable strip as a free design parameter, thereby revolving the strip around the tangential axis of the input space curve. This angle is not restricted to be constant but it can be any differentiable function defined on the curve, thereby creating a large design space of developable strips that share a common directrix curve. The range of possibilities for choosing the rotation angle is diverse, encompassing constant angles, linearly varying angles, sinusoidal patterns, and even solutions derived from initial value problems involving ordinary differential equations. This enables the potential of the proposed method to be used for a wide range of practical applications, spanning fields such as architectural design, industrial design, and papercraft modeling. In our computational and physical examples, we demonstrate the flexibility of the method by constructing, among others, toroidal and helical windmill blades for papercraft models, curved foldings, triply orthogonal structures, and developable strips featuring a log-aesthetic directrix curve.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"10 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Particle-based representations of radiance fields such as 3D Gaussian Splatting have found great success in reconstructing and re-rendering complex scenes. Most existing methods render particles via rasterization, projecting them to screen space tiles for processing in a sorted order. This work instead considers ray tracing the particles, building a bounding volume hierarchy and casting a ray for each pixel using high-performance GPU ray tracing hardware. To efficiently handle large numbers of semi-transparent particles, we describe a specialized rendering algorithm which encapsulates particles with bounding meshes to leverage fast ray-triangle intersections, and shades batches of intersections in depth order. The benefits of ray tracing are well known in computer graphics: processing incoherent rays for secondary lighting effects such as shadows and reflections, rendering from the highly distorted cameras common in robotics, stochastically sampling rays, and more. With our renderer, this flexibility comes at little cost compared to rasterization. Experiments demonstrate the speed and accuracy of our approach, as well as several applications in computer graphics and vision. We further propose related improvements to the basic Gaussian representation, including a simple use of generalized kernel functions which significantly reduces particle hit counts.
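For intuition, here is a minimal sketch of the front-to-back compositing a ray performs once particle intersections have been gathered and depth-sorted; the hit attributes and the termination threshold are illustrative assumptions.

```python
# Minimal sketch: front-to-back alpha compositing of sorted particle hits,
# with early termination once the ray is effectively opaque.
import numpy as np

def shade_ray(hits, t_min=1e-3, min_transmittance=1e-3):
    """hits: iterable of (depth, alpha, rgb) tuples for one ray."""
    ordered = sorted((h for h in hits if h[0] > t_min), key=lambda h: h[0])
    color = np.zeros(3)
    transmittance = 1.0
    for depth, alpha, rgb in ordered:            # process in depth order
        color += transmittance * alpha * np.asarray(rgb, float)
        transmittance *= 1.0 - alpha
        if transmittance < min_transmittance:    # saturated: stop early
            break
    return color, transmittance
```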
{"title":"3D Gaussian Ray Tracing: Fast Tracing of Particle Scenes","authors":"Nicolas Moenne-Loccoz, Ashkan Mirzaei, Or Perel, Riccardo de Lutio, Janick Martinez Esturo, Gavriel State, Sanja Fidler, Nicholas Sharp, Zan Gojcic","doi":"10.1145/3687934","DOIUrl":"https://doi.org/10.1145/3687934","url":null,"abstract":"Particle-based representations of radiance fields such as 3D Gaussian Splatting have found great success for reconstructing and re-rendering of complex scenes. Most existing methods render particles via rasterization, projecting them to screen space tiles for processing in a sorted order. This work instead considers ray tracing the particles, building a bounding volume hierarchy and casting a ray for each pixel using high-performance GPU ray tracing hardware. To efficiently handle large numbers of semi-transparent particles, we describe a specialized rendering algorithm which encapsulates particles with bounding meshes to leverage fast ray-triangle intersections, and shades batches of intersections in depth-order. The benefits of ray tracing are well-known in computer graphics: processing incoherent rays for secondary lighting effects such as shadows and reflections, rendering from highly-distorted cameras common in robotics, stochastically sampling rays, and more. With our renderer, this flexibility comes at little cost compared to rasterization. Experiments demonstrate the speed and accuracy of our approach, as well as several applications in computer graphics and vision. We further propose related improvements to the basic Gaussian representation, including a simple use of generalized kernel functions which significantly reduces particle hit counts.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"14 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present a novel neural algorithm for performing high-quality, high-resolution, real-time novel view synthesis. From a sparse set of input RGB images or video streams, our network both reconstructs the 3D scene and renders novel views at 1080p resolution at 30fps on an NVIDIA A100. Our feed-forward network generalizes across a wide variety of datasets and scenes and produces state-of-the-art quality for a real-time method. Our quality approaches, and in some cases surpasses, that of some of the top offline methods. To achieve these results, we use a novel combination of several key concepts and tie them together into a cohesive and effective algorithm. We build on previous works that represent the scene using semi-transparent layers and use an iterative learned render-and-refine approach to improve those layers. Instead of flat layers, our method reconstructs layered depth maps (LDMs) that efficiently represent scenes with complex depth and occlusions. The iterative update steps are embedded in a multi-scale, UNet-style architecture to perform as much compute as possible at reduced resolution. Within each update step, to better aggregate the information from multiple input views, we use a specialized Transformer-based network component. This allows the majority of the per-input image processing to be performed in the input image space, as opposed to layer space, further increasing efficiency. Finally, due to the real-time nature of our reconstruction and rendering, we dynamically create and discard the internal 3D geometry for each frame, generating the LDM for each view. Taken together, this produces a novel and effective algorithm for view synthesis. Through extensive evaluation, we demonstrate that we achieve state-of-the-art quality at real-time rates.
{"title":"Quark: Real-time, High-resolution, and General Neural View Synthesis","authors":"John Flynn, Michael Broxton, Lukas Murmann, Lucy Chai, Matthew DuVall, Clément Godard, Kathryn Heal, Srinivas Kaza, Stephen Lombardi, Xuan Luo, Supreeth Achar, Kira Prabhu, Tiancheng Sun, Lynn Tsai, Ryan Overbeck","doi":"10.1145/3687953","DOIUrl":"https://doi.org/10.1145/3687953","url":null,"abstract":"We present a novel neural algorithm for performing high-quality, highresolution, real-time novel view synthesis. From a sparse set of input RGB images or videos streams, our network both reconstructs the 3D scene and renders novel views at 1080p resolution at 30fps on an NVIDIA A100. Our feed-forward network generalizes across a wide variety of datasets and scenes and produces state-of-the-art quality for a real-time method. Our quality approaches, and in some cases surpasses, the quality of some of the top offline methods. In order to achieve these results we use a novel combination of several key concepts, and tie them together into a cohesive and effective algorithm. We build on previous works that represent the scene using semi-transparent layers and use an iterative learned render-and-refine approach to improve those layers. Instead of flat layers, our method reconstructs layered depth maps (LDMs) that efficiently represent scenes with complex depth and occlusions. The iterative update steps are embedded in a multi-scale, UNet-style architecture to perform as much compute as possible at reduced resolution. Within each update step, to better aggregate the information from multiple input views, we use a specialized Transformer-based network component. This allows the majority of the per-input image processing to be performed in the input image space, as opposed to layer space, further increasing efficiency. Finally, due to the real-time nature of our reconstruction and rendering, we dynamically create and discard the internal 3D geometry for each frame, generating the LDM for each view. Taken together, this produces a novel and effective algorithm for view synthesis. Through extensive evaluation, we demonstrate that we achieve state-of-the-art quality at real-time rates.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"14 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142672828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unbiased Monte Carlo path tracing, which is extensively used in realistic rendering, produces undesirable noise, especially at low samples per pixel (spp). Recently, several methods have coped with this problem by feeding unbiased noisy images and auxiliary features to neural networks that either predict a fixed-size kernel for convolution or directly predict the denoised result. Since it is impossible to produce arbitrarily high-spp images as the training dataset, network-based denoising fails to produce high-quality images at high spp. More specifically, network-based denoising is inconsistent: it does not converge to the ground truth as the sampling rate increases. On the other hand, post-correction estimators yield a blending coefficient for a pair of biased and unbiased images, influenced by image errors or variances, to ensure the consistency of the denoised image. As the sampling rate increases, the blending coefficient of the unbiased image converges to 1, that is, the unbiased image is used as the denoised result. However, these estimators usually produce artifacts due to the difficulty of accurately predicting image errors or variances at low spp. To address the above problems, we take advantage of both kernel-predicting methods and post-correction denoisers. We propose a novel kernel-based denoiser based on distribution-free kernel regression consistency theory, which does not explicitly combine the biased and unbiased results but instead constrains the kernel bandwidth to produce consistent results at high spp. Meanwhile, our kernel regression method explores bandwidth optimization in the robust auxiliary feature space instead of the noisy image space. This leads to consistent, high-quality denoising at both low and high spp. Experimental results demonstrate that our method outperforms existing denoisers in accuracy and consistency.
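A minimal sketch of kernel regression over auxiliary features: each pixel becomes a weighted average of its noisy neighbors, with weights from a Gaussian kernel on feature differences (albedo, normals, depth, and so on). The constant bandwidth h below stands in for the per-pixel bandwidth the paper optimizes.

```python
# Minimal sketch: feature-space kernel regression denoiser.
import numpy as np

def kernel_regression_denoise(noisy, features, h=0.2, radius=3):
    """noisy: (H, W, 3) radiance; features: (H, W, F) auxiliary buffers."""
    H, W, _ = noisy.shape
    out = np.zeros_like(noisy)
    for yy in range(H):
        for xx in range(W):
            y0, y1 = max(0, yy - radius), min(H, yy + radius + 1)
            x0, x1 = max(0, xx - radius), min(W, xx + radius + 1)
            diff = features[y0:y1, x0:x1] - features[yy, xx]
            w = np.exp(-np.sum(diff * diff, axis=-1) / (2.0 * h * h))
            out[yy, xx] = (np.tensordot(w, noisy[y0:y1, x0:x1],
                                        axes=([0, 1], [0, 1])) / w.sum())
    return out
```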
{"title":"Neural Kernel Regression for Consistent Monte Carlo Denoising","authors":"Pengju Qiao, Qi Wang, Yuchi Huo, Shiji Zhai, Zixuan Xie, Wei Hua, Hujun Bao, Tao Liu","doi":"10.1145/3687949","DOIUrl":"https://doi.org/10.1145/3687949","url":null,"abstract":"Unbiased Monte Carlo path tracing that is extensively used in realistic rendering produces undesirable noise, especially with low samples per pixel (spp). Recently, several methods have coped with this problem by importing unbiased noisy images and auxiliary features to neural networks to either predict a fixed-sized kernel for convolution or directly predict the denoised result. Since it is impossible to produce arbitrarily high spp images as the training dataset, the network-based denoising fails to produce high-quality images under high spp. More specifically, network-based denoising is inconsistent and does not converge to the ground truth as the sampling rate increases. On the other hand, the post-correction estimators yield a blending coefficient for a pair of biased and unbiased images influenced by image errors or variances to ensure the consistency of the denoised image. As the sampling rate increases, the blending coefficient of the unbiased image converges to 1, that is, using the unbiased image as the denoised results. However, these estimators usually produce artifacts due to the difficulty of accurately predicting image errors or variances with low spp. To address the above problems, we take advantage of both kernel-predicting methods and post-correction denoisers. A novel kernel-based denoiser is proposed based on distribution-free kernel regression consistency theory, which does not explicitly combine the biased and unbiased results but constrain the kernel bandwidth to produce consistent results under high spp. Meanwhile, our kernel regression method explores bandwidth optimization in the robust auxiliary feature space instead of the noisy image space. This leads to consistent high-quality denoising at both low and high spp. Experiment results demonstrate that our method outperforms existing denoisers in accuracy and consistency.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"197 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Distortion-minimizing surface parameterization is an essential step for computing 2D pieces necessary to fabricate a target 3D shape from flat material. Garment design and textile fabrication are a prominent application example. Common distortion measures quantify length, angle or area preservation in an isotropic manner, so that when applied to woven textile fabrication, they implicitly assume fabric behaves like paper, which is inextensible in all directions and does not permit shearing. However, woven fabric differs significantly from paper: it exhibits anisotropy along the yarn directions and allows for some degree of shearing. We propose a novel distortion energy based on Chebyshev nets that anisotropically penalizes shearing and stretching. Our energy formulation can be used as an optimization objective for surface parameterization and is simple to minimize via a local-global algorithm. We demonstrate its advantages in modeling nets or woven fabric behavior over the commonly used isotropic distortion energies.
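An illustrative energy of this kind, written with the parameterization Jacobian J and the yarn directions e_u, e_v of the 2D pattern: stretching along each yarn and shearing between yarns are penalized with separate weights (the paper's exact formulation may differ).

```latex
% Illustrative anisotropic distortion energy for a Chebyshev-net-like
% parameterization (not necessarily the paper's exact form):
E(\mathbf{J}) = w_{\mathrm{stretch}} \Big[ \big( \lVert \mathbf{J}\mathbf{e}_u \rVert - 1 \big)^2
  + \big( \lVert \mathbf{J}\mathbf{e}_v \rVert - 1 \big)^2 \Big]
  + w_{\mathrm{shear}} \, \big( \mathbf{J}\mathbf{e}_u \cdot \mathbf{J}\mathbf{e}_v \big)^2
```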
{"title":"Chebyshev Parameterization for Woven Fabric Modeling","authors":"Annika Öhri, Aviv Segall, Jing Ren, Olga Sorkine-Hornung","doi":"10.1145/3687928","DOIUrl":"https://doi.org/10.1145/3687928","url":null,"abstract":"Distortion-minimizing surface parameterization is an essential step for computing 2D pieces necessary to fabricate a target 3D shape from flat material. Garment design and textile fabrication are a prominent application example. Common distortion measures quantify length, angle or area preservation in an isotropic manner, so that when applied to woven textile fabrication, they implicitly assume fabric behaves like paper, which is inextensible in all directions and does not permit shearing. However, woven fabric differs significantly from paper: it exhibits anisotropy along the yarn directions and allows for some degree of shearing. We propose a novel distortion energy based on Chebyshev nets that anisotropically penalizes shearing and stretching. Our energy formulation can be used as an optimization objective for surface parameterization and is simple to minimize via a local-global algorithm. We demonstrate its advantages in modeling nets or woven fabric behavior over the commonly used isotropic distortion energies.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"38 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}