
ACM Transactions on Graphics: Latest Publications

CHOICE: Coordinated Human-Object Interaction in Cluttered Environments for Pick-and-Place Actions
IF 6.2 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-10-02 · DOI: 10.1145/3770746
Jintao Lu, He Zhang, Yuting Ye, Takaaki Shiratori, Sebastian Starke, Taku Komura
Animating human-scene interactions, such as picking and placing a wide range of objects with different geometries, is a challenging task, especially in cluttered environments that involve interactions with complex articulated containers. The main difficulty lies in the sparsity of motion data relative to the wide variation of objects and environments, as well as the scarcity of transition motions between different actions, which complicates generalization to arbitrary conditions. To cope with this issue, we develop a system that tackles the interaction synthesis problem as a hierarchical goal-driven task. First, we develop a bimanual scheduler that plans a set of keyframes for simultaneously controlling the two hands, efficiently achieving the pick-and-place task from an abstract goal signal such as a user-selected target object. Next, we develop a neural implicit planner that generates hand trajectories to guide reaching and leaving motions across diverse object shapes/types and obstacle layouts. Finally, we propose a linear dynamic model for our DeepPhase controller that incorporates a Kalman filter to enable smooth transitions in the frequency domain, resulting in more realistic and effective multi-objective control of the character. Our system can synthesize a rich variety of natural pick-and-place movements that adapt to different object geometries, container articulations, and scene layouts.
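To make the frequency-domain smoothing concrete, below is a minimal sketch of a constant-velocity Kalman filter applied to a noisy phase-feature sequence. It illustrates the general mechanism only; the state model, noise parameters, and function names are hypothetical stand-ins, not the paper's DeepPhase controller.

```python
import numpy as np

def kalman_smooth_phase(z, q=1e-3, r=1e-2):
    """Minimal 1D constant-velocity Kalman filter over a noisy
    sequence of phase-feature values z[t] (illustrative only)."""
    A = np.array([[1.0, 1.0], [0.0, 1.0]])  # state transition: value + velocity
    H = np.array([[1.0, 0.0]])              # we observe the value only
    Q = q * np.eye(2)                       # process noise covariance
    R = np.array([[r]])                     # measurement noise covariance
    x = np.array([z[0], 0.0])               # initial state estimate
    P = np.eye(2)                           # initial state covariance
    out = []
    for zt in z:
        x = A @ x                           # predict
        P = A @ P @ A.T + Q
        y = zt - H @ x                      # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        x = x + (K @ y).ravel()             # update
        P = (np.eye(2) - K @ H) @ P
        out.append(x[0])
    return np.array(out)

t = np.linspace(0, 4 * np.pi, 100)
noisy = np.sin(t) + 0.1 * np.random.default_rng(0).standard_normal(100)
smoothed = kalman_smooth_phase(noisy)       # transitions become smooth
```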
Citations: 0
RemixFusion: Residual-based Mixed Representation for Large-scale Online RGB-D Reconstruction
IF 6.2 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-19 · DOI: 10.1145/3769007
Yuqing Lan, Chenyang Zhu, Shuaifeng Zhi, Jiazhao Zhang, Zhoufeng Wang, Renjiao Yi, Yijie Wang, Kai Xu
The introduction of neural implicit representations has notably propelled the advancement of online dense reconstruction techniques. Compared to traditional explicit representations such as TSDF, they substantially improve mapping completeness and memory efficiency. However, the lack of reconstruction details and the time-consuming learning of neural representations hinder the widespread application of neural-based methods to large-scale online reconstruction. We introduce RemixFusion, a novel residual-based mixed representation for scene reconstruction and camera pose estimation dedicated to high-quality and large-scale online RGB-D reconstruction. In particular, we propose a residual-based map representation comprising an explicit coarse TSDF grid and an implicit neural module that produces residuals representing fine-grained details to be added to the coarse grid. Such a mixed representation allows for detail-rich reconstruction within a bounded time and memory budget, in contrast to the overly smoothed results of purely implicit representations, thus paving the way for high-quality camera tracking. Furthermore, we extend the residual-based representation to handle multi-frame joint pose optimization via bundle adjustment (BA). In contrast to existing methods, which optimize poses directly, we opt to optimize pose changes. Combined with a novel technique for adaptive gradient amplification, our method attains better optimization convergence and global optimality. Furthermore, we adopt a local moving volume to factorize the whole mixed scene representation with a divide-and-conquer design, facilitating efficient online learning in our residual-based framework. Extensive experiments demonstrate that our method surpasses all state-of-the-art methods, whether based on explicit or implicit representations, in terms of the accuracy of both mapping and tracking on large-scale scenes.
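The residual-based map query the abstract describes can be sketched as a coarse-grid lookup plus a learned correction. In this toy version, which assumes only numpy, the grid is random and a tiny fixed-weight MLP stands in for the trained residual module; none of it is RemixFusion's actual network.

```python
import numpy as np

def trilerp(grid, p):
    """Trilinear interpolation of a dense TSDF grid at continuous
    voxel coordinates p (assumed strictly inside the grid)."""
    i0 = np.floor(p).astype(int)
    f = p - i0
    val = 0.0
    for dz in (0, 1):
        for dy in (0, 1):
            for dx in (0, 1):
                w = ((dx * f[0] + (1 - dx) * (1 - f[0])) *
                     (dy * f[1] + (1 - dy) * (1 - f[1])) *
                     (dz * f[2] + (1 - dz) * (1 - f[2])))
                val += w * grid[i0[0] + dx, i0[1] + dy, i0[2] + dz]
    return val

rng = np.random.default_rng(0)
coarse_tsdf = rng.standard_normal((32, 32, 32))       # stand-in coarse grid
W1, b1 = rng.standard_normal((16, 3)), np.zeros(16)   # stand-in residual MLP
W2, b2 = rng.standard_normal(16), 0.0

def residual_mlp(p):
    # tiny MLP standing in for the learned fine-detail residual
    h = np.maximum(W1 @ p + b1, 0.0)
    return W2 @ h + b2

def query_sdf(p):
    # mixed representation: explicit coarse grid + implicit residual
    return trilerp(coarse_tsdf, p) + residual_mlp(p)

print(query_sdf(np.array([12.3, 7.8, 20.1])))
```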
Citations: 0
Local Surface Parameterizations via Smoothed Geodesic Splines
IF 6.2 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-17 · DOI: 10.1145/3767323
Abhishek Madan, David Levin
We present a general method for computing local parameterizations rooted at a point on a surface, where the surface is described only through a signed implicit function and a corresponding projection function. Using a two-stage process, we compute several points radially emanating from the map origin, and interpolate between them with a spline surface. The narrow interface of our method allows it to support several kinds of geometry such as signed distance functions, general analytic implicit functions, triangle meshes, neural implicits, and point clouds. We demonstrate the high quality of our generated parameterizations on a variety of examples, and show applications in local texturing and surface curve drawing.
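A rough sketch of the first stage, computing points that radially emanate from the map origin using only an implicit function and a projection function. The sphere SDF, finite-difference gradient, and step size are illustrative assumptions; the paper's smoothed geodesic splines and the spline-surface interpolation are not reproduced here.

```python
import numpy as np

def sdf(p):                        # example implicit: unit sphere
    return np.linalg.norm(p) - 1.0

def grad(p, eps=1e-5):             # finite-difference gradient of the implicit
    g = np.zeros(3)
    for i in range(3):
        d = np.zeros(3); d[i] = eps
        g[i] = (sdf(p + d) - sdf(p - d)) / (2 * eps)
    return g

def project(p):                    # projection function: SDF closest-point step
    return p - sdf(p) * grad(p)

def radial_curve(origin, direction, n_steps=20, h=0.05):
    """Step-and-project walk from the map origin along a tangent
    direction, sampling one radially emanating curve."""
    pts = [origin.copy()]
    p, d = origin.copy(), direction.astype(float)
    for _ in range(n_steps):
        n = grad(p); n /= np.linalg.norm(n)
        d = d - np.dot(d, n) * n   # keep the walk direction tangent
        d /= np.linalg.norm(d)
        p = project(p + h * d)     # step, then snap back onto the surface
        pts.append(p)
    return np.array(pts)

origin = project(np.array([0.0, 0.0, 1.2]))              # origin on the surface
curve = radial_curve(origin, np.array([1.0, 0.0, 0.0]))  # one radial sample set
```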
Citations: 0
Many-Worlds Inverse Rendering
IF 6.2 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-12 · DOI: 10.1145/3767318
Ziyi Zhang, Nicolas Roussel, Wenzel Jakob
Discontinuous visibility changes remain a major bottleneck when optimizing surfaces within a physically based inverse renderer. Many previous works have proposed sophisticated algorithms and data structures to sample visibility silhouettes more efficiently. Our work presents another solution: instead of evolving a surface locally, we extend differentiation to hypothetical surface patches anywhere in 3D space. We refer to this as a “many-worlds” representation because it models a superposition of independent surface hypotheses that compete to explain the reference images. These hypotheses do not interact through shadowing or scattering, leading to a new transport law that distinguishes our method from prior work based on exponential random media. The complete elimination of visibility-related discontinuity handling bypasses the most complex and costly component of prior inverse rendering methods, while the extended derivative domain promotes rapid convergence. We demonstrate that the resulting Monte Carlo algorithm solves physically based inverse problems with both reduced per-iteration cost and fewer total iterations.
Citations: 0
A fast, efficient, and robust feature protected denoising method
IF 6.2 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-03 · DOI: 10.1145/3765902
Mengyu Luo, Jian Wang
This paper proposes a fast, efficient, and robust feature-protected 3D mesh denoising method based on a modified Lengyel-Epstein (LE) model, primarily aiming to ensure volume stability and deliver superior denoising results. Compared with the original model, we mainly introduce a function expression ζ(X) to replace the fixed parameters. The modified model is then discretized using a seven-point difference scheme and solved with an explicit Euler method. Notably, our approach requires no training samples or upfront training time, significantly enhancing overall computational efficiency.
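As a hedged illustration of the numerics only, the sketch below applies a seven-point finite-difference Laplacian and an explicit Euler step to a generic reaction-diffusion field on a 3D grid. The grid size, time step, and cubic reaction term are placeholders, not the authors' modified LE model with ζ(X).

```python
import numpy as np

def laplacian7(u, h=1.0):
    """Seven-point finite-difference Laplacian on a 3D grid
    (interior nodes only; boundaries stay zero in this toy version)."""
    lap = np.zeros_like(u)
    lap[1:-1, 1:-1, 1:-1] = (
        u[2:, 1:-1, 1:-1] + u[:-2, 1:-1, 1:-1] +
        u[1:-1, 2:, 1:-1] + u[1:-1, :-2, 1:-1] +
        u[1:-1, 1:-1, 2:] + u[1:-1, 1:-1, :-2] -
        6.0 * u[1:-1, 1:-1, 1:-1]) / h**2
    return lap

def euler_step(u, dt=1e-3, eps=0.1):
    # explicit Euler update; the cubic term is a placeholder reaction part,
    # not the paper's modified LE equations
    return u + dt * (eps**2 * laplacian7(u) + u - u**3)

u = np.random.default_rng(1).standard_normal((32, 32, 32))
for _ in range(100):
    u = euler_step(u)
```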
Citations: 0
SymX: Energy-based Simulation from Symbolic Expressions
IF 6.2 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-02 · DOI: 10.1145/3764928
José Fernández-Fernández, Fabian Löschner, Lukas Westhofen, Andreas Longva, Jan Bender
Optimization time integrators are effective at solving complex multi-physics problems including deformable solids with non-linear material models, contact with friction, strain limiting, etc. For challenging problems, Newton-type optimizers are often used, which necessitates first- and second-order derivatives of the global non-linear objective function. Manually differentiating, implementing, testing, optimizing, and maintaining the resulting code is extremely time-consuming, error-prone, and precludes quick changes to the model, even when using tools that assist with parts of such a pipeline. We present SymX, an open-source framework that computes the required derivatives of the different energy contributions by symbolic differentiation, generates optimized code, compiles it on the fly, and performs the global assembly. The user only has to provide the symbolic expression of each energy for a single representative element in its corresponding discretization, and our system will determine the assembled derivatives for the whole simulation. We demonstrate the versatility of SymX in complex simulations featuring different non-linear materials, high-order finite elements, rigid body systems, adaptive discretizations, frictional contact, and coupling of multiple interacting physical systems. SymX’s derivatives offer performance on par with SymPy, an established off-the-shelf symbolic engine, while producing simulations at least one order of magnitude faster than TinyAD, an alternative state-of-the-art integrated solution.
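The core workflow, writing one element's energy symbolically and letting the system derive and compile its gradient and Hessian, can be illustrated with SymPy (the off-the-shelf engine the abstract benchmarks against). This generic spring-energy sketch shows the idea only; it is not SymX's actual API, and the element and symbol names are hypothetical.

```python
import numpy as np
import sympy as sp

# Per-element energy of a hypothetical 2D spring between (x0, y0) and (x1, y1)
x0, y0, x1, y1, k, L = sp.symbols('x0 y0 x1 y1 k L', real=True)
dofs = [x0, y0, x1, y1]                       # element degrees of freedom
length = sp.sqrt((x1 - x0)**2 + (y1 - y0)**2)
E = sp.Rational(1, 2) * k * (length - L)**2   # symbolic energy expression

grad = [sp.diff(E, v) for v in dofs]          # first-order derivatives
hess = sp.hessian(E, dofs)                    # 4x4 second-order derivatives

# compile to fast numeric functions for per-element assembly
f_grad = sp.lambdify((x0, y0, x1, y1, k, L), grad, 'numpy')
f_hess = sp.lambdify((x0, y0, x1, y1, k, L), hess, 'numpy')

print(f_grad(0.0, 0.0, 2.0, 0.0, 10.0, 1.0))  # -> [-10, 0, 10, 0]
```

A Newton-type integrator would evaluate these compiled functions per element and scatter the results into the global gradient and Hessian, which is the assembly step the abstract refers to.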
Citations: 0
A Neural Reflectance Field Model for Accurate Relighting in RTI Applications
IF 6.2 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-08-28 · DOI: 10.1145/3759452
Shambel Fente Mengistu, Filippo Bergamasco, Mara Pistellato
Reflectance Transformation Imaging (RTI) is a computational photography technique in which an object is acquired from a fixed point of view under different light directions. The aim is to estimate the light transport function at each point so that the object can be interactively relighted in a physically accurate way, revealing its surface characteristics. In this paper, we propose a novel RTI approach describing surface reflectance as an implicit neural representation acting as a “relightable image” for a specific object. We propose to represent the light transport function with a Neural Reflectance Field (NRF) model, feeding it with pixel coordinates, light direction, and a latent vector encoding the per-pixel reflectance in a neighbourhood. These vectors, computed during training, allow more accurate relighting than a pure implicit representation (i.e., one relying only on positional encoding), enabling the NRF to handle complex surface shadings. Moreover, they can be efficiently stored with the learned NRF for compression and transmission. As an additional contribution, we propose a novel synthetic dataset containing objects of various shapes and materials created with physically based rendering software. An extensive experimental section shows that the proposed NRF accurately models the light transport function for challenging datasets in synthetic and real-world scenarios.
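A minimal stand-in for the described query interface: a small MLP mapping pixel coordinates, light direction, and a per-pixel latent reflectance code to an RGB value. The layer sizes and weights are random placeholders, not a trained NRF.

```python
import numpy as np

rng = np.random.default_rng(0)
D_LATENT = 8

# random stand-in weights for a trained two-layer relighting MLP
W1 = 0.1 * rng.standard_normal((64, 2 + 3 + D_LATENT))
b1 = np.zeros(64)
W2 = 0.1 * rng.standard_normal((3, 64))
b2 = np.zeros(3)

def relight(uv, light_dir, latent):
    """One relighting query: pixel coordinates, light direction, and the
    pixel's latent reflectance code in; an RGB value out."""
    x = np.concatenate([uv, light_dir / np.linalg.norm(light_dir), latent])
    h = np.maximum(W1 @ x + b1, 0.0)                 # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))      # sigmoid -> [0, 1] RGB

rgb = relight(np.array([0.25, 0.75]),                # normalized pixel coords
              np.array([0.3, 0.4, 0.87]),            # light direction
              rng.standard_normal(D_LATENT))         # per-pixel latent code
```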
Citations: 0
PatchEX: High-Quality Real-Time Temporal Supersampling through Patch-based Parallel Extrapolation
IF 6.2 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-08-12 · DOI: 10.1145/3759247
Akanksha Dixit, Smruti R. Sarangi
High-refresh-rate displays have become very popular in recent years due to the need for superior visual quality in gaming, professional displays, and specialized applications such as medical imaging. However, high-refresh-rate displays alone do not guarantee a superior visual experience; the GPU needs to render frames at a matching rate. Otherwise, we observe disconcerting visual artifacts such as screen tearing and stuttering. Real-time frame generation is an effective technique to increase frame rates by predicting new frames from other rendered frames. There are two methods in this space: interpolation and extrapolation. Interpolation-based methods provide good image quality at the cost of a higher runtime because they also require the next rendered frame. On the other hand, extrapolation methods are much faster at the cost of quality. This paper introduces PatchEX, a novel frame extrapolation method that aims to provide the quality of interpolation at the speed of extrapolation. It smartly segments each frame into foreground and background regions and employs a novel neural network to generate the final extrapolated frame. Additionally, a wavelet transform (WT)-based filter pruning technique is applied to compress the network, significantly reducing the runtime of the extrapolation process. Our results demonstrate that PatchEX achieves a 61.32% and 49.21% improvement in PSNR over the latest extrapolation methods ExtraNet and ExtraSS, respectively, while being 3× and 2.6× faster.
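To make the extrapolation side concrete, here is a toy patch-based extrapolator: it estimates each patch's motion between the two previous frames by block matching and continues that motion one frame forward. This is a classical baseline sketch for intuition, not PatchEX's neural pipeline; the patch and search sizes are arbitrary assumptions.

```python
import numpy as np

def extrapolate(prev, curr, patch=8, search=4):
    """Toy patch-based extrapolation: estimate each patch's motion from
    prev -> curr by block matching, then continue it one frame forward."""
    H, W = curr.shape
    out = curr.copy()
    for y in range(0, H - patch + 1, patch):
        for x in range(0, W - patch + 1, patch):
            block = curr[y:y+patch, x:x+patch]
            best, mv = np.inf, (0, 0)
            for dy in range(-search, search + 1):      # where did this patch
                for dx in range(-search, search + 1):  # come from in prev?
                    sy, sx = y + dy, x + dx
                    if 0 <= sy <= H - patch and 0 <= sx <= W - patch:
                        err = np.sum((prev[sy:sy+patch, sx:sx+patch] - block) ** 2)
                        if err < best:
                            best, mv = err, (dy, dx)
            # displacement prev -> curr is -mv; continue it one more frame
            ty, tx = y - mv[0], x - mv[1]
            if 0 <= ty <= H - patch and 0 <= tx <= W - patch:
                out[ty:ty+patch, tx:tx+patch] = block
    return out

rng = np.random.default_rng(0)
prev = rng.random((64, 64))
curr = np.roll(prev, 2, axis=1)     # scene shifting 2 px to the right
pred = extrapolate(prev, curr)      # patches continue ~2 px further right
```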
Citations: 0
Iris3D: 3D Generation via Synchronized Diffusion Distillation
IF 6.2 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-08-07 · DOI: 10.1145/3759249
Yixun Liang, Weiyu Li, Rui Chen, Fei-Peng Tian, Jiarui Liu, Ying-Cong Chen, Ping Tan, Xiao-Xiao Long
We introduce Iris3D, a novel 3D content generation system that generates vivid textures and detailed 3D shapes while preserving the input information. Our system integrates a Multi-View Large Reconstruction Model (MVLRM [25]) to generate a coarse 3D mesh and introduces a novel optimization scheme called Synchronized Diffusion Distillation (SDD) for refinement. Unlike previous refinement methods based on Score Distillation Sampling (SDS), which suffer from unstable optimization and geometric over-smoothing due to ambiguities across different views and modalities, our method effectively distills consistent multi-view and multi-modal priors from 2D diffusion models in a training-free manner. This enables robust optimization of 3D representations. Additionally, because SDD is training-free, it preserves the diffusion model’s prior knowledge and mitigates potential degradation. This characteristic makes it highly compatible with advanced 2D diffusion techniques like IP-Adapters and ControlNet, allowing for more controllable 3D generation with additional conditioning signals. Experiments demonstrate that our method produces high-quality 3D results with plausible textures and intricate geometric details.
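For context, the Score Distillation Sampling baseline that the abstract contrasts against can be sketched as below; the denoiser stub and noise schedule are hypothetical placeholders, and the paper's SDD scheme itself is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoiser_eps(x_noisy, t, prompt):
    """Hypothetical stub for a pretrained 2D diffusion model's noise
    prediction; a real system would run a text-conditioned U-Net here."""
    return 0.9 * x_noisy / (np.linalg.norm(x_noisy) + 1e-8)

def sds_grad(render, t, alphas_cumprod, prompt="a chair"):
    """Generic SDS gradient for one rendered view:
    w(t) * (eps_pred - eps), treated as d(loss)/d(render)."""
    a = alphas_cumprod[t]
    eps = rng.standard_normal(render.shape)             # sampled noise
    x_t = np.sqrt(a) * render + np.sqrt(1.0 - a) * eps  # forward diffusion
    eps_pred = denoiser_eps(x_t, t, prompt)
    w = 1.0 - a                                         # a common weighting choice
    return w * (eps_pred - eps)

alphas = np.linspace(0.999, 0.01, 1000)                 # toy noise schedule
g = sds_grad(rng.standard_normal((64, 64, 3)), 500, alphas)
```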
Citations: 0
GS-ROR²: Bidirectional-guided 3DGS and SDF for Reflective Object Relighting and Reconstruction
IF 6.2 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-08-07 · DOI: 10.1145/3759248
Zuoliang Zhu, Beibei Wang, Jian Yang
3D Gaussian Splatting (3DGS) has shown a powerful capability for novel view synthesis due to its detailed expressive ability and highly efficient rendering speed. Unfortunately, creating relightable 3D assets and reconstructing faithful geometry with 3DGS is still problematic, particularly for reflective objects, as its discontinuous representation makes it difficult to constrain geometry. In contrast, volumetric signed distance field (SDF) methods provide robust geometry reconstruction, but their expensive ray marching hinders real-time application and slows training. Moreover, these methods struggle to capture sharp geometric details. To this end, we propose to guide 3DGS and SDF bidirectionally in a complementary manner, including an SDF-aided Gaussian splatting for efficient optimization of the relighting model and a GS-guided SDF enhancement for high-quality geometry reconstruction. At the core of our SDF-aided Gaussian splatting is the mutual supervision of depth and normals between blended Gaussians and the SDF, which avoids the expensive volume rendering of the SDF. Thanks to this mutual supervision, the learned blended Gaussians are well constrained at minimal time cost. As the Gaussians are rendered in a deferred shading mode, the alpha-blended Gaussians are smooth, while individual Gaussians may still be outliers, yielding floater artifacts. Therefore, we introduce an SDF-aware pruning strategy to remove Gaussian outliers located far from the surface defined by the SDF, avoiding the floater issue. This way, our GS framework provides reasonable normals and achieves realistic relighting, while meshes obtained by depth-based truncated SDF (TSDF) fusion remain problematic. Therefore, we design a GS-guided SDF refinement, which utilizes the blended normals from the Gaussians to fine-tune the SDF. Equipped with this efficient enhancement, our method can further provide high-quality meshes for reflective objects at the cost of 17% extra training time. Consequently, our method outperforms existing Gaussian-based inverse rendering methods in terms of relighting and mesh quality. Our method also exhibits competitive relighting/mesh quality compared to NeRF-based methods with at most 25%/33% of the training time, and allows rendering at 200+ frames per second on an RTX 4090. Our code is available at https://github.com/NK-CS-ZZL/GS-ROR.
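Of the components above, the SDF-aware pruning strategy is the easiest to sketch: discard Gaussians whose centers lie far from the SDF zero level set relative to their extent. The sphere SDF and the threshold value here are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def sdf_sphere(p):                         # stand-in SDF: unit sphere
    return np.linalg.norm(p, axis=-1) - 1.0

def sdf_aware_prune(centers, scales, tau=3.0):
    """Keep only Gaussians whose centers lie near the SDF zero level set;
    tau scales each Gaussian's extent into a distance threshold."""
    dist = np.abs(sdf_sphere(centers))     # unsigned distance to the surface
    keep = dist <= tau * scales            # far-away Gaussians are floaters
    return centers[keep], scales[keep], keep

rng = np.random.default_rng(0)
centers = rng.uniform(-2.0, 2.0, size=(1000, 3))
scales = rng.uniform(0.01, 0.05, size=1000)
kept_centers, kept_scales, mask = sdf_aware_prune(centers, scales)
```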
Citations: 0