ACM Transactions on Graphics: Latest Articles

Resolution Where It Counts: Hash-based GPU-Accelerated 3D Reconstruction via Variance-Adaptive Voxel Grids
IF 6.2, Zone 1 (Computer Science), Q1 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-11-20, DOI: 10.1145/3777909
Lorenzo De Rebotti, Emanuele Giacomini, Giorgio Grisetti, Luca Di Giammarino
Efficient and scalable 3D surface reconstruction from range data remains a core challenge in computer graphics and vision, particularly in real-time and resource-constrained scenarios. Traditional volumetric methods based on fixed-resolution voxel grids or hierarchical structures like octrees often suffer from memory inefficiency, computational overhead, and a lack of GPU support. We propose a novel variance-adaptive, multi-resolution voxel grid that dynamically adjusts voxel size based on the local variance of signed distance field (SDF) observations. Unlike prior multi-resolution approaches that rely on recursive octree structures, our method leverages a flat spatial hash table to store all voxel blocks, supporting constant-time access and full GPU parallelism. This design enables high memory efficiency and real-time scalability. We further demonstrate how our representation supports GPU-accelerated rendering through a parallel quad-tree structure for Gaussian Splatting, enabling effective control over splat density. Our open-source CUDA/C++ implementation achieves up to 13× speedup and 4× lower memory usage compared to fixed-resolution baselines while maintaining on-par reconstruction accuracy, offering a practical and extensible solution for high-performance 3D reconstruction.
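The combination of a flat hash table and variance-driven resolution can be illustrated with a minimal CPU sketch. Everything named here (`BlockStats`, `AdaptiveVoxelHash`, the threshold values, and the rule that higher SDF variance selects finer voxels) is an assumption made for illustration; the abstract does not specify the authors' CUDA data layout or their refinement criterion.

```python
import math
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Running SDF statistics for one voxel block (Welford's algorithm)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def add(self, sdf: float) -> None:
        self.count += 1
        delta = sdf - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (sdf - self.mean)

    @property
    def variance(self) -> float:
        return self.m2 / self.count if self.count > 1 else 0.0


class AdaptiveVoxelHash:
    """Flat hash table of voxel blocks keyed by integer block coordinates."""

    def __init__(self, block_size=0.4, var_thresholds=(1e-4, 1e-2)):
        self.block_size = block_size          # edge length of one block (hypothetical)
        self.var_thresholds = var_thresholds  # hypothetical refinement thresholds
        self.blocks = {}                      # dict lookup is O(1) on average, no tree traversal

    def key(self, x, y, z):
        s = self.block_size
        return (math.floor(x / s), math.floor(y / s), math.floor(z / s))

    def integrate(self, point, sdf):
        """Fold one SDF observation into the block that contains `point`."""
        self.blocks.setdefault(self.key(*point), BlockStats()).add(sdf)

    def voxel_size(self, point):
        """Voxel size for the block at `point`, or None if the block is unobserved."""
        stats = self.blocks.get(self.key(*point))
        if stats is None:
            return None
        # Assumption: each exceeded threshold halves the voxel size.
        level = sum(stats.variance > t for t in self.var_thresholds)
        return self.block_size / (2 ** level)
```

In this sketch a block whose observations agree keeps the coarse size, while a block with widely varying SDF values is subdivided twice; on a GPU the dictionary would be replaced by a fixed-capacity hash table, which is outside the scope of this example.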
Citations: 0
Voronoi Rooms: Dynamic Visibility Modulation of Overlapping Spaces for Telepresence
IF 6.2, Zone 1 (Computer Science), Q1 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-11-20, DOI: 10.1145/3777900
Taehei Kim, Jihun Shin, Hyeshim Kim, Hyuckjin Jang, Jiho Kang, Sung-Hee Lee
We propose a multi-user Mixed Reality (MR) telepresence system that allows users to interact by seamlessly visualizing remote environments and avatars overlaid onto their local physical space. Building on prior shared-space approaches, our method first aligns overlapping rooms to maximize a shared space, a common area containing matched real and virtual objects where all users can interact. Uniquely, our system extends beyond this shared space by visualizing non-shared spaces, the remaining part of each room, allowing users to inhabit these distinct areas. To address the issue of overlap between non-shared spaces, we dynamically adjust their visibility based on user proximity, using a Voronoi diagram to prioritize subspaces closer to each user. Visualizing the surrounding space of each user conveys spatial context, helping others interpret their behavior within their environment. Visibility is updated in real time as users move, maintaining a coherent sense of spatial awareness. Through a user study, we demonstrate that our system enhances enjoyment, spatial understanding, and presence compared to shared-space-only approaches. Quantitative results further show that our dynamic visibility modulation improves both personal space preservation and space accessibility relative to static methods. Overall, our system provides users with a seamless, dynamically connected, and shared multi-room environment.
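The proximity rule described above can be sketched as a nearest-user query, which is exactly membership in a Voronoi cell. The data layout (`users` as a dictionary with `pos` and `room` fields) and the hard, all-or-nothing assignment are assumptions for illustration; the system in the abstract modulates visibility and may blend between subspaces.

```python
import math


def nearest_user(point, users):
    """Return the id of the user closest to `point` (the owner of its Voronoi cell)."""
    return min(users, key=lambda uid: math.dist(point, users[uid]["pos"]))


def visible_room(point, users):
    """Room whose non-shared space is displayed at `point` in this sketch."""
    return users[nearest_user(point, users)]["room"]


def update_position(users, uid, new_pos):
    """Moving a user changes the Voronoi cells, so later queries reflect the move."""
    users[uid]["pos"] = new_pos
```

Because the query only depends on current positions, re-evaluating it each frame gives the real-time update mentioned in the abstract without storing an explicit diagram.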
Citations: 0
Spectral Theory of Light Transport Operators
IF 6.2, Zone 1 (Computer Science), Q1 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-11-04, DOI: 10.1145/3774756
Cyril Soler, Kartic Subr
Light Transport Operators (LTOs) represent a fundamental concept in computer graphics, modeling single bounces of light within a virtual environment as linear operators on infinite-dimensional spaces. While LTOs play a crucial role in rendering, prior studies have primarily focused on spectral analyses of the light field rather than the operators themselves. This paper presents a rigorous investigation into the spectral properties of the LTOs. Due to their non-compact nature, traditional spectral analysis techniques face challenges in this setting. However, many practical rendering methods effectively employ compact approximations, suggesting that non-compactness is not an absolute barrier. We show the relevance of such approximations and establish various path integral formulations of their spectrum. These findings enhance the theoretical understanding of light transport and offer new perspectives for improving rendering efficiency and accuracy.
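To make the idea of a compact approximation concrete, the sketch below replaces the operator with a small matrix (any finite-dimensional operator is compact), estimates its dominant eigenvalue by power iteration, and sums bounces as a Neumann series. The 2×2 matrix `K` and the emission vector are made-up numbers, and this radiosity-style discretization is an assumption for illustration, not the construction used in the paper.

```python
def apply(K, x):
    """One bounce: multiply the discretized transport matrix K by the vector x."""
    return [sum(k * xi for k, xi in zip(row, x)) for row in K]


def dominant_eigenvalue(K, iters=200):
    """Power iteration; assumes a non-negative K with a unique dominant eigenvalue."""
    x = [1.0] * len(K)
    lam = 0.0
    for _ in range(iters):
        y = apply(K, x)
        lam = max(abs(v) for v in y)  # infinity-norm estimate of the eigenvalue
        x = [v / lam for v in y]
    return lam


def neumann_solve(K, emission, bounces=100):
    """Sum emission + K emission + K^2 emission + ...; converges when the spectral radius is below 1."""
    radiance = list(emission)
    term = list(emission)
    for _ in range(bounces):
        term = apply(K, term)
        radiance = [r + t for r, t in zip(radiance, term)]
    return radiance
```

For `K = [[0.2, 0.3], [0.3, 0.2]]` the dominant eigenvalue is 0.5, so each additional bounce contributes roughly half the energy of the previous one and the series converges quickly.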
Citations: 0
NeuPPS: Neural Piecewise Parametric Surfaces
IF 6.2, Zone 1 (Computer Science), Q1 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-10-29, DOI: 10.1145/3771546
Lei Yang, Yongqing Liang, Xin Li, Congyi Zhang, Guying Lin, Cheng Lin, Alla Sheffer, Scott Schaefer, John Keyser, Wenping Wang
Piecewise parametric surfaces have long been established as prevalent geometric representations; however, they often require surface refinement or sophisticated quadrangulation to accurately represent complex geometries. Geometric deep learning has shown that neural networks can provide greater representational power than conventional methods. Nevertheless, approaches using a single parametric surface for shape fitting struggle to capture fine-grained geometric details, while multi-patch methods fail to ensure seamless connections between adjacent patches. We present Neural Piecewise Parametric Surfaces (NeuPPS), the first piecewise neural surface representation that allows for coarse patch layouts composed of arbitrary n-sided surface patches to model complex surface geometries with high precision, offering enhanced flexibility compared to traditional parametric surfaces. This new surface representation guarantees, by construction, the continuity between adjacent patches, a property that other neural patch-based approaches cannot ensure. Two novel components are introduced: a learnable feature complex and a continuous mapping function approximated by multi-layer perceptrons (MLPs). We apply the proposed NeuPPS to surface fitting and shape space learning tasks. Extensive experiments demonstrate the advantages of NeuPPS over traditional parametric representations and existing patch-based learning approaches.
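One way continuity "by construction" can arise is when adjacent patches read the same features along their shared boundary and pass them through the same continuous function. The toy sketch below shows this for two triangles sharing an edge. The vertex features, the barycentric blending, and the analytic `mapping` standing in for an MLP are all assumptions for illustration; the paper's feature complex and its handling of n-sided patches are not described in the abstract.

```python
import math

# Hypothetical learnable features attached to the vertices of a coarse patch layout.
features = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.0), "d": (1.0, 1.0)}
# Two triangular patches that share the edge b-c.
patches = {"left": ("a", "b", "c"), "right": ("d", "c", "b")}


def interpolate(patch, bary):
    """Blend the patch's three vertex features with barycentric weights."""
    verts = patches[patch]
    return tuple(sum(w * features[v][k] for w, v in zip(bary, verts)) for k in range(2))


def mapping(feature):
    """Stand-in for the MLP: any continuous function shared by all patches."""
    u, v = feature
    return (u, v, math.sin(u) * math.cos(v))


def evaluate(patch, bary):
    """Surface point of `patch` at barycentric coordinates `bary`."""
    return mapping(interpolate(patch, bary))
```

On the shared edge the weight of the opposite vertex is zero in both patches, so both evaluate the same blended feature and therefore the same surface point, regardless of what the mapping function is.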
Citations: 0
Robust Biharmonic Skinning Using Geometric Fields
IF 6.2, Zone 1 (Computer Science), Q1 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-10-28, DOI: 10.1145/3771928
Ana Dodik, Vincent Sitzmann, Justin Solomon, Oded Stein
Bounded biharmonic weights are a popular tool used to rig and deform characters for animation, to compute reduced-order simulations, and to define feature descriptors for geometry processing. They necessitate tetrahedralizing the volume bounded by the surface, introducing the possibility of meshing artifacts or tetrahedralization failure. We introduce a mesh-free and robust automatic skinning technique that generates weights comparable to the current state of the art, but works reliably even on open surfaces, triangle soups, and point clouds where current methods fail. We achieve this through the use of a specialized Lagrangian representation enabled by the advent of hardware ray-tracing, which circumvents the need for finite elements while optimizing the biharmonic energy and enforcing boundary conditions. The flexibility of our formulation allows us to integrate artistic control through weight painting during the optimization. We offer a thorough qualitative and quantitative evaluation of our method.
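For context on how skinning weights are consumed once computed, the sketch below applies linear blend skinning in 2D: each point is moved by every handle's transform and the results are averaged with the point's weights. It does not compute biharmonic weights; the weights are hand-picked, and the helper names `rotation` and `skin` are hypothetical.

```python
import math


def rotation(theta, pivot=(0.0, 0.0)):
    """Return a 2D rigid transform that rotates a point by `theta` about `pivot`."""
    c, s = math.cos(theta), math.sin(theta)
    px, py = pivot

    def transform(p):
        x, y = p[0] - px, p[1] - py
        return (c * x - s * y + px, s * x + c * y + py)

    return transform


def skin(point, weights, transforms):
    """Linear blend skinning: weighted average of the point under each handle's transform."""
    # Bounded weights: non-negative and summing to one at every point.
    assert all(w >= 0.0 for w in weights) and abs(sum(weights) - 1.0) < 1e-9
    moved = [t(point) for t in transforms]
    return tuple(sum(w * m[k] for w, m in zip(weights, moved)) for k in range(2))
```

A point weighted entirely to one handle follows that handle rigidly, while intermediate weights blend the motions, which is why smooth, bounded weights matter for deformation quality.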
Citations: 0
Emotion Manipulation for Talking-Head Videos via Facial Landmarks
IF 6.2, Zone 1 (Computer Science), Q1 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-10-13, DOI: 10.1145/3770576
Kwanggyoon Seo, Rene Culaway, Byeong-Uk Lee, Junyong Noh
Manipulating the emotion of a performer in a video is a challenging task. The lip motion needs to be preserved while performing the desired changes in the emotion of the subject; however, simply utilizing existing image-based editing methods sabotages the original lip synchronization. We tackle this problem by utilizing a pretrained StyleGAN paired with a landmark-based editing module that modifies the bias present in the edit direction used in image manipulation. The proposed editing module consists of a latent-based landmark detection network and an editing network that modifies the editing direction to match the original lip synchronization while preserving the desired emotion manipulation results. This is realized by taking the facial landmarks as control points. Both networks operate on the latent space, which enables fast training and inference. We show that the proposed method runs significantly faster and performs better in terms of visual quality than alternative approaches, which was validated through a perceptual study. The proposed method can also be extended to perform face reenactment to generate a talking-head video from a single image and face image manipulation using facial landmarks as control points.
Citations: 0
Radiance Fields from Photons
IF 6.2, Zone 1 (Computer Science), Q1 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-10-07, DOI: 10.1145/3770578
Sacha Jungerman, Aryan Garg, Mohit Gupta
Neural radiance fields, or NeRFs, have become the de facto approach for high-quality view synthesis from a collection of images captured from multiple viewpoints. However, many issues remain when capturing images in-the-wild under challenging conditions, such as in low light, high dynamic range, or with rapid motion, leading to smeared reconstructions with noticeable artifacts. In this work, we introduce quanta radiance fields, a novel class of neural radiance fields that are trained at the granularity of individual photons using single-photon cameras (SPCs). We develop theory and practical computational techniques for building radiance fields and estimating dense camera poses from unconventional, stochastic, and high-speed binary frame sequences captured by SPCs. We demonstrate, both via simulations and an SPC hardware prototype, high-fidelity reconstructions under high-speed motion, in low light, and for extreme dynamic range settings.
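The stochastic binary frames mentioned above can be illustrated with a commonly used single-photon imaging model: if photon arrivals are Poisson with mean `flux` per pixel per frame, a pixel reads 1 with probability 1 − exp(−flux). The sketch simulates such frames and inverts the model to estimate flux. The assumptions (no dark counts, ideal quantum efficiency, a static pixel) and the function names are illustrative, and this is not necessarily the objective used to train the paper's radiance fields.

```python
import math
import random


def simulate_binary_frames(flux, n_frames, rng):
    """Draw binary detections for one pixel; 1 means at least one photon arrived."""
    p_detect = 1.0 - math.exp(-flux)
    return [1 if rng.random() < p_detect else 0 for _ in range(n_frames)]


def estimate_flux(frames):
    """Maximum-likelihood flux estimate from the fraction of frames that fired."""
    mean = sum(frames) / len(frames)
    # Clamp so a fully saturated pixel does not produce log(0).
    mean = min(mean, 1.0 - 1.0 / (2 * len(frames)))
    return -math.log(1.0 - mean)
```

Averaging many binary frames and applying the logarithm recovers a linear intensity estimate, which hints at why such sensors can cope with both low light and high dynamic range.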
Citations: 0
CHOICE: Coordinated Human-Object Interaction in Cluttered Environments for Pick-and-Place Actions
IF 6.2, Zone 1 (Computer Science), Q1 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-10-02, DOI: 10.1145/3770746
Jintao Lu, He Zhang, Yuting Ye, Takaaki Shiratori, Sebastian Starke, Taku Komura
Animating human-scene interactions such as picking and placing a wide range of objects with different geometries is a challenging task, especially in a cluttered environment where interactions with complex articulated containers are involved. The main difficulty lies in the sparsity of the motion data compared to the wide variation of the objects and environments, as well as the poor availability of transition motions between different actions, both of which make generalization to arbitrary conditions harder. To cope with this issue, we develop a system that tackles the interaction synthesis problem as a hierarchical goal-driven task. First, we develop a bimanual scheduler that plans a set of keyframes for simultaneously controlling the two hands to efficiently achieve the pick-and-place task from an abstract goal signal such as the target object selected by the user. Next, we develop a neural implicit planner that generates hand trajectories to guide reaching and leaving motions across diverse object shapes/types and obstacle layouts. Finally, we propose a linear dynamic model for our DeepPhase controller that incorporates a Kalman filter to enable smooth transitions in the frequency domain, resulting in a more realistic and effective multi-objective control of the character. Our system can synthesize a rich variety of natural pick-and-place movements that adapt to different object geometries, container articulations, and scene layouts.
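As background for the Kalman-filter component, the sketch below runs a generic scalar Kalman filter with identity dynamics over a sequence of noisy targets. The noise variances `q` and `r` are arbitrary values chosen for illustration, and the filter in the paper operates on DeepPhase's frequency-domain features with a learned linear model, which this example does not attempt to reproduce.

```python
def kalman_step(x, p, z, q=1e-3, r=1e-1):
    """One predict/update cycle for a scalar state with identity dynamics.

    x, p: prior estimate and its variance; z: new measurement;
    q, r: process and measurement noise variances (hypothetical values).
    """
    p_pred = p + q                 # predict: uncertainty grows by the process noise
    gain = p_pred / (p_pred + r)   # Kalman gain: how much to trust the measurement
    x_new = x + gain * (z - x)     # update: move the estimate toward the measurement
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


def smooth(measurements, x0=0.0, p0=1.0):
    """Filter a sequence of measurements and return the running estimates."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        x, p = kalman_step(x, p, z)
        estimates.append(x)
    return estimates
```

When the target switches abruptly, the estimate approaches the new value gradually instead of jumping, which is the kind of smooth transition the abstract attributes to the filter.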
Citations: 0
RemixFusion: Residual-based Mixed Representation for Large-scale Online RGB-D Reconstruction
IF 6.2, Zone 1 (Computer Science), Q1 (COMPUTER SCIENCE, SOFTWARE ENGINEERING), Pub Date: 2025-09-19, DOI: 10.1145/3769007
Yuqing Lan, Chenyang Zhu, Shuaifeng Zhi, Jiazhao Zhang, Zhoufeng Wang, Renjiao Yi, Yijie Wang, Kai Xu
The introduction of the neural implicit representation has notably propelled the advancement of online dense reconstruction techniques. Compared to traditional explicit representations, such as TSDF, it substantially improves the mapping completeness and memory efficiency. However, the lack of reconstruction details and the time-consuming learning of neural representations hinder the widespread application of neural-based methods to large-scale online reconstruction. We introduce RemixFusion, a novel residual-based mixed representation for scene reconstruction and camera pose estimation dedicated to high-quality and large-scale online RGB-D reconstruction. In particular, we propose a residual-based map representation comprised of an explicit coarse TSDF grid and an implicit neural module that produces residuals representing fine-grained details to be added to the coarse grid. Such mixed representation allows for detail-rich reconstruction with bounded time and memory budget, contrasting with the overly-smoothed results by the purely implicit representations, thus paving the way for high-quality camera tracking. Furthermore, we extend the residual-based representation to handle multi-frame joint pose optimization via bundle adjustment (BA). In contrast to the existing methods, which optimize poses directly, we opt to optimize pose changes. Combined with a novel technique for adaptive gradient amplification, our method attains better optimization convergence and global optimality. Furthermore, we adopt a local moving volume to factorize the whole mixed scene representation with a divide-and-conquer design to facilitate efficient online learning in our residual-based framework. Extensive experiments demonstrate that our method surpasses all state-of-the-art ones, including those based either on explicit or implicit representations, in terms of the accuracy of both mapping and tracking on large-scale scenes.
Citations: 0
Local Surface Parameterizations via Smoothed Geodesic Splines
IF 6.2 Zone 1 Computer Science Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-09-17 DOI: 10.1145/3767323
Abhishek Madan, David Levin
We present a general method for computing local parameterizations rooted at a point on a surface, where the surface is described only through a signed implicit function and a corresponding projection function. Using a two-stage process, we compute several points radially emanating from the map origin, and interpolate between them with a spline surface. The narrow interface of our method allows it to support several kinds of geometry such as signed distance functions, general analytic implicit functions, triangle meshes, neural implicits, and point clouds. We demonstrate the high quality of our generated parameterizations on a variety of examples, and show applications in local texturing and surface curve drawing.
Citations: 0