
Latest Publications: ACM Transactions on Graphics

Glare Pattern Depiction: High-Fidelity Physical Computation and Physiologically-Inspired Visual Response
IF 6.2 · Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-12-04 · DOI: 10.1145/3763356
Yuxiang Sun, Gladimir V. G. Baranoski
When observing an intense light source, humans perceive dense radiating spikes known as glare/starburst patterns. These patterns are frequently used in computer graphics applications to enhance the perception of brightness (e.g., in games and films). Previous works have computed the physical energy distribution of glare patterns under daytime conditions using approximations like Fresnel diffraction. These techniques are capable of producing visually believable results, particularly when the pupil remains small. However, they are insufficient under nighttime conditions, when the pupil is significantly dilated and the assumptions behind the approximations no longer hold. To address this, we employ the Rayleigh-Sommerfeld diffraction solution, from which Fresnel diffraction is derived as an approximation, as our baseline reference. In pursuit of performance and visual quality, we also employ Ochoa's approximation and the Chirp Z transform to efficiently generate high-resolution results for computer graphics applications. By also taking into account background illumination and certain physiological characteristics of the human photoreceptor cells, particularly the visual threshold of light stimulus, we propose a framework capable of producing plausible visual depictions of glare patterns for both daytime and nighttime scenes.
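The Rayleigh-Sommerfeld reference solution mentioned above can be evaluated with FFTs via the angular spectrum method, which stays valid for the large pupils where Fresnel approximations break down. Below is a minimal sketch of that evaluation; the grid size, pupil diameter, wavelength, and propagation distance are illustrative assumptions, not values from the paper, and the paper's high-resolution zooming via Ochoa's approximation and the Chirp Z transform is omitted.

```python
import numpy as np

def angular_spectrum_propagate(u0, wavelength, z, dx):
    """Rayleigh-Sommerfeld propagation of a sampled aperture field u0
    over distance z via the angular spectrum method (illustrative)."""
    n = u0.shape[0]
    fx = np.fft.fftfreq(n, d=dx)                 # spatial frequencies (1/m)
    fx, fy = np.meshgrid(fx, fx, indexing="ij")
    arg = 1.0 - (wavelength * fx) ** 2 - (wavelength * fy) ** 2
    kz = 2j * np.pi / wavelength * np.sqrt(np.maximum(arg, 0.0))
    h = np.where(arg > 0, np.exp(kz * z), 0.0)   # drop evanescent waves
    return np.fft.ifft2(np.fft.fft2(u0) * h)

# Toy night-time pupil: dilated 8 mm circular aperture on a 1024^2 grid.
n, dx = 1024, 20e-6
x = (np.arange(n) - n / 2) * dx
X, Y = np.meshgrid(x, x, indexing="ij")
pupil = (X**2 + Y**2 <= (4e-3) ** 2).astype(complex)
field = angular_spectrum_propagate(pupil, 550e-9, 24e-3, dx)
irradiance = np.abs(field) ** 2                  # glare-pattern energy
```

Masking out the evanescent components (`arg > 0`) is what keeps this valid beyond the paraxial regime assumed by Fresnel diffraction.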
Citations: 0
Artifact-Resilient Real-Time Holography
IF 6.2 · Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-12-04 · DOI: 10.1145/3763361
Victor Chu, Oscar Pueyo-Ciutad, Ethan Tseng, Florian Schiffers, Grace Kuo, Nathan Matsuda, Alberto Redo-Sanchez, Douglas Lanman, Oliver Cossairt, Felix Heide
Holographic near-eye displays promise unparalleled depth cues, high-resolution imagery, and realistic three-dimensional parallax at a compact form factor, making them promising candidates for emerging augmented and virtual reality systems. However, existing holographic display methods often assume ideal viewing conditions and overlook real-world factors such as eye floaters and eyelashes—obstructions that can severely degrade perceived image quality. In this work, we propose a new metric that quantifies hologram resilience to artifacts and apply it to computer-generated holography (CGH) optimization. We call this Artifact-Resilient Holography (ARH). We begin by introducing a simulation method that models the effects of pre- and post-pupil obstructions on holographic displays. Our analysis reveals that eyebox regions dominated by low frequencies—produced especially by the smooth-phase holograms broadly adopted in recent holography work—are vulnerable to visual degradation from dynamic obstructions such as floaters and eyelashes. In contrast, random-phase holograms spread energy more uniformly across the eyebox spectrum, enabling them to diffract around obstructions without producing prominent artifacts. By characterizing a random-phase eyebox using the Rayleigh distribution, we derive a differentiable metric in the eyebox domain. We then apply this metric to train a real-time neural-network-based phase generator, enabling it to produce artifact-resilient 3D holograms that preserve visual fidelity across a range of practical viewing conditions—enhancing both robustness and user interactivity.
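The Rayleigh-distribution characterization of a random-phase eyebox suggests a simple penalty one can sketch directly: match the first two moments of the eyebox magnitudes against what a Rayleigh law predicts. This is a hand-rolled stand-in for the paper's differentiable metric; the function name and normalization are assumptions.

```python
import numpy as np

def rayleigh_eyebox_penalty(eyebox_field):
    """Moment-matching penalty on how far eyebox magnitudes deviate from
    a Rayleigh distribution; zero when the first two moments agree."""
    mag = np.abs(eyebox_field).ravel()
    sigma = np.sqrt(np.mean(mag ** 2) / 2.0)       # Rayleigh scale from E[m^2]
    predicted_mean = sigma * np.sqrt(np.pi / 2.0)  # Rayleigh E[m]
    return (np.mean(mag) - predicted_mean) ** 2 / (sigma ** 2 + 1e-12)

rng = np.random.default_rng(0)
good = rng.normal(size=(256, 256)) + 1j * rng.normal(size=(256, 256))
bad = np.ones((256, 256), dtype=complex)           # peaky, smooth-phase-like
print(rayleigh_eyebox_penalty(good))               # ~0 (complex Gaussian)
print(rayleigh_eyebox_penalty(bad))                # noticeably larger
```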
Citations: 0
Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation
IF 6.2 · Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-12-04 · DOI: 10.1145/3763330
Tianyu Huang, Wangguandong Zheng, Tengfei Wang, Yuhao Liu, Zhenwei Wang, Junta Wu, Jie Jiang, Hui Li, Rynson Lau, Wangmeng Zuo, Chunchao Guo
Real-world applications like video gaming and virtual reality often demand the ability to model 3D scenes that users can explore along custom camera trajectories. While significant progress has been made in generating 3D objects from text or images, creating long-range, 3D-consistent, explorable 3D scenes remains a complex and challenging problem. In this work, we present Voyager, a novel video diffusion framework that generates world-consistent 3D point-cloud sequences from a single image along a user-defined camera path. Unlike existing approaches, Voyager achieves end-to-end scene generation and reconstruction with inherent consistency across frames, eliminating the need for 3D reconstruction pipelines (e.g., structure-from-motion or multi-view stereo). Our method integrates three key components: 1) World-Consistent Video Diffusion: a unified architecture that jointly generates aligned RGB and depth video sequences, conditioned on existing world observations to ensure global coherence; 2) Long-Range World Exploration: an efficient world cache with point culling and auto-regressive inference with smooth video sampling, for iterative scene extension with context-aware consistency; and 3) Scalable Data Engine: a video reconstruction pipeline that automates camera pose estimation and metric depth prediction for arbitrary videos, enabling large-scale, diverse training data curation without manual 3D annotations. Collectively, these designs result in a clear improvement over existing methods in visual quality and geometric accuracy, with versatile applications. Code for this paper is available at https://github.com/Tencent-Hunyuan/HunyuanWorld-Voyager.
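As a rough illustration of the "world cache with point culling" component, the helper below keeps only cached world points that fall inside the current view frustum before the next autoregressive generation step. The function and its parameters are hypothetical; Voyager's actual cache also deals with occlusion and appearance.

```python
import numpy as np

def cull_world_cache(points, K, R, t, width, height, near=0.1, far=100.0):
    """Frustum-cull a cached point cloud (illustrative helper).
    points: (N,3) world coords; K: 3x3 intrinsics; R,t: world->camera."""
    cam = points @ R.T + t                         # world -> camera frame
    z = cam[:, 2]
    uv = cam @ K.T
    uv = uv[:, :2] / np.maximum(uv[:, 2:3], 1e-9)  # perspective divide
    keep = ((z > near) & (z < far) &
            (uv[:, 0] >= 0) & (uv[:, 0] < width) &
            (uv[:, 1] >= 0) & (uv[:, 1] < height))
    return points[keep]

# Example: camera at the origin looking down +z with a 512x512 image.
K = np.array([[500.0, 0, 256], [0, 500.0, 256], [0, 0, 1]])
pts = np.random.default_rng(0).uniform(-5, 5, size=(10000, 3))
visible = cull_world_cache(pts, K, np.eye(3), np.zeros(3), 512, 512)
```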
Citations: 0
Auto Hair Card Extraction for Smooth Hair with Differentiable Rendering
IF 6.2 · Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-12-04 · DOI: 10.1145/3763295
Zhongtian Zheng, Tao Huang, Haozhe Su, Xueqi Ma, Yuefan Shen, Tongtong Wang, Yin Yang, Xifeng Gao, Zherong Pan, Kui Wu
Hair cards remain a widely used representation for hair modeling in real-time applications, offering a practical trade-off between visual fidelity, memory usage, and performance. However, generating high-quality hair card models remains a challenging and labor-intensive task. This work presents an automated pipeline for converting strand-based hair models into hair card models with a limited number of cards and textures while preserving the hairstyle appearance. Our key idea is a novel differentiable representation where each strand is encoded as a projected 2D curve in the texture space, which enables end-to-end optimization with differentiable rendering while respecting the structures of the hair geometry. Based on this representation, we develop a novel algorithm pipeline, where we first cluster hair strands into initial hair cards and project the strands into the texture space. We then conduct a two-stage optimization, where our first stage optimizes the orientation of each hair card separately, and after strand projection, our second stage conducts joint optimization over the entire hair card model for fine-tuning. Our method is evaluated on a range of hairstyles, including straight, wavy, curly, and coily hair. To capture the appearance of short or coily hair, our method also supports hair caps and cross-cards.
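The initialization described above (cluster strands, fit a card, project strands into texture space as 2D curves) can be sketched with a PCA plane fit per cluster. This is a simplified stand-in for the paper's pipeline; the differentiable two-stage optimization is not shown.

```python
import numpy as np

def fit_card_and_project(strand_points):
    """Fit an oriented plane (hair card) to a cluster of strand points by
    PCA and project each point into the card's 2D texture space.
    strand_points: (N, 3) points from all strands in the cluster."""
    center = strand_points.mean(axis=0)
    _, _, vt = np.linalg.svd(strand_points - center, full_matrices=False)
    axes = vt[:2]                            # dominant directions span the card
    uv = (strand_points - center) @ axes.T   # projected 2D curves
    normal = vt[2]                           # card orientation (least variance)
    return center, axes, normal, uv

pts = np.random.default_rng(0).normal(size=(500, 3)) * [1.0, 0.3, 0.05]
center, axes, normal, uv = fit_card_and_project(pts)   # uv: (500, 2)
```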
Citations: 0
Resolution Where It Counts: Hash-based GPU-Accelerated 3D Reconstruction via Variance-Adaptive Voxel Grids
IF 6.2 · Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-11-20 · DOI: 10.1145/3777909
Lorenzo De Rebotti, Emanuele Giacomini, Giorgio Grisetti, Luca Di Giammarino
Efficient and scalable 3D surface reconstruction from range data remains a core challenge in computer graphics and vision, particularly in real-time and resource-constrained scenarios. Traditional volumetric methods based on fixed-resolution voxel grids or hierarchical structures like octrees often suffer from memory inefficiency, computational overhead, and a lack of GPU support. We propose a novel variance-adaptive, multi-resolution voxel grid that dynamically adjusts voxel size based on the local variance of signed distance field (SDF) observations. Unlike prior multi-resolution approaches that rely on recursive octree structures, our method leverages a flat spatial hash table to store all voxel blocks, supporting constant-time access and full GPU parallelism. This design enables high memory efficiency and real-time scalability. We further demonstrate how our representation supports GPU-accelerated rendering through a parallel quad-tree structure for Gaussian Splatting, enabling effective control over splat density. Our open-source CUDA/C++ implementation achieves up to 13× speedup and 4× lower memory usage compared to fixed-resolution baselines, while maintaining on-par reconstruction accuracy, offering a practical and extensible solution for high-performance 3D reconstruction.
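The core data structure reduces to a flat hash keyed by quantized coordinates, with a per-voxel running variance (Welford's update) that triggers refinement. The sketch below is a serial Python caricature with made-up sizes and threshold; the paper's implementation is a parallel CUDA hash table.

```python
import numpy as np

class AdaptiveSDFGrid:
    """Flat-hash voxel grid with per-voxel running SDF variance.
    High-variance voxels are re-inserted at half the cell size, i.e.
    resolution is spent only where observations disagree."""
    def __init__(self, coarse_size=0.16, fine_size=0.02, tau=1e-4):
        self.cells = {}          # (level, ix, iy, iz) -> [count, mean, M2]
        self.coarse, self.fine, self.tau = coarse_size, fine_size, tau

    def _key(self, p, size, level):
        return (level,) + tuple(int(np.floor(c / size)) for c in p)

    def insert(self, p, sdf, size=None, level=0):
        size = self.coarse if size is None else size
        stats = self.cells.setdefault(self._key(p, size, level), [0, 0.0, 0.0])
        stats[0] += 1
        delta = sdf - stats[1]
        stats[1] += delta / stats[0]
        stats[2] += delta * (sdf - stats[1])          # Welford update
        var = stats[2] / stats[0] if stats[0] > 1 else 0.0
        if var > self.tau and size / 2 >= self.fine:  # refine where it counts
            self.insert(p, sdf, size / 2, level + 1)
```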
Citations: 0
Voronoi Rooms: Dynamic Visibility Modulation of Overlapping Spaces for Telepresence
IF 6.2 · Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-11-20 · DOI: 10.1145/3777900
Taehei Kim, Jihun Shin, Hyeshim Kim, Hyuckjin Jang, Jiho Kang, Sung-Hee Lee
We propose a multi-user Mixed Reality (MR) telepresence system that allows users to interact by seamlessly visualizing remote environments and avatars overlaid onto their local physical space. Building on prior shared-space approaches, our method first aligns overlapping rooms to maximize a shared space—a common area containing matched real and virtual objects where all users can interact. Uniquely, our system extends beyond this shared space by visualizing non-shared spaces, the remaining part of each room, allowing users to inhabit these distinct areas. To address the issue of overlap between non-shared spaces, we dynamically adjust their visibility based on user proximity, using a Voronoi diagram to prioritize subspaces closer to each user. Visualizing the surrounding space of each user conveys spatial context, helping others interpret their behavior within their environment. Visibility is updated in real time as users move, maintaining a coherent sense of spatial awareness. Through a user study, we demonstrate that our system enhances enjoyment, spatial understanding, and presence compared to shared-space-only approaches. Quantitative results further show that our dynamic visibility modulation improves both personal space preservation and space accessibility relative to static methods. Overall, our system provides users with a seamless, dynamically connected, and shared multi-room environment.
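The proximity-based modulation can be caricatured in a few lines: assign each non-shared-space cell to its nearest user (its Voronoi owner) and fade visibility with distance. The falloff and dimming constants below are illustrative, not the paper's calibrated values.

```python
import numpy as np

def voronoi_visibility(cell_centers, user_positions, local_user, falloff=2.0):
    """Per-cell opacity of non-shared space: cells are shown mainly when
    the local user is their nearest user, fading with distance."""
    d = np.linalg.norm(cell_centers[:, None, :] - user_positions[None], axis=2)
    owner = d.argmin(axis=1)                 # Voronoi assignment per cell
    alpha = np.exp(-d[:, local_user] / falloff)
    alpha[owner != local_user] *= 0.2        # dim cells owned by other users
    return alpha

cells = np.random.default_rng(0).uniform(0, 10, size=(200, 3))
users = np.array([[1.0, 0.0, 0.0], [8.0, 0.0, 0.0]])
alpha = voronoi_visibility(cells, users, local_user=0)
```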
Citations: 0
Spectral Theory of Light Transport Operators 光输运算子的光谱理论
IF 6.2 · Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-11-04 · DOI: 10.1145/3774756
Cyril Soler, Kartic Subr
Light Transport Operators (LTOs) represent a fundamental concept in computer graphics, modeling single bounces of light within a virtual environment as linear operators on infinite-dimensional spaces. While the LTOs play a crucial role in rendering, prior studies have primarily focused on spectral analyses of the light field rather than the operators themselves. This paper presents a rigorous investigation into the spectral properties of the LTOs. Due to their non-compact nature, traditional spectral analysis techniques face challenges in this setting. However, many practical rendering methods effectively employ compact approximations, suggesting that non-compactness is not an absolute barrier. We show the relevance of such approximations and establish various path integral formulations of their spectrum. These findings enhance the theoretical understanding of light transport and offer new perspectives for improving rendering efficiency and accuracy.
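While the exact operator is non-compact, any discretized (hence compact) approximation has a well-defined dominant eigenvalue that controls convergence of the bounce (Neumann) series I + T + T^2 + .... A toy power-iteration probe is sketched below; the random "scene" matrix, whose rows redistribute 70% of incoming flux, is purely illustrative.

```python
import numpy as np

def spectral_radius(T, iters=200, seed=0):
    """Estimate the dominant eigenvalue of a discretized one-bounce
    transport matrix T by power iteration (Rayleigh quotient at the end)."""
    rng = np.random.default_rng(seed)
    v = rng.random(T.shape[0])
    for _ in range(iters):
        v = T @ v
        v /= np.linalg.norm(v)
    return v @ (T @ v)

rng = np.random.default_rng(1)
T = rng.random((64, 64))
T *= 0.7 / T.sum(axis=1, keepdims=True)   # each row scatters 70% of flux
print(spectral_radius(T))                 # ~0.7: the bounce series converges
```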
Citations: 0
NeuPPS: Neural Piecewise Parametric Surfaces
IF 6.2 · Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-10-29 · DOI: 10.1145/3771546
Lei Yang, Yongqing Liang, Xin Li, Congyi Zhang, Guying Lin, Cheng Lin, Alla Sheffer, Scott Schaefer, John Keyser, Wenping Wang
Piecewise parametric surfaces have long been established as prevalent geometric representations; however, they often require surface refinement or sophisticated quadrangulation to accurately represent complex geometries. Geometric deep learning has shown that neural networks can provide greater representational power than conventional methods. Nevertheless, approaches using a single parametric surface for shape fitting struggle to capture fine-grained geometric details, while multi-patch methods fail to ensure seamless connections between adjacent patches. We present Neural Piecewise Parametric Surfaces (NeuPPS), the first piecewise neural surface representation that allows for coarse patch layouts composed of arbitrary n-sided surface patches to model complex surface geometries with high precision, offering enhanced flexibility compared to traditional parametric surfaces. This new surface representation guarantees, by construction, the continuity between adjacent patches, a property that other neural patch-based approaches cannot ensure. Two novel components are introduced: a learnable feature complex and a continuous mapping function approximated by multi-layer perceptrons (MLPs). We apply the proposed NeuPPS to surface fitting and shape space learning tasks. Extensive experiments demonstrate the advantages of NeuPPS over traditional parametric representations and existing patch-based learning approaches.
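The continuity-by-construction property can be demonstrated with a toy analogue of the feature complex: adjacent patches reuse the same corner features, so the shared MLP receives identical inputs along a shared edge. Everything below (feature width, two-patch layout, random weights) is assumed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
F = 16                                                  # feature width
corners = {c: rng.normal(size=F) for c in "ABCDEF"}     # shared features
patches = [("A", "B", "D", "C"), ("C", "D", "F", "E")]  # share edge C-D
W1, b1 = rng.normal(size=(F, 32)) * 0.3, np.zeros(32)
W2, b2 = rng.normal(size=(32, 3)) * 0.3, np.zeros(3)

def surface(patch, u, v):
    """Bilinearly blend the patch's corner features, then map them through
    a shared MLP; shared edge features imply C0 continuity by construction."""
    c00, c10, c11, c01 = (corners[k] for k in patches[patch])
    f = (1-u)*(1-v)*c00 + u*(1-v)*c10 + u*v*c11 + (1-u)*v*c01
    return np.tanh(f @ W1 + b1) @ W2 + b2

# Edge v=1 of patch 0 and edge v=0 of patch 1 both interpolate C and D:
print(np.allclose(surface(0, 0.3, 1.0), surface(1, 0.3, 0.0)))  # True
```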
Citations: 0
Robust Biharmonic Skinning Using Geometric Fields 利用几何场鲁棒双谐蒙皮
IF 6.2 · Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-10-28 · DOI: 10.1145/3771928
Ana Dodik, Vincent Sitzmann, Justin Solomon, Oded Stein
Bounded biharmonic weights are a popular tool used to rig and deform characters for animation, to compute reduced-order simulations, and to define feature descriptors for geometry processing. They necessitate tetrahedralizing the volume bounded by the surface, introducing the possibility of meshing artifacts or tetrahedralization failure. We introduce a mesh-free and robust automatic skinning technique that generates weights comparable to the current state of the art, but works reliably even on open surfaces, triangle soups, and point clouds where current methods fail. We achieve this through the use of a specialized Lagrangian representation enabled by the advent of hardware ray-tracing, which circumvents the need for finite elements while optimizing the biharmonic energy and enforcing boundary conditions. The flexibility of our formulation allows us to integrate artistic control through weight painting during the optimization. We offer a thorough qualitative and quantitative evaluation of our method.
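A mesh-free toy version of the idea: estimate a Laplacian from a kNN graph on raw points and minimize the resulting discrete biharmonic energy subject to handle constraints. Unlike bounded biharmonic weights proper, this sketch drops the bound constraints, and it stands in for, rather than reproduces, the paper's ray-tracing-based Lagrangian representation.

```python
import numpy as np
from scipy.spatial import cKDTree

def biharmonic_weights(points, handle_idx, k=8):
    """Solve the discrete biharmonic system on a kNN graph with weights
    pinned to indicator values at the handles (no bound constraints)."""
    n = len(points)
    _, nbrs = cKDTree(points).query(points, k=k + 1)
    L = np.zeros((n, n))
    for i, row in enumerate(nbrs):
        L[i, row[1:]] = -1.0 / k        # uniform kNN graph Laplacian
        L[i, i] = 1.0
    Q = L.T @ L                                   # discrete biharmonic energy
    free = np.setdiff1d(np.arange(n), handle_idx)
    W = np.zeros((n, len(handle_idx)))
    W[handle_idx] = np.eye(len(handle_idx))       # handle boundary conditions
    rhs = -Q[np.ix_(free, handle_idx)] @ W[handle_idx]
    W[free] = np.linalg.solve(Q[np.ix_(free, free)], rhs)
    return W

pts = np.random.default_rng(0).random((300, 3))
W = biharmonic_weights(pts, handle_idx=np.array([0, 150, 299]))
```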
Citations: 0
Emotion Manipulation for Talking-Head Videos via Facial Landmarks
IF 6.2 · Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-10-13 · DOI: 10.1145/3770576
Kwanggyoon Seo, Rene Culaway, Byeong-Uk Lee, Junyong Noh
Manipulating the emotion of a performer in a video is a challenging task. The lip motion needs to be preserved while performing the desired changes in the emotion of the subject; however, simply utilizing existing image-based editing methods sabotages the original lip synchronization. We tackle this problem by utilizing a pretrained StyleGAN paired with a landmark-based editing module that modifies the bias present in the edit direction used in image manipulation. The proposed editing module consists of a latent-based landmark detection network and an editing network that modifies the editing direction to match the original lip synchronization while preserving the desired emotion manipulation results. This is realized by taking the facial landmarks as control points. Both networks operate on the latent space, which enables fast training and inference. We show that the proposed method runs significantly faster and performs better in terms of visual quality than alternative approaches, which was validated through a perceptual study. The proposed method can also be extended to perform face reenactment to generate a talking-head video from a single image and face image manipulation using facial landmarks as control points.
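The lip-preserving correction of an edit direction can be caricatured as a null-space projection against a lip-landmark Jacobian: remove from the emotion edit whatever component moves the lips, to first order. The Jacobian below is a random stand-in; the paper learns this correction with latent-space networks rather than computing it in closed form.

```python
import numpy as np

def lip_preserving_edit(d, J_lip):
    """Project an emotion edit direction d onto the null space of the
    lip-landmark Jacobian J_lip, so first-order lip motion is unchanged."""
    pinv = np.linalg.pinv(J_lip)
    return d - pinv @ (J_lip @ d)

rng = np.random.default_rng(0)
J_lip = rng.normal(size=(40, 512))     # 20 lip landmarks (x, y) vs. latent dim
d = rng.normal(size=512)               # raw "happy" edit direction
d_safe = lip_preserving_edit(d, J_lip)
print(np.linalg.norm(J_lip @ d_safe))  # ~0: lips stay put to first order
```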
Citations: 0