Latest Publications in IEEE Transactions on Visualization and Computer Graphics

Two-Handed Click and Tap: Expanding Input Vocabulary of Controllers for Virtual Reality Interaction.
IF 6.5 Pub Date: 2026-02-01 DOI: 10.1109/TVCG.2025.3624569
Huawei Tu, BoYu Gao, Yujun Lu, Weiqiang Xin, Hui Cui, Weiqi Luo, Jian Weng, Henry Been-Lirn Duh

This study explores the design space of two-handed input (i.e., clicking or tapping with the thumb) on the touchpads of controllers for virtual reality (VR) interaction. Four experiments were conducted to fulfill this purpose. Experiment 1 investigated how users employed two VR controllers to perform four representative interaction tasks in VR and identified 14 potentially usable two-handed operations that involved tapping or clicking. Experiments 2 and 3 analyzed user performance of the 14 operations, providing insights into their interaction characteristics in terms of completion time, accuracy, and subjective feedback. In Experiment 4, we designed a command-input technique based on the proposed operations. We verified its effectiveness compared to context menus and marking menus in a VR text entry scenario. Our technique generally had shorter times and similar accuracy to the two menu types. Our work contributes to the design of VR interactions using two-handed controllers.

Citations: 0
Zero-Shot Video Translation via Token Warping.
IF 6.5 Pub Date: 2026-02-01 DOI: 10.1109/TVCG.2025.3636949
Haiming Zhu, Yangyang Xu, Jun Yu, Shengfeng He

With the revolution of generative AI, video-related tasks have been widely studied. However, current state-of-the-art video models still lag behind image models in visual quality and user control over generated content. In this paper, we introduce TokenWarping, a novel framework for temporally coherent video translation. Existing diffusion-based video editing approaches rely solely on key and value patches in self-attention to ensure temporal consistency, often sacrificing the preservation of local and structural regions. Critically, these methods overlook the significance of the query patches in achieving accurate feature aggregation and temporal coherence. In contrast, TokenWarping leverages complementary token priors by constructing temporal correlations across different frames. Our method begins by extracting optical flows from source videos. During the denoising process of the diffusion model, these optical flows are used to warp the previous frame's query, key, and value patches, aligning them with the current frame's patches. By directly warping the query patches, we enhance feature aggregation in self-attention, while warping the key and value patches ensures temporal consistency across frames. This token warping imposes explicit constraints on the self-attention layer outputs, effectively ensuring temporally coherent translation. Our framework does not require any additional training or fine-tuning and can be seamlessly integrated with existing text-to-image editing methods. We conduct extensive experiments on various video translation tasks, demonstrating that TokenWarping surpasses state-of-the-art methods both qualitatively and quantitatively. Video demonstrations are available in supplementary materials.
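
To make the warping step concrete, here is a minimal sketch (our own simplification, not the authors' code) of flow-guided token warping in a self-attention layer: the previous frame's key and value tokens are resampled along a backward optical flow so that the current frame attends to temporally aligned tokens. The paper warps the query patches as well; the function names, tensor shapes, and flow conventions below are our assumptions.

```python
import torch
import torch.nn.functional as F

def warp_tokens(tokens, flow):
    """Backward-warp (B, H*W, C) tokens of the previous frame by an optical flow (B, 2, H, W) in pixels."""
    B, N, C = tokens.shape
    _, _, H, W = flow.shape
    assert N == H * W
    feat = tokens.transpose(1, 2).reshape(B, C, H, W)
    # For each current-frame location (x, y), sample the previous frame at (x + flow_x, y + flow_y),
    # with coordinates normalized to [-1, 1] as required by grid_sample.
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(flow)           # (2, H, W)
    coords = base.unsqueeze(0) + flow                               # (B, 2, H, W)
    gx = 2.0 * coords[:, 0] / max(W - 1, 1) - 1.0
    gy = 2.0 * coords[:, 1] / max(H - 1, 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                            # (B, H, W, 2)
    warped = F.grid_sample(feat, grid, align_corners=True)
    return warped.reshape(B, C, N).transpose(1, 2)                  # (B, N, C)

def cross_frame_attention(q_cur, k_cur, v_cur, k_prev, v_prev, flow):
    """Current-frame queries attend to current tokens plus flow-aligned previous-frame tokens."""
    k = torch.cat([k_cur, warp_tokens(k_prev, flow)], dim=1)
    v = torch.cat([v_cur, warp_tokens(v_prev, flow)], dim=1)
    attn = torch.softmax(q_cur @ k.transpose(1, 2) / q_cur.shape[-1] ** 0.5, dim=-1)
    return attn @ v

# Toy usage with random tensors standing in for diffusion U-Net activations.
B, H, W, C = 1, 16, 16, 64
q, k, v = (torch.randn(B, H * W, C) for _ in range(3))
k_prev, v_prev = (torch.randn(B, H * W, C) for _ in range(2))
flow = torch.zeros(B, 2, H, W)   # zero flow: warped tokens equal the previous-frame tokens
print(cross_frame_attention(q, k, v, k_prev, v_prev, flow).shape)  # torch.Size([1, 256, 64])
```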

Citations: 0
PFF-Net: Patch Feature Fitting for Point Cloud Normal Estimation.
IF 6.5 Pub Date: 2026-02-01 DOI: 10.1109/TVCG.2025.3638450
Qing Li, Huifang Feng, Kanle Shi, Yue Gao, Yi Fang, Yu-Shen Liu, Zhizhong Han

Estimating the normal of a point requires constructing a local patch to provide center-surrounding context, but determining the appropriate neighborhood size is difficult when dealing with different data or geometries. Existing methods commonly employ various parameter-heavy strategies to extract a full feature description from the input patch. However, they still have difficulties in accurately and efficiently predicting normals for various point clouds. In this work, we present a new idea of feature extraction for robust normal estimation of point clouds. We use the fusion of multi-scale features from different neighborhood sizes to address the issue of selecting reasonable patch sizes for various data or geometries. We seek to model a patch feature fitting (PFF) based on multi-scale features to approximate the optimal geometric description for normal estimation and implement the approximation process via multi-scale feature aggregation and cross-scale feature compensation. The feature aggregation module progressively aggregates the patch features of different scales to the center of the patch and shrinks the patch size by removing points far from the center. It not only enables the network to precisely capture structural characteristics over a wide range, but also describes highly detailed geometries. The feature compensation module ensures the reusability of features from earlier large-scale layers and reveals associated information across different patch sizes. Our approximation strategy based on aggregating the features of multiple scales enables the model to achieve scale adaptation of varying local patches and deliver the optimal feature description. Extensive experiments demonstrate that our method achieves state-of-the-art performance on both synthetic and real-world datasets with fewer network parameters and lower running time.
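
As a toy illustration of the multi-scale idea (our own sketch, not the authors' network), the snippet below gathers neighborhoods of several sizes around a query point and concatenates a simple hand-crafted per-scale descriptor; in PFF-Net the descriptors are learned and the aggregation is progressive rather than a plain concatenation.

```python
import torch

def knn_patch(points, center, k):
    """Return the k nearest neighbors of `center` in `points`, expressed in local coordinates."""
    d = torch.cdist(center.unsqueeze(0), points).squeeze(0)   # (N,) distances to the center
    idx = torch.topk(d, k, largest=False).indices
    return points[idx] - center                               # (k, 3)

def multi_scale_feature(points, center, scales=(16, 32, 64)):
    """Concatenate one descriptor per neighborhood size; a learned network would replace the covariance."""
    feats = []
    for k in scales:
        patch = knn_patch(points, center, k)
        cov = (patch.T @ patch) / k                            # 3x3 local covariance as a placeholder descriptor
        feats.append(cov.flatten())
    return torch.cat(feats)                                    # fused multi-scale feature vector

pts = torch.randn(2048, 3)
print(multi_scale_feature(pts, pts[0]).shape)                  # torch.Size([27])
```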

Citations: 0
InterMamba: Efficient Human-Human Interaction Generation With Adaptive Spatio-Temporal Mamba.
IF 6.5 Pub Date: 2026-02-01 DOI: 10.1109/TVCG.2025.3635116
Zizhao Wu, Yingying Sun, Yiming Chen, Xiaoling Gu, Ruyu Liu, Jiazhou Chen

Human-human interaction generation has garnered significant attention in motion synthesis due to its vital role in understanding humans as social beings. However, existing methods typically rely on transformer-based architectures, which often face challenges related to scalability and efficiency. To address these challenges, we propose InterMamba, a novel and efficient human-human interaction generation method built on the Mamba framework, designed to capture long-sequence dependencies effectively while enabling real-time feedback. Specifically, we introduce an adaptive spatio-temporal Mamba framework that utilizes two parallel SSM branches with an adaptive mechanism to integrate the spatial and temporal features of motion sequences. To further enhance the model's ability to capture dependencies within individual motion sequences and the interactions between different individual sequences, we develop two key modules: the self-adaptive spatio-temporal Mamba module and the cross-adaptive spatio-temporal Mamba module, enabling efficient feature learning. Extensive experiments demonstrate that our method achieves state-of-the-art results on both interaction datasets with remarkable quality and efficiency. Compared to the baseline method InterGen, our approach not only improves accuracy but also reduces the parameter size to just 66 M (36% of InterGen's), while achieving an average inference time of 0.57 seconds, which is 46% of InterGen's execution time.
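
The snippet below is a rough structural sketch, assuming a gating-style "adaptive mechanism": two parallel sequence branches, one scanning over frames and one over joints, are fused per token by a learned weight. We substitute nn.GRU for the Mamba/SSM blocks, since the original state-space layers are not reproduced here; all module names are our own.

```python
import torch
import torch.nn as nn

class AdaptiveSpatioTemporalBlock(nn.Module):
    """Two parallel sequence branches fused by an adaptive gate (GRUs stand in for SSM/Mamba blocks)."""
    def __init__(self, dim):
        super().__init__()
        self.temporal = nn.GRU(dim, dim, batch_first=True)   # scans each joint's trajectory over frames
        self.spatial = nn.GRU(dim, dim, batch_first=True)    # scans the joints within each frame
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, x):                                     # x: (batch, frames, joints, dim)
        b, t, j, d = x.shape
        xt, _ = self.temporal(x.permute(0, 2, 1, 3).reshape(b * j, t, d))
        xt = xt.reshape(b, j, t, d).permute(0, 2, 1, 3)       # back to (b, t, j, d)
        xs, _ = self.spatial(x.reshape(b * t, j, d))
        xs = xs.reshape(b, t, j, d)
        g = self.gate(torch.cat([xt, xs], dim=-1))            # per-token adaptive weight in [0, 1]
        return g * xt + (1 - g) * xs

block = AdaptiveSpatioTemporalBlock(dim=32)
print(block(torch.randn(2, 60, 22, 32)).shape)                # torch.Size([2, 60, 22, 32])
```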

Citations: 0
2DGH: 2D Gaussian-Hermite Splatting for High-Quality Rendering and Better Geometry Features.
IF 6.5 Pub Date: 2026-02-01 DOI: 10.1109/TVCG.2025.3622157
Ruihan Yu, Tianyu Huang, Jingwang Ling, Feng Xu

2D Gaussian Splatting has recently emerged as a significant method in 3D reconstruction, enabling novel view synthesis and geometry reconstruction simultaneously. While the well-known Gaussian kernel is broadly used, its lack of anisotropy and deformation ability leads to dim and vague edges at object silhouettes, limiting the reconstruction quality of current Gaussian splatting methods. To enhance the representation power, we draw inspiration from quantum physics and propose to use the Gaussian-Hermite kernel as the new primitive in Gaussian splatting. The new kernel takes a unified mathematical form and extends the Gaussian function, which serves as the zero-rank special case in the updated general formulation. Our experiments demonstrate that the proposed Gaussian-Hermite kernel achieves improved performance over traditional Gaussian Splatting kernels on both geometry reconstruction and novel-view synthesis tasks. Specifically, on the DTU dataset, our method yields more accurate geometry reconstruction, while on datasets such as MipNeRF360 and our customized Detail dataset, it achieves better results in novel-view synthesis. These results highlight the potential of the Gaussian-Hermite kernel for high-quality 3D reconstruction and rendering.
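
For reference, the standard (unnormalized) separable Gaussian-Hermite functions take the form below; the paper's exact anisotropic parameterization may differ, but the property the abstract refers to is visible here: the zero-order member reduces to the ordinary Gaussian kernel.

```latex
% Physicists' Hermite polynomials: H_0 = 1, H_1(t) = 2t, H_2(t) = 4t^2 - 2, ...
\[
  \psi_{m,n}(x, y) \;=\; H_m(x)\, H_n(y)\, \exp\!\Big(-\tfrac{x^2 + y^2}{2}\Big),
\]
\[
  \psi_{0,0}(x, y) \;=\; \exp\!\Big(-\tfrac{x^2 + y^2}{2}\Big)
  \quad \text{(the plain Gaussian kernel as the zero-order special case).}
\]
```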

Citations: 0
A Neural Field-Based Approach for View Computation & Data Exploration in 3D Urban Environments.
IF 6.5 Pub Date: 2026-02-01 DOI: 10.1109/TVCG.2025.3635528
Stefan Cobeli, Kazi Shahrukh Omar, Rodrigo Valenca, Nivan Ferreira, Fabio Miranda

Despite the growing availability of 3D urban datasets, extracting insights remains challenging due to computational bottlenecks and the complexity of interacting with data. In fact, the intricate geometry of 3D urban environments results in high degrees of occlusion and requires extensive manual viewpoint adjustments that make large-scale exploration inefficient. To address this, we propose a view-based approach for 3D data exploration, where a vector field encodes views from the environment. To support this approach, we introduce a neural field-based method that constructs an efficient implicit representation of 3D environments. This representation enables both faster direct queries, which consist of the computation of view assessment indices, and inverse queries, which help avoid occlusion and facilitate the search for views that match desired data patterns. Our approach supports key urban analysis tasks such as visibility assessments, solar exposure evaluation, and assessing the visual impact of new developments. We validate our method through quantitative experiments, case studies informed by real-world urban challenges, and feedback from domain experts. Results show its effectiveness in finding desirable viewpoints, analyzing building facade visibility, and evaluating views from outdoor spaces.

Code and data are publicly available at urbantk.org/neural-3d.
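
The sketch below illustrates the direct/inverse query idea under our own simplifying assumptions: a plain MLP over viewpoint position and view direction stands in for the neural field, and the inverse query is a gradient ascent over positions. The actual view-assessment indices and representation in the paper are more elaborate.

```python
import torch
import torch.nn as nn

# A small neural field mapping a viewpoint (position + direction) to a scalar view-assessment index.
view_field = nn.Sequential(
    nn.Linear(6, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 1),                       # e.g., a visibility or sky-exposure score
)

def direct_query(pos, view_dir):
    """Evaluate the view index at given viewpoints (a single fast forward pass)."""
    return view_field(torch.cat([pos, view_dir], dim=-1))

def inverse_query(view_dir, steps=100, lr=0.05):
    """Search for a viewpoint position that maximizes the index for a fixed view direction."""
    pos = torch.zeros(1, 3, requires_grad=True)
    opt = torch.optim.Adam([pos], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        (-direct_query(pos, view_dir).mean()).backward()
        opt.step()
    return pos.detach()

print(inverse_query(torch.tensor([[0.0, 0.0, 1.0]])))
```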
Citations: 0
Optimizing Parameters for Static Equilibrium of Discrete Elastic Rods With Active-Set Cholesky.
IF 6.5 Pub Date: 2026-02-01 DOI: 10.1109/TVCG.2025.3622483
Tetsuya Takahashi, Christopher Batty

We propose a parameter optimization method for achieving static equilibrium of discrete elastic rods. Our method simultaneously optimizes material stiffness and rest shape parameters under box constraints to exactly enforce zero net forces while avoiding stability issues and violations of physical laws. For efficiency, we split our constrained optimization problem into primal and dual subproblems via the augmented Lagrangian method, while handling the dual maximization subproblem via simple vector updates. To efficiently solve the box-constrained primal minimization subproblem, we propose a new active-set Cholesky preconditioner for variants of conjugate gradient solvers with active sets. Our method surpasses prior work in generality, robustness, and speed.
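
The snippet below sketches only the generic augmented Lagrangian structure described above, applied to a toy problem: a box-constrained primal minimization alternating with simple dual vector updates. It uses an off-the-shelf L-BFGS-B solver for the primal subproblem rather than the paper's active-set Cholesky preconditioned conjugate gradient; the objective and constraint are placeholders.

```python
import numpy as np
from scipy.optimize import minimize

x0 = np.array([0.8, 0.4])                   # initial parameters (e.g., rest-shape values)
bounds = [(0.0, 1.0), (0.0, 1.0)]           # box constraints on the parameters

def objective(x):                           # stay close to the initial parameters
    return 0.5 * np.sum((x - x0) ** 2)

def constraint(x):                          # toy equality constraint standing in for "zero net force"
    return np.array([x[0] + 2.0 * x[1] - 1.0])

lam = np.zeros(1)                           # dual variables (Lagrange multipliers)
rho = 10.0                                  # penalty weight
x = x0.copy()
for _ in range(20):
    def aug_lagrangian(z):                  # primal: minimize over the box
        c = constraint(z)
        return objective(z) + lam @ c + 0.5 * rho * (c @ c)
    x = minimize(aug_lagrangian, x, method="L-BFGS-B", bounds=bounds).x
    lam = lam + rho * constraint(x)         # dual: simple vector update

print(x, constraint(x))                     # constraint residual approaches zero
```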

Citations: 0
Reevaluating the Gaze Cursor in Virtual Reality: A Comparative Analysis of Cursor Visibility, Confirmation Mechanisms, and Task Paradigms.
IF 6.5 Pub Date: 2026-02-01 DOI: 10.1109/TVCG.2025.3622042
Yushi Wei, Rongkai Shi, Sen Zhang, Anil Ufuk Batmaz, Pan Hui, Hai-Ning Liang

Cursors and how they are presented significantly influence user experience in both VR and non-VR environments by shaping how users interact with and perceive interfaces. In traditional interfaces, cursors serve as a fundamental component for translating human movement into digital interactions, enhancing interaction accuracy, efficiency, and experience. The design and visibility of cursors can affect users' ability to locate interactive elements and understand system feedback. In VR, cursor manipulation is more complex than in non-VR environments, as it can be controlled through hand, head, and gaze movements. With the arrival of the Apple Vision Pro, the use of gaze-controlled non-visible cursors has gained some prominence. However, there has been limited exploration of the effect of this type of cursor. This work presents a comprehensive study of the effects of cursor visibility (visible versus invisible) in gaze-based interactions within VR environments. Through two user studies, we investigate how cursor visibility impacts user performance and experience across different confirmation mechanisms and tasks. The first study focuses on selection tasks, examining the influence of target width, movement amplitude, and three common confirmation methods (air tap, blinking, and dwell). The second study explores pursuit tasks, analyzing cursor effects under varying movement speeds. Our findings reveal that cursor visibility significantly affects both objective performance metrics and subjective user preferences, but these effects vary depending on the confirmation mechanism used and task type. We propose eight design implications based on our empirical results to guide the future development of gaze-based interfaces in VR. These insights highlight the importance of tailoring cursor metaphors to specific interaction tasks and provide practical guidance for researchers and developers in optimizing VR user interfaces.

Citations: 0
Hierarchical Bayesian Guided Spatial-, Angular- and Temporal-Consistent View Synthesis.
IF 6.5 Pub Date: 2026-02-01 DOI: 10.1109/TVCG.2025.3631702
Junyu Zhu, Hao Zhu, Sheng Wang, Zhan Ma, Xun Cao

Neural Radiance Fields (NeRF) have gained significant attention due to their precise reconstruction and rapid inference capabilities, making them highly promising for applications in virtual reality and gaming. However, extending NeRF's capabilities to dynamic scenes remains underexplored, particularly in ensuring consistent and coherent reconstructions across space, time, and viewing angles. To address this challenge, we propose Scale-NeRF, a novel approach that organizes the training of dynamic NeRFs as a progressive, scale-based refinement process, grounded in hierarchical Bayesian theory. Scale-NeRF begins by reconstructing the radiance fields using coarse, large-scale frames and iteratively refines them with progressively smaller-scale frames. This hierarchical strategy, combined with a corresponding sampling approach and a newly introduced structural loss, ensures consistency and integrity throughout the reconstruction process. Experiments on public datasets validate the superiority of Scale-NeRF over traditional methods, especially in terms of the proposed metrics evaluating spatial, angular, and temporal consistency. Furthermore, Scale-NeRF demonstrates excellent dynamic reconstruction capabilities with real-time rendering, offering a significant advancement for applications demanding both high fidelity and real-time performance.

Citations: 0
How Far is Too Far? The Trade-Off Between Selection Distance and Accuracy During Teleportation in Immersive Virtual Reality.
IF 6.5 Pub Date: 2026-02-01 DOI: 10.1109/TVCG.2025.3632345
Daniel Rupp, Tim Weissker, Matthias Wolwer, Torsten W Kuhlen, Daniel Zielasko

Target-selection-based teleportation is one of the most widely used and researched travel techniques in immersive virtual environments, requiring the user to specify a target location with a selection ray before being transported there. This work explores the influence of the maximum reach of the parabolic selection ray, modeled by different emission velocities of the projectile motion equation, and compares the resulting teleportation performance to a straight ray as the baseline. In a user study with 60 participants, we asked participants to teleport as far as possible while still remaining within accuracy constraints to understand how the theoretical implications of the projectile motion equation apply to a realistic VR use case. We found that a projectile emission velocity of 14 m/s (resulting in a maximal reach of 21.52 m) offered the best trade-off between selection distance and accuracy, with the straight ray performing worse. Our results demonstrate the necessity to carefully set and report the projectile emission velocity in future work, as it was shown to directly influence user-selected distance, selection errors, and controller height during selection.
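
As a worked example of the velocity/reach relationship above, the short script below maximizes the ideal projectile range over launch angles. The 1.6 m emission height is our assumption, chosen because it reproduces a maximal reach of roughly 21.5 m for an emission velocity of 14 m/s; the paper's exact emission model may differ.

```python
import numpy as np

g = 9.81                 # gravitational acceleration (m/s^2)
v = 14.0                 # emission velocity (m/s)
h = 1.6                  # assumed emission height above the floor (m)

angles = np.radians(np.linspace(1.0, 89.0, 2000))
# Horizontal distance travelled before the parabolic ray reaches the floor.
reach = (v * np.cos(angles) / g) * (
    v * np.sin(angles) + np.sqrt((v * np.sin(angles)) ** 2 + 2.0 * g * h)
)
best = reach.argmax()
print(f"maximal reach: {reach[best]:.2f} m at {np.degrees(angles[best]):.1f} deg")
# -> maximal reach: 21.52 m at 42.9 deg
```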

Citations: 0