
IEEE Transactions on Visualization and Computer Graphics: Latest Publications

TR-Gaussians: High-fidelity Real-time Rendering of Planar Transmission and Reflection with 3D Gaussian Splatting.
IF 6.5 Pub Date : 2026-03-18 DOI: 10.1109/TVCG.2026.3675416
Yong Liu, Keyang Ye, Tianjia Shao, Kun Zhou

We propose Transmission-Reflection Gaussians (TR-Gaussians), a novel 3D-Gaussian-based representation for high-fidelity rendering of planar transmission and reflection, which are ubiquitous in indoor scenes. Our method combines 3D Gaussians with learnable reflection planes that explicitly model the glass planes with view-dependent reflectance strengths. Real scenes and transmission components are modeled by 3D Gaussians, and the reflection components are modeled by the mirrored Gaussians with respect to the reflection plane. The transmission and reflection components are blended according to a Fresnel-based, view-dependent weighting scheme, allowing for faithful synthesis of complex appearance effects under varying viewpoints. To effectively optimize TR-Gaussians, we develop a multi-stage optimization framework incorporating color and geometry constraints and an opacity perturbation mechanism. Experiments on different datasets demonstrate that TR-Gaussians achieve real-time, high-fidelity novel view synthesis in scenes with planar transmission and reflection, and outperform state-of-the-art approaches both quantitatively and qualitatively.
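
For intuition, a minimal sketch of the kind of Fresnel-based, view-dependent blending the abstract describes, written here with Schlick's approximation; the function names, refractive indices, and the choice of Schlick's form are illustrative assumptions, not details taken from the paper.

    import numpy as np

    def schlick_fresnel(cos_theta, n1=1.0, n2=1.5):
        # Schlick's approximation of Fresnel reflectance for an air/glass interface.
        r0 = ((n1 - n2) / (n1 + n2)) ** 2
        return r0 + (1.0 - r0) * (1.0 - cos_theta) ** 5

    def blend_transmission_reflection(c_trans, c_refl, view_dir, plane_normal):
        # View-dependent blend: the reflection weight grows toward grazing angles.
        cos_theta = np.clip(abs(float(np.dot(view_dir, plane_normal))), 0.0, 1.0)
        f = schlick_fresnel(cos_theta)
        return (1.0 - f) * np.asarray(c_trans) + f * np.asarray(c_refl)

    # Example: a near-grazing view of a glass plane, where reflection dominates.
    v = np.array([0.98, 0.0, 0.2]); v /= np.linalg.norm(v)
    color = blend_transmission_reflection([0.2, 0.3, 0.4], [0.9, 0.9, 0.9], v, np.array([0.0, 0.0, 1.0]))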

Citations: 0
How We Map Possibilities: Understanding Design Spaces for Visualization.
IF 6.5 Pub Date : 2026-03-18 DOI: 10.1109/TVCG.2026.3675300
Zichun Dai, Yechun Peng, Nan Cao, Yang Shi

Design spaces serve as conceptual frameworks that enable systematic exploration of possibilities and constraints for particular design problems. Despite growing recognition of their importance in visualization research, the community faces two main challenges: characterizing what constitutes a design space, given the lack of consensus on its definition, and determining how to construct these spaces in the absence of established methodologies. To address these challenges, we first conducted a literature review of visualization design space research, identifying three distinct research threads. Focusing on the thread that views design spaces as multi-dimensional frameworks, we refined our corpus to 49 papers and developed a unified conceptualization of design spaces. Building on this foundation, we proposed a systematic approach to design space construction, synthesized from an analysis of practices spanning five phases: exploration, data collection, creation, evaluation, and communication.

Citations: 0
Towards the Automatic Detection of Vection in Virtual Reality Using EEG.
IF 6.5 Pub Date : 2026-03-18 DOI: 10.1109/TVCG.2026.3675421
Gael Van der Lee, Anatole Lécuyer, Maxence Naud, Reinhold Scherer, François Cabestaing, Hakim Si-Mohammed

Vection, the visual illusion of self-motion, provides a strong marker of the VR user experience and plays an important role in both presence and cybersickness. Traditional measurements have been conducted using questionnaires, which exhibit inherent limitations due to their subjective nature and prevent real-time adjustments. Detecting vection in real time would allow VR systems to adapt to users' needs, improving comfort and minimizing negative effects like cybersickness. This paper investigates the presence of vection markers in electroencephalographic (EEG) brain signals using evoked potentials (brain responses to external stimuli). We designed a VR experiment that induces vection using two conditions: (1) forward acceleration and (2) backward acceleration. We recorded EEG signals and gathered subjective reports from thirty (30) participants. We found an evoked potential of vection characterized by a positive peak around 600 ms (P600) after stimulus onset in the parietal region and a simultaneous negative peak in the frontal region. This result paves the way for the automatic detection of vection using EEG as well as a better understanding of vection. It also provides insights into the functional role of the visual system and its integration with the vestibular system during motion perception. It has the potential to help enhance the VR user experience by qualifying users' perceived vection and adapting the VR environments accordingly.
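
As background on how such an evoked potential is usually extracted (a generic sketch, not the authors' pipeline): epoch the continuous EEG around each stimulus onset, baseline-correct against the pre-stimulus interval, and average. The window lengths and variable names below are assumptions.

    import numpy as np

    def evoked_potential(eeg, onset_samples, fs, tmin=-0.2, tmax=0.8):
        # eeg: (n_samples, n_channels) continuous recording; onset_samples: stimulus onsets in samples.
        pre, post = int(round(-tmin * fs)), int(round(tmax * fs))
        epochs = []
        for onset in onset_samples:
            if onset - pre < 0 or onset + post > eeg.shape[0]:
                continue  # skip epochs that run off the recording
            epoch = eeg[onset - pre:onset + post, :].astype(float)
            epoch -= epoch[:pre, :].mean(axis=0)  # baseline correction on the pre-stimulus window
            epochs.append(epoch)
        times = np.arange(-pre, post) / fs        # seconds relative to stimulus onset
        return times, np.mean(epochs, axis=0)     # (time, channels) averaged evoked response

A P600-like component would then appear as a positive deflection near 0.6 s in the parietal channels of the averaged response.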

Citations: 0
Fast and Robust Deformable 3D Gaussian Splatting.
IF 6.5 Pub Date : 2026-03-18 DOI: 10.1109/TVCG.2026.3675272
Han Jiao, Jiakai Sun, Lei Zhao, Zhanjie Zhang, Wei Xing, Huaizhong Lin

3D Gaussian Splatting has demonstrated remarkable real-time rendering capabilities and superior visual quality in novel view synthesis for static scenes. Building upon these advantages, researchers have progressively extended 3D Gaussians to dynamic scene reconstruction. Deformation field-based methods have emerged as a promising approach among various techniques. These methods maintain 3D Gaussian attributes in a canonical field and employ the deformation field to transform this field across temporal sequences. Nevertheless, these approaches frequently encounter challenges such as suboptimal rendering speeds, significant dependence on initial point clouds, and vulnerability to local optima in dim scenes. To overcome these limitations, we present FRoG, an efficient and robust framework for high-quality dynamic scene reconstruction. FRoG integrates per-Gaussian embedding with a coarse-to-fine temporal embedding strategy, accelerating rendering through the early fusion of temporal embeddings. Moreover, to enhance robustness against sparse initializations, we introduce a novel depth- and error-guided sampling strategy. This strategy populates the canonical field with new 3D Gaussians at low-deviation initial positions, significantly reducing the optimization burden on the deformation field and improving detail reconstruction in both static and dynamic regions. Furthermore, by modulating opacity variations, we mitigate the local optima problem in dim scenes, improving color fidelity. Comprehensive experimental results validate that our method achieves accelerated rendering speeds while maintaining state-of-the-art visual quality.
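
To make the "canonical field plus deformation field" idea concrete, here is a toy PyTorch sketch in which a per-Gaussian embedding and a time embedding are decoded into a position offset; the layer sizes, the simple MLP time embedding, and the offset-only output are assumptions rather than FRoG's actual architecture.

    import torch
    import torch.nn as nn

    class ToyDeformationField(nn.Module):
        def __init__(self, num_gaussians, embed_dim=32, time_dim=16, hidden=128):
            super().__init__()
            self.gaussian_embed = nn.Embedding(num_gaussians, embed_dim)  # per-Gaussian embedding
            self.time_mlp = nn.Sequential(nn.Linear(1, time_dim), nn.ReLU(),
                                          nn.Linear(time_dim, time_dim))
            self.decoder = nn.Sequential(
                nn.Linear(embed_dim + time_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 3))  # xyz offset from the canonical position

        def forward(self, gaussian_ids, t):
            e = self.gaussian_embed(gaussian_ids)                                # (N, embed_dim)
            te = self.time_mlp(t.view(1, 1)).expand(gaussian_ids.shape[0], -1)  # one timestamp, broadcast to all Gaussians
            return self.decoder(torch.cat([e, te], dim=-1))                     # (N, 3)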

Citations: 0
Dual-Branch Aesthetic Image Retouching Via Active Reinforcement Learning for Color Enhancement and Composition Optimization.
IF 6.5 Pub Date : 2026-03-17 DOI: 10.1109/TVCG.2026.3674534
Dong Liang, Yifan Liu, Yuanhang Gao, Sheng-Jun Huang, Songcan Chen

Existing learning-based visual retouching primarily focuses on improving image quality through end-to-end objective mapping between input and retouched images. However, these approaches often overlook two critical aspects: the progressive nature of image retouching and subjective aesthetic preferences, resulting in suboptimal visual outcomes. To address this, we introduce Automatic Aesthetic Image Retouching via active reinforcement learning (A$^{3}$RL) to enhance the visualization experience in two sub-tasks: color enhancement and composition optimization, which are formulated as a unified Markov Decision Process in the proposed A$^{3}$RL framework. In our approach, each pixel functions as an autonomous agent that determines optimal actions based on aesthetic guidance, engaging in online exploration through immediate pixel-wise and channel-wise feedback from the aesthetic environment. By leveraging a pretrained image aesthetic model, our method ensures that the A$^{3}$RL process aligns with human aesthetic preferences and adheres to subjective aesthetic principles. The framework integrates pixel-level retouching actions with image-level operations to achieve optimal image sequences through progressive iterations. Extensive experiments demonstrate that our method effectively recalibrates image aesthetics across multiple dimensions: low-level quality metrics (PSNR, SSIM), visual perception (LPIPS), and subjective visual experience (human survey). The results demonstrate high consistency with expert-retouched ground-truth images. Source code is available at: https://github.com/S-Ir-V/color_crop.
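
A rough sketch of one step of such a pixel-wise retouching MDP, in which every pixel agent applies a discrete intensity action and the reward is the change in a pretrained aesthetic score; the action set, the reward definition, and the aesthetic_score callable are illustrative assumptions.

    import numpy as np

    ACTIONS = np.array([0.95, 1.00, 1.05])  # hypothetical per-pixel exposure adjustments

    def retouch_step(image, action_ids, aesthetic_score):
        # image: (H, W, 3) floats in [0, 1]; action_ids: (H, W) ints, one agent per pixel;
        # aesthetic_score: callable image -> float, standing in for a pretrained aesthetic model.
        next_image = np.clip(image * ACTIONS[action_ids][..., None], 0.0, 1.0)
        reward = aesthetic_score(next_image) - aesthetic_score(image)  # improvement drives the policy
        return next_image, reward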

Citations: 0
The bar-tip limit error in bar charts: Exploring its relationship to the within-the-bar bias.
IF 6.5 Pub Date : 2026-03-17 DOI: 10.1109/TVCG.2026.3675047
Daniel Reimann

The present study suggests that the bar-tip limit error (assuming the raw data are limited by the bar tip; assessed with one question) moderates the within-the-bar bias (rating values inside the bar as more likely than those outside the bar), and that this bias is resistant to reduction through explanation.

Citations: 0
TalkingEyes: Pluralistic Speech-Driven 3D Eye Gaze Animation.
IF 6.5 Pub Date : 2026-03-17 DOI: 10.1109/TVCG.2026.3674699
Yixiang Zhuang, Chunshan Ma, Yao Cheng, Xuan Cheng, Jing Liao, Juncong Lin

Although significant progress has been made in the field of speech-driven 3D facial animation recently, the speech-driven animation of an indispensable facial component, eye gaze, has been overlooked by recent research. This is primarily due to the weak correlation between speech and eye gaze, as well as the scarcity of audio-gaze data, making it very challenging to generate 3D eye gaze motion from speech alone. In this paper, we propose a novel data-driven method which can generate diverse 3D eye gaze motions in harmony with the speech. To achieve this, we first construct an audio-gaze dataset that contains about 14 hours of audio-mesh sequences featuring high-quality eye gaze motion, head motion and facial motion simultaneously. The motion data is acquired by performing lightweight eye gaze fitting and face reconstruction on videos from existing audio-visual datasets. We then tailor a novel speech-to-motion translation framework in which the head motions and eye gaze motions are jointly generated from speech but are modeled in two separate latent spaces. This design stems from the physiological knowledge that the rotation range of the eyeballs is smaller than that of the head. Through mapping the speech embedding into the two latent spaces, the difficulty of modeling the weak correlation between speech and non-verbal motion is thus attenuated. Finally, our TalkingEyes, integrated with a speech-driven 3D facial motion generator, can synthesize eye gaze motion, eye blinks, head motion and facial motion collectively from speech. Qualitative and quantitative evaluations, along with a perceptual user study, demonstrate the superiority of the proposed method in generating diverse and natural 3D eye gaze motions from speech. The project page of this paper is: https://lkjkjoiuiu.github.io/TalkingEyes_Home/.
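
A toy sketch of the two-latent-space idea: one speech embedding is mapped into separate head and eye-gaze latents, and the eye output is bounded to a smaller rotation range than the head output. The dimensions, tanh scaling, and rotation limits are assumptions, not the paper's design.

    import torch
    import torch.nn as nn

    class ToySpeechToHeadAndGaze(nn.Module):
        def __init__(self, speech_dim=256, latent_dim=64, max_head_deg=60.0, max_eye_deg=25.0):
            super().__init__()
            self.head_latent = nn.Sequential(nn.Linear(speech_dim, latent_dim), nn.ReLU())
            self.eye_latent = nn.Sequential(nn.Linear(speech_dim, latent_dim), nn.ReLU())
            self.head_out = nn.Linear(latent_dim, 3)  # head pitch / yaw / roll, in degrees
            self.eye_out = nn.Linear(latent_dim, 2)   # gaze pitch / yaw, in degrees
            self.max_head_deg, self.max_eye_deg = max_head_deg, max_eye_deg

        def forward(self, speech_embedding):
            head = torch.tanh(self.head_out(self.head_latent(speech_embedding))) * self.max_head_deg
            eye = torch.tanh(self.eye_out(self.eye_latent(speech_embedding))) * self.max_eye_deg
            return head, eye  # the eye range is deliberately tighter than the head range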

Citations: 0
DIQ-MPM: Dual Interface Quadrature MPM for Simulating Large Deformation and Fluid-Solid Coupling.
IF 6.5 Pub Date : 2026-03-16 DOI: 10.1109/TVCG.2026.3674656
Kangrui Zhang, Ruihong Cen, Siyan Zhu, Ruoyan Chen, Bo Ren

We present DIQ-MPM, a novel monolithic two-way coupling framework for simulating interactions between solids modeled with the total Lagrangian formulation and Eulerian incompressible fluids using the Material Point Method (MPM). Our approach combines an implicit TLMPM formulation with a mixed velocity-pressure scheme to robustly simulate compressible solids undergoing large deformations, while eliminating numerical fractures. To enable strong fluid-solid coupling without relying on overlapping grids, we introduce a Dual Interface Quadrature (DIQ) mechanism that maps fluid-solid interface information consistently between the current and reference configurations. This allows us to construct a unified sparse pressure-only system via Schur complement, leading to efficient and stable coupling. We also integrate a particle-based contact force model to resolve solid-solid and solid-boundary contacts within implicit TLMPM. Experimental results demonstrate that our method stably captures free-slip coupling, large deformation phenomena, and complex interactions between compressible solids and incompressible fluids.
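
As a generic illustration of the pressure-only reduction mentioned in the abstract (assumed notation, not the paper's exact discretization), a mixed velocity-pressure system can be reduced via the Schur complement:

    \begin{aligned}
    A v + G p &= f  &&\text{(momentum / elasticity terms)}\\
    G^{\top} v &= 0 &&\text{(incompressibility)}\\
    (G^{\top} A^{-1} G)\, p &= G^{\top} A^{-1} f &&\text{(pressure-only system)}
    \end{aligned}

Eliminating $v = A^{-1}(f - G p)$ gives the last line; solving it for $p$ and back-substituting recovers the velocities.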

Citations: 0
WonderTex: Consistent-and-Seamless Texture Generation with Text-Guided Multi-View Image Diffusion Models.
IF 6.5 Pub Date : 2026-03-16 DOI: 10.1109/TVCG.2026.3673926
Qi Xu, Xiao-Guang Han, Lei Zhang

Text-guided texture generation has developed rapidly with the proliferation of generative artificial intelligence for creating three-dimensional textured objects. However, existing text-guided texture generation methods often suffer from artifacts such as inconsistent visual appearance across different views, the Janus problem, and seams in texture maps. To address these issues, we propose a novel text-guided texture generation method, named WonderTex. It achieves the generation of high-quality, view-consistent, and seamless texture maps through a two-stage pipeline. Specifically, we fine-tune a Stable Diffusion model using a large dataset to obtain a multi-view image diffusion model capable of generating a 4-view grid. This model serves as the foundation for producing four consistent views and establishing the base texture in the first stage. Subsequently, an automatic view selection and inpainting strategy is employed to effectively fill and refine the texture maps in the second stage. Extensive experiments have shown that our method is effective and robust, capable of generating high-quality textures with various meshes and prompts, outperforming baseline methods in terms of texture details, view consistency, and other metrics.
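
A small utility sketch for handling the "4-view grid" output mentioned in the abstract: split one generated grid image into four per-view images. The 2x2 layout and the row-major view order are assumptions.

    import numpy as np

    def split_view_grid(grid_image):
        # grid_image: (H, W, 3) array holding a 2x2 grid of rendered views.
        h, w = grid_image.shape[0] // 2, grid_image.shape[1] // 2
        return [grid_image[i * h:(i + 1) * h, j * w:(j + 1) * w]
                for i in range(2) for j in range(2)]  # four views in row-major order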

Citations: 0
Penetration-free Solid-Fluid Interaction on Shells and Rods.
IF 6.5 Pub Date : 2026-03-13 DOI: 10.1109/TVCG.2026.3674041
Yuchen Sun, Jinyuan Liu, Yin Yang, Chenfanfu Jiang, Minchen Li, Bo Zhu

We introduce a novel approach to simulate the interaction between fluids and thin elastic solids without any penetration. Our approach is centered around an optimization system augmented with barriers, which aims to find a configuration that ensures the absence of penetration while enforcing incompressibility for the fluids and minimizing elastic potentials for the solids. Unlike previous methods that primarily focus on velocity coherence at the fluid-solid interfaces, we demonstrate the effectiveness and flexibility of explicitly resolving positional constraints, including both explicit representation of solid positions and the implicit representation of fluid level-set interface. To preserve the volume of the fluid, we propose a simple yet efficient approach that adjusts the associated level-set values. Additionally, we develop a distance metric capable of measuring the separation between an implicitly represented surface and a Lagrangian object of arbitrary codimension. By integrating the inertia, solid elastic potential, damping, barrier potential, and fluid incompressibility within a unified system, we are able to robustly simulate a wide range of processes involving fluid interactions with lower-dimensional objects such as shells and rods. These processes include topology changes, bouncing, splashing, sliding, rolling, floating, and more.
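
For intuition about the barrier term, here is a sketch of an IPC-style log barrier on a separation distance, active only below a threshold; the threshold value and this particular barrier form are assumptions and not necessarily the paper's formulation.

    import numpy as np

    def log_barrier(d, d_hat=1e-3):
        # d: separation distance (must stay > 0); d_hat: activation threshold.
        if d >= d_hat:
            return 0.0                                   # inactive when well separated
        return -((d - d_hat) ** 2) * np.log(d / d_hat)   # grows without bound as d -> 0, forbidding penetration

In a setup like the one described, such a barrier would be evaluated on the separation between Lagrangian solid samples and the implicitly represented fluid interface and added to the per-step optimization objective.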

Citations: 0