FovealNet: Advancing AI-Driven Gaze Tracking Solutions for Efficient Foveated Rendering in Virtual Reality

IEEE Transactions on Visualization and Computer Graphics (IF 6.5) | Pub Date: 2025-03-11 | DOI: 10.1109/TVCG.2025.3549577

Wenxuan Liu; Budmonde Duinkharjav; Qi Sun; Sai Qian Zhang

Vol. 31, No. 5, pp. 3183-3193. Journal Article. https://ieeexplore.ieee.org/document/10918846/


Abstract

Leveraging real-time eye tracking, foveated rendering optimizes hardware efficiency and enhances visual quality in virtual reality (VR). This approach uses eye tracking to determine where the user is looking, allowing the system to render high-resolution graphics only in the foveal region (the small area of the retina where visual acuity is highest), while the peripheral view is rendered at lower resolution. However, modern deep learning-based gaze-tracking solutions often exhibit a long-tail distribution of tracking errors, which can degrade user experience and reduce the benefits of foveated rendering by causing misalignment and decreased visual quality. This paper introduces FovealNet, an advanced AI-driven gaze tracking framework designed to optimize system performance by strategically enhancing gaze tracking accuracy. To further reduce the implementation cost of the gaze tracking algorithm, FovealNet employs an event-based cropping method that eliminates over 64.8% of irrelevant pixels from the input image. Additionally, it incorporates a simple yet effective token-pruning strategy that removes tokens on the fly without compromising tracking accuracy. Finally, to support different runtime rendering configurations, we propose a system performance-aware multi-resolution training strategy, allowing the gaze tracking DNN to adapt and optimize overall system performance more effectively. Evaluation results demonstrate that FovealNet achieves at least a 1.42× speedup compared to previous methods and a 13% increase in perceptual quality for foveated output. The code is available at https://github.com/wl3181/FovealNet.
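To make the token-pruning idea in the abstract more concrete, here is a minimal Python sketch of one generic form of it: keep the highest-scoring fraction of tokens and discard the rest. This is not taken from the FovealNet code. The function name `prune_tokens`, the `keep_ratio` parameter, and the assumption that each token already has a scalar importance score are all hypothetical; the abstract does not say how FovealNet scores or selects tokens.

```python
import math
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def prune_tokens(
    tokens: Sequence[T],
    scores: Sequence[float],
    keep_ratio: float = 0.5,
) -> Tuple[List[T], List[int]]:
    """Keep the top `keep_ratio` fraction of tokens by score (hypothetical rule).

    Returns the surviving tokens in their original order, plus their indices,
    so that positional information can still be recovered downstream.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    # Always keep at least one token when the input is non-empty.
    n_keep = max(1, math.ceil(len(tokens) * keep_ratio))

    # Rank token indices from most to least important.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)

    # Restore the original ordering among the kept tokens.
    kept = sorted(ranked[:n_keep])
    return [tokens[i] for i in kept], kept


if __name__ == "__main__":
    patches = ["t0", "t1", "t2", "t3"]
    importance = [0.1, 0.9, 0.4, 0.7]  # assumed per-token scores
    print(prune_tokens(patches, importance, keep_ratio=0.5))
```

In a transformer-based gaze tracker, fewer tokens means less attention and MLP work in later layers, which is the kind of saving the abstract refers to. The ratio and the scoring signal would have to be chosen so that tracking accuracy is preserved.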



