MEgATrack: monochrome egocentric articulated hand-tracking for virtual reality

ACM Trans. Graph. Pub Date : 2020-07-08 DOI:10.1145/3386569.3392452

Shangchen Han, Beibei Liu, Randi Cabezas, Christopher D. Twigg, Peizhao Zhang, Jeff Petkau, Tsz-Ho Yu, Chun-Jung Tai, Muzaffer Akbay, Z. Wang, Asaf Nitzan, Gang Dong, Yuting Ye, Lingling Tao, Chengde Wan, Robert Wang


Citations: 42

Abstract

We present a system for real-time hand-tracking to drive virtual and augmented reality (VR/AR) experiences. Using four fisheye monochrome cameras, our system generates accurate and low-jitter 3D hand motion across a large working volume for a diverse set of users. We achieve this by proposing neural network architectures for detecting hands and estimating hand keypoint locations. Our hand detection network robustly handles a variety of real-world environments. The keypoint estimation network leverages tracking history to produce spatially and temporally consistent poses. We design scalable, semi-automated mechanisms to collect a large and diverse set of ground truth data using a combination of manual annotation and automated tracking. Additionally, we introduce a detection-by-tracking method that increases smoothness while reducing the computational cost; the optimized system runs at 60 Hz on PC and 30 Hz on a mobile processor. Together, these contributions yield a practical system for capturing a user’s hands; it is the default hand-tracking feature on the Oculus Quest VR headset, powering input and social presence.
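The abstract does not spell out how detection-by-tracking works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the crop region for the current frame can be derived from the keypoints tracked in the previous frame, so the detector runs only when no hand is currently tracked. The names `box_from_keypoints`, `track`, `detect`, and `estimate`, and the `margin` parameter, are hypothetical and introduced purely for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Keypoints = List[Point]


def box_from_keypoints(keypoints: Sequence[Point], margin: float = 0.2) -> Box:
    """Return the bounding box of the keypoints, padded by a relative margin."""
    xs = [p[0] for p in keypoints]
    ys = [p[1] for p in keypoints]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    pad_x = margin * (x_max - x_min)
    pad_y = margin * (y_max - y_min)
    return (x_min - pad_x, y_min - pad_y, x_max + pad_x, y_max + pad_y)


def track(
    frames: Sequence[object],
    detect: Callable[[object], Optional[Box]],
    estimate: Callable[[object, Box], Optional[Keypoints]],
) -> Tuple[List[Optional[Keypoints]], int]:
    """Track one hand over a sequence of frames.

    `detect` and `estimate` are hypothetical stand-ins for a hand detector
    and a keypoint estimator. Returns the per-frame keypoints (None where
    no hand was found) and the number of detector invocations.
    """
    results: List[Optional[Keypoints]] = []
    detector_calls = 0
    previous: Optional[Keypoints] = None
    for frame in frames:
        if previous is not None:
            # Hand was tracked in the last frame: reuse its pose for the crop.
            box: Optional[Box] = box_from_keypoints(previous)
        else:
            # No tracking history: fall back to the (more expensive) detector.
            box = detect(frame)
            detector_calls += 1
        previous = estimate(frame, box) if box is not None else None
        results.append(previous)
    return results, detector_calls
```

In this sketch the detector cost is paid only on the first frame and after tracking is lost, which is one plausible way a tracking-driven crop could reduce computation relative to running detection on every frame.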
