MEgATrack: monochrome egocentric articulated hand-tracking for virtual reality

Shangchen Han, Beibei Liu, Randi Cabezas, Christopher D. Twigg, Peizhao Zhang, Jeff Petkau, Tsz-Ho Yu, Chun-Jung Tai, Muzaffer Akbay, Z. Wang, Asaf Nitzan, Gang Dong, Yuting Ye, Lingling Tao, Chengde Wan, Robert Wang
{"title":"MEgATrack: monochrome egocentric articulated hand-tracking for virtual reality","authors":"Shangchen Han, Beibei Liu, Randi Cabezas, Christopher D. Twigg, Peizhao Zhang, Jeff Petkau, Tsz-Ho Yu, Chun-Jung Tai, Muzaffer Akbay, Z. Wang, Asaf Nitzan, Gang Dong, Yuting Ye, Lingling Tao, Chengde Wan, Robert Wang","doi":"10.1145/3386569.3392452","DOIUrl":null,"url":null,"abstract":"We present a system for real-time hand-tracking to drive virtual and augmented reality (VR/AR) experiences. Using four fisheye monochrome cameras, our system generates accurate and low-jitter 3D hand motion across a large working volume for a diverse set of users. We achieve this by proposing neural network architectures for detecting hands and estimating hand keypoint locations. Our hand detection network robustly handles a variety of real world environments. The keypoint estimation network leverages tracking history to produce spatially and temporally consistent poses. We design scalable, semi-automated mechanisms to collect a large and diverse set of ground truth data using a combination of manual annotation and automated tracking. Additionally, we introduce a detection-by-tracking method that increases smoothness while reducing the computational cost; the optimized system runs at 60Hz on PC and 30Hz on a mobile processor. Together, these contributions yield a practical system for capturing a user’s hands and is the default feature on the Oculus Quest VR headset powering input and social presence.","PeriodicalId":7121,"journal":{"name":"ACM Trans. Graph.","volume":"27 1","pages":"87"},"PeriodicalIF":0.0000,"publicationDate":"2020-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"42","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Trans. Graph.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3386569.3392452","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 42

Abstract

We present a system for real-time hand-tracking to drive virtual and augmented reality (VR/AR) experiences. Using four fisheye monochrome cameras, our system generates accurate and low-jitter 3D hand motion across a large working volume for a diverse set of users. We achieve this by proposing neural network architectures for detecting hands and estimating hand keypoint locations. Our hand detection network robustly handles a variety of real-world environments. The keypoint estimation network leverages tracking history to produce spatially and temporally consistent poses. We design scalable, semi-automated mechanisms to collect a large and diverse set of ground truth data using a combination of manual annotation and automated tracking. Additionally, we introduce a detection-by-tracking method that increases smoothness while reducing the computational cost; the optimized system runs at 60Hz on PC and 30Hz on a mobile processor. Together, these contributions yield a practical system for capturing a user's hands, and it is the default feature on the Oculus Quest VR headset, powering input and social presence.
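The detection-by-tracking idea mentioned in the abstract can be illustrated with a minimal sketch: a detector is run only when no hand is currently tracked, while on subsequent frames the previous frame's pose supplies the crop for the keypoint network, whose output both drives the pose estimate and re-seeds tracking. This is a hedged illustration under assumed interfaces; the class and function names (HandDetector, KeypointNet, predict_crop, HandState) are hypothetical placeholders, not the paper's implementation.

```python
# Illustrative detection-by-tracking loop. All names and interfaces here are
# hypothetical stand-ins, not the MEgATrack implementation.
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple

Crop = Tuple[int, int, int, int]  # x, y, width, height in image coordinates


@dataclass
class HandState:
    keypoints_3d: List[Tuple[float, float, float]]  # e.g. 21 hand joints
    crop: Crop                                       # region used for this estimate


class HandDetector:
    """Stand-in for the hand-detection network: proposes a bounding box per hand."""

    def detect(self, frame) -> Optional[Crop]:
        raise NotImplementedError


class KeypointNet:
    """Stand-in for the keypoint network: regresses joints from a hand crop,
    optionally conditioned on the previous state for temporal consistency."""

    def estimate(self, frame, crop: Crop, prev: Optional[HandState]) -> HandState:
        raise NotImplementedError


def predict_crop(prev: HandState) -> Crop:
    """Reuse (or extrapolate) the previous crop so the detector can be skipped."""
    return prev.crop


def track(frames: Iterable, detector: HandDetector, net: KeypointNet) -> Iterator[HandState]:
    """Detection-by-tracking: run the detector only when tracking is lost, which
    cuts per-frame compute and avoids jitter from re-detecting every frame."""
    state: Optional[HandState] = None
    for frame in frames:
        crop = predict_crop(state) if state is not None else detector.detect(frame)
        if crop is None:
            state = None  # no hand visible; fall back to detection next frame
            continue
        state = net.estimate(frame, crop, state)
        yield state
```

In this sketch the expensive detector runs only on frames where tracking has been lost, which matches the stated motivation of increasing smoothness while reducing computational cost; how crops are actually predicted and how multiple hands and cameras are handled is left to the paper.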