SparsePoser:基于稀疏数据的实时全身运动重建

IF 7.8 1区 计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING ACM Transactions on Graphics Pub Date : 2023-10-31 DOI:10.1145/3625264
Jose Luis Ponton, Haoran Yun, Andreas Aristidou, Carlos Andujar, Nuria Pelechano
{"title":"SparsePoser:基于稀疏数据的实时全身运动重建","authors":"Jose Luis Ponton, Haoran Yun, Andreas Aristidou, Carlos Andujar, Nuria Pelechano","doi":"10.1145/3625264","DOIUrl":null,"url":null,"abstract":"Accurate and reliable human motion reconstruction is crucial for creating natural interactions of full-body avatars in Virtual Reality (VR) and entertainment applications. As the Metaverse and social applications gain popularity, users are seeking cost-effective solutions to create full-body animations that are comparable in quality to those produced by commercial motion capture systems. In order to provide affordable solutions though, it is important to minimize the number of sensors attached to the subject’s body. Unfortunately, reconstructing the full-body pose from sparse data is a heavily under-determined problem. Some studies that use IMU sensors face challenges in reconstructing the pose due to positional drift and ambiguity of the poses. In recent years, some mainstream VR systems have released 6-degree-of-freedom (6-DoF) tracking devices providing positional and rotational information. Nevertheless, most solutions for reconstructing full-body poses rely on traditional inverse kinematics (IK) solutions, which often produce non-continuous and unnatural poses. In this article, we introduce SparsePoser, a novel deep learning-based solution for reconstructing a full-body pose from a reduced set of six tracking devices. Our system incorporates a convolutional-based autoencoder that synthesizes high-quality continuous human poses by learning the human motion manifold from motion capture data. Then, we employ a learned IK component, made of multiple lightweight feed-forward neural networks, to adjust the hands and feet toward the corresponding trackers. We extensively evaluate our method on publicly available motion capture datasets and with real-time live demos. We show that our method outperforms state-of-the-art techniques using IMU sensors or 6-DoF tracking devices, and can be used for users with different body dimensions and proportions.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"198 4","pages":"0"},"PeriodicalIF":7.8000,"publicationDate":"2023-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SparsePoser: Real-time Full-body Motion Reconstruction from Sparse Data\",\"authors\":\"Jose Luis Ponton, Haoran Yun, Andreas Aristidou, Carlos Andujar, Nuria Pelechano\",\"doi\":\"10.1145/3625264\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Accurate and reliable human motion reconstruction is crucial for creating natural interactions of full-body avatars in Virtual Reality (VR) and entertainment applications. As the Metaverse and social applications gain popularity, users are seeking cost-effective solutions to create full-body animations that are comparable in quality to those produced by commercial motion capture systems. In order to provide affordable solutions though, it is important to minimize the number of sensors attached to the subject’s body. Unfortunately, reconstructing the full-body pose from sparse data is a heavily under-determined problem. Some studies that use IMU sensors face challenges in reconstructing the pose due to positional drift and ambiguity of the poses. In recent years, some mainstream VR systems have released 6-degree-of-freedom (6-DoF) tracking devices providing positional and rotational information. Nevertheless, most solutions for reconstructing full-body poses rely on traditional inverse kinematics (IK) solutions, which often produce non-continuous and unnatural poses. In this article, we introduce SparsePoser, a novel deep learning-based solution for reconstructing a full-body pose from a reduced set of six tracking devices. Our system incorporates a convolutional-based autoencoder that synthesizes high-quality continuous human poses by learning the human motion manifold from motion capture data. Then, we employ a learned IK component, made of multiple lightweight feed-forward neural networks, to adjust the hands and feet toward the corresponding trackers. We extensively evaluate our method on publicly available motion capture datasets and with real-time live demos. We show that our method outperforms state-of-the-art techniques using IMU sensors or 6-DoF tracking devices, and can be used for users with different body dimensions and proportions.\",\"PeriodicalId\":50913,\"journal\":{\"name\":\"ACM Transactions on Graphics\",\"volume\":\"198 4\",\"pages\":\"0\"},\"PeriodicalIF\":7.8000,\"publicationDate\":\"2023-10-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Graphics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3625264\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Graphics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3625264","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0

摘要

在虚拟现实(VR)和娱乐应用中,准确可靠的人体运动重建对于创造全身化身的自然交互至关重要。随着Metaverse和社交应用程序的普及,用户正在寻求具有成本效益的解决方案来创建与商业动作捕捉系统制作的质量相当的全身动画。然而,为了提供经济实惠的解决方案,重要的是要尽量减少附着在受试者身体上的传感器数量。不幸的是,从稀疏数据中重建全身姿态是一个严重不确定的问题。由于姿态的位置漂移和模糊性,一些使用IMU传感器的研究在重建姿态时面临挑战。近年来,一些主流的VR系统已经发布了6自由度(6-DoF)跟踪设备,提供位置和旋转信息。然而,大多数重建全身姿态的解决方案依赖于传统的逆运动学(IK)解决方案,这往往产生不连续和不自然的姿态。在本文中,我们介绍了SparsePoser,这是一种基于深度学习的新型解决方案,用于从六个跟踪设备的简化集中重建全身姿势。我们的系统集成了一个基于卷积的自编码器,通过从动作捕捉数据中学习人体运动流形来合成高质量的连续人体姿势。然后,我们使用由多个轻量级前馈神经网络组成的学习IK组件来调整手和脚朝向相应的跟踪器。我们在公开可用的动作捕捉数据集和实时现场演示上广泛评估我们的方法。我们表明,我们的方法优于使用IMU传感器或6自由度跟踪设备的最先进技术,并且可以用于不同身体尺寸和比例的用户。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
SparsePoser: Real-time Full-body Motion Reconstruction from Sparse Data
Accurate and reliable human motion reconstruction is crucial for creating natural interactions of full-body avatars in Virtual Reality (VR) and entertainment applications. As the Metaverse and social applications gain popularity, users are seeking cost-effective solutions to create full-body animations that are comparable in quality to those produced by commercial motion capture systems. In order to provide affordable solutions though, it is important to minimize the number of sensors attached to the subject’s body. Unfortunately, reconstructing the full-body pose from sparse data is a heavily under-determined problem. Some studies that use IMU sensors face challenges in reconstructing the pose due to positional drift and ambiguity of the poses. In recent years, some mainstream VR systems have released 6-degree-of-freedom (6-DoF) tracking devices providing positional and rotational information. Nevertheless, most solutions for reconstructing full-body poses rely on traditional inverse kinematics (IK) solutions, which often produce non-continuous and unnatural poses. In this article, we introduce SparsePoser, a novel deep learning-based solution for reconstructing a full-body pose from a reduced set of six tracking devices. Our system incorporates a convolutional-based autoencoder that synthesizes high-quality continuous human poses by learning the human motion manifold from motion capture data. Then, we employ a learned IK component, made of multiple lightweight feed-forward neural networks, to adjust the hands and feet toward the corresponding trackers. We extensively evaluate our method on publicly available motion capture datasets and with real-time live demos. We show that our method outperforms state-of-the-art techniques using IMU sensors or 6-DoF tracking devices, and can be used for users with different body dimensions and proportions.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
ACM Transactions on Graphics
ACM Transactions on Graphics 工程技术-计算机:软件工程
CiteScore
14.30
自引率
25.80%
发文量
193
审稿时长
12 months
期刊介绍: ACM Transactions on Graphics (TOG) is a peer-reviewed scientific journal that aims to disseminate the latest findings of note in the field of computer graphics. It has been published since 1982 by the Association for Computing Machinery. Starting in 2003, all papers accepted for presentation at the annual SIGGRAPH conference are printed in a special summer issue of the journal.
期刊最新文献
Direct Manipulation of Procedural Implicit Surfaces 3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting Quark: Real-time, High-resolution, and General Neural View Synthesis Differentiable Owen Scrambling ELMO: Enhanced Real-time LiDAR Motion Capture through Upsampling
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1