HiSC4D: Human-centered interaction and 4D Scene Capture in Large-scale Space Using Wearable IMUs and LiDAR

Yudi Dai, Zhiyong Wang, Xiping Lin, Chenglu Wen, Lan Xu, Siqi Shen, Yuexin Ma, Cheng Wang
{"title":"HiSC4D: Human-centered interaction and 4D Scene Capture in Large-scale Space Using Wearable IMUs and LiDAR","authors":"Yudi Dai, Zhiyong Wang, Xiping Lin, Chenglu Wen, Lan Xu, Siqi Shen, Yuexin Ma, Cheng Wang","doi":"arxiv-2409.04398","DOIUrl":null,"url":null,"abstract":"We introduce HiSC4D, a novel Human-centered interaction and 4D Scene Capture\nmethod, aimed at accurately and efficiently creating a dynamic digital world,\ncontaining large-scale indoor-outdoor scenes, diverse human motions, rich\nhuman-human interactions, and human-environment interactions. By utilizing\nbody-mounted IMUs and a head-mounted LiDAR, HiSC4D can capture egocentric human\nmotions in unconstrained space without the need for external devices and\npre-built maps. This affords great flexibility and accessibility for\nhuman-centered interaction and 4D scene capturing in various environments.\nTaking into account that IMUs can capture human spatially unrestricted poses\nbut are prone to drifting for long-period using, and while LiDAR is stable for\nglobal localization but rough for local positions and orientations, HiSC4D\nemploys a joint optimization method, harmonizing all sensors and utilizing\nenvironment cues, yielding promising results for long-term capture in large\nscenes. To promote research of egocentric human interaction in large scenes and\nfacilitate downstream tasks, we also present a dataset, containing 8 sequences\nin 4 large scenes (200 to 5,000 $m^2$), providing 36k frames of accurate 4D\nhuman motions with SMPL annotations and dynamic scenes, 31k frames of cropped\nhuman point clouds, and scene mesh of the environment. A variety of scenarios,\nsuch as the basketball gym and commercial street, alongside challenging human\nmotions, such as daily greeting, one-on-one basketball playing, and tour\nguiding, demonstrate the effectiveness and the generalization ability of\nHiSC4D. The dataset and code will be publicated on\nwww.lidarhumanmotion.net/hisc4d available for research purposes.","PeriodicalId":501480,"journal":{"name":"arXiv - CS - Multimedia","volume":"34 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.04398","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

We introduce HiSC4D, a novel Human-centered interaction and 4D Scene Capture method aimed at accurately and efficiently creating a dynamic digital world containing large-scale indoor-outdoor scenes, diverse human motions, rich human-human interactions, and human-environment interactions. By utilizing body-mounted IMUs and a head-mounted LiDAR, HiSC4D can capture egocentric human motions in unconstrained space without the need for external devices or pre-built maps, affording great flexibility and accessibility for human-centered interaction and 4D scene capture in various environments. Because IMUs capture spatially unrestricted human poses but are prone to drift during long-term use, while LiDAR provides stable global localization but only coarse local positions and orientations, HiSC4D employs a joint optimization method that harmonizes all sensors and exploits environment cues, yielding promising results for long-term capture in large scenes. To promote research on egocentric human interaction in large scenes and facilitate downstream tasks, we also present a dataset containing 8 sequences in 4 large scenes (200 to 5,000 $m^2$), providing 36k frames of accurate 4D human motions with SMPL annotations and dynamic scenes, 31k frames of cropped human point clouds, and scene meshes of the environments. A variety of scenarios, such as a basketball gym and a commercial street, alongside challenging human motions, such as daily greetings, one-on-one basketball playing, and tour guiding, demonstrate the effectiveness and generalization ability of HiSC4D. The dataset and code will be made publicly available at www.lidarhumanmotion.net/hisc4d for research purposes.
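To make the sensor-fusion idea concrete, the following is a minimal sketch, not the authors' implementation: it assumes a toy setting with synthetic data in which locally accurate but drifting frame-to-frame IMU translation estimates are fused with sparse, globally stable LiDAR position fixes by solving one least-squares problem over the whole trajectory. All variable names, weights, and data here are hypothetical illustrations of the joint-optimization principle described in the abstract.

```python
# Illustrative sketch only (hypothetical data and weights), showing how
# drifting IMU odometry and sparse LiDAR localization can be harmonized
# in a single joint optimization over a trajectory.
import numpy as np
from scipy.optimize import least_squares

T = 100                                  # number of frames in the toy trajectory
rng = np.random.default_rng(0)
gt = np.cumsum(rng.normal(0, 0.05, (T, 3)), axis=0)                 # "ground-truth" 3D path

# IMU: accurate frame-to-frame motion, but integrating it drifts over time.
imu_deltas = np.diff(gt, axis=0) + rng.normal(0, 0.01, (T - 1, 3))
# LiDAR: sparse global position fixes (every 10th frame), stable but locally noisy.
lidar_idx = np.arange(0, T, 10)
lidar_pos = gt[lidar_idx] + rng.normal(0, 0.03, (len(lidar_idx), 3))

w_imu, w_lidar = 10.0, 1.0               # hypothetical relative confidence in each sensor

def residuals(x):
    traj = x.reshape(T, 3)
    # Consecutive frames should move as the IMU measured.
    r_imu = w_imu * (np.diff(traj, axis=0) - imu_deltas)
    # Anchored frames should agree with LiDAR global localization.
    r_lidar = w_lidar * (traj[lidar_idx] - lidar_pos)
    return np.concatenate([r_imu.ravel(), r_lidar.ravel()])

# Initialize by dead-reckoning the IMU from the first LiDAR fix.
x0 = np.vstack([lidar_pos[0], lidar_pos[0] + np.cumsum(imu_deltas, axis=0)])
sol = least_squares(residuals, x0.ravel())
fused = sol.x.reshape(T, 3)

print("IMU-only max error (m):", np.linalg.norm(x0 - gt, axis=1).max())
print("fused    max error (m):", np.linalg.norm(fused - gt, axis=1).max())
```

In this toy setup the fused trajectory tracks the ground truth much more closely than IMU dead-reckoning alone, which is the qualitative behavior the abstract claims for long-term capture in large scenes; the actual HiSC4D optimization additionally handles orientations, SMPL body poses, and environment cues.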