Hand-Object Interaction Pretraining from Videos

Himanshu Gaurav Singh, Antonio Loquercio, Carmelo Sferrazza, Jane Wu, Haozhi Qi, Pieter Abbeel, Jitendra Malik
arXiv - CS - Robotics · Published 2024-09-12 · DOI: https://doi.org/arxiv-2409.08273

Abstract

We present an approach to learn general robot manipulation priors from 3D hand-object interaction trajectories. We build a framework to use in-the-wild videos to generate sensorimotor robot trajectories. We do so by lifting both the human hand and the manipulated object in a shared 3D space and retargeting human motions to robot actions. Generative modeling on this data gives us a task-agnostic base policy. This policy captures a general yet flexible manipulation prior. We empirically demonstrate that finetuning this policy, with both reinforcement learning (RL) and behavior cloning (BC), enables sample-efficient adaptation to downstream tasks and simultaneously improves robustness and generalizability compared to prior approaches. Qualitative experiments are available at: https://hgaurav2k.github.io/hop/.
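The pretrain-then-finetune recipe the abstract describes can be illustrated with a minimal toy sketch. This is not the paper's implementation: the linear policy, the synthetic state-action pairs standing in for retargeted hand-object trajectories, and all dimensions below are hypothetical, chosen only to show why initializing from a base policy fit on abundant data adapts to a small downstream dataset faster than training from scratch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for video-derived data: states (e.g. lifted 3D
# hand/object poses) paired with retargeted robot actions.
STATE_DIM, ACTION_DIM = 6, 3
W_true = rng.normal(size=(STATE_DIM, ACTION_DIM))

def make_trajectories(n):
    states = rng.normal(size=(n, STATE_DIM))
    actions = states @ W_true + 0.01 * rng.normal(size=(n, ACTION_DIM))
    return states, actions

def fit_policy(states, actions, W_init=None, lr=0.1, steps=500):
    # Behavior cloning as gradient descent on mean-squared action error.
    W = np.zeros((STATE_DIM, ACTION_DIM)) if W_init is None else W_init.copy()
    for _ in range(steps):
        grad = states.T @ (states @ W - actions) / len(states)
        W -= lr * grad
    return W

def eval_policy(W):
    states, actions = make_trajectories(500)
    return float(np.mean((states @ W - actions) ** 2))

# 1) Pretrain a task-agnostic base policy on abundant video-derived data.
S_pre, A_pre = make_trajectories(2000)
W_base = fit_policy(S_pre, A_pre)

# 2) Adapt to a small downstream-task dataset with a short budget; the
#    pretrained initialization starts near the solution, scratch does not.
S_task, A_task = make_trajectories(20)
W_scratch = fit_policy(S_task, A_task, steps=20)
W_finetuned = fit_policy(S_task, A_task, W_init=W_base, steps=20)

print(eval_policy(W_finetuned) < eval_policy(W_scratch))
```

With the same 20 task samples and the same 20 gradient steps, the finetuned policy reaches a far lower test error than the from-scratch one, mirroring the sample-efficient adaptation the abstract claims for the learned manipulation prior.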