Residual Skill Policies: Learning an Adaptable Skill-based Action Space for Reinforcement Learning for Robotics

Krishan Rana, Ming Xu, Brendan Tidd, Michael Milford, N. Sunderhauf

Conference on Robot Learning. Published 2022-11-04. DOI: 10.48550/arXiv.2211.02231 (https://doi.org/10.48550/arXiv.2211.02231)
Citations: 6

Abstract

Skill-based reinforcement learning (RL) has emerged as a promising strategy to leverage prior knowledge for accelerated robot learning. Skills are typically extracted from expert demonstrations and embedded into a latent space from which a high-level RL agent can sample them as actions. However, this skill space is expansive, and not all skills are relevant for a given robot state, which makes exploration difficult. Furthermore, the downstream RL agent is limited to learning tasks that are structurally similar to those used to construct the skill space. First, we propose accelerating exploration in the skill space using state-conditioned generative models that directly bias the high-level agent towards sampling only those skills relevant to a given state, based on prior experience. Next, we propose a low-level residual policy for fine-grained skill adaptation, enabling downstream RL agents to adapt to unseen task variations. Finally, we validate our approach across four challenging manipulation tasks that differ from those used to build the skill space, demonstrating our ability to learn across task variations while significantly accelerating exploration and outperforming prior work. Code and videos are available on our project website: https://krishanrana.github.io/reskill.
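
The two ideas in the abstract, a state-conditioned skill prior and a low-level residual correction, can be pictured with a small toy sketch. This is not the authors' implementation: every function name, dimension, weight matrix, and the `max_residual` bound below is a hypothetical stand-in chosen for illustration, and the "networks" are fixed random linear maps rather than trained models. It only shows how the pieces might compose: the high-level agent supplies noise, a state-conditioned prior turns it into a skill, a decoder turns the skill into a base action, and a bounded residual adjusts that action.

```python
import math
import random

# Hypothetical dimensions, not taken from the paper.
STATE_DIM, SKILL_DIM, ACTION_DIM = 4, 2, 3
_rng = random.Random(0)


def _matrix(rows, cols):
    # Fixed random weights standing in for trained networks (assumption).
    return [[_rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


def _matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


W_PRIOR = _matrix(SKILL_DIM, STATE_DIM)
W_DECODER = _matrix(ACTION_DIM, STATE_DIM + SKILL_DIM)
W_RESIDUAL = _matrix(ACTION_DIM, STATE_DIM + SKILL_DIM + ACTION_DIM)


def sample_skill(state, noise, scale=0.1):
    """State-conditioned prior: keep sampled skills near a state-dependent mean."""
    mean = _matvec(W_PRIOR, state)
    return [m + scale * n for m, n in zip(mean, noise)]


def decode_skill(state, skill):
    """Skill decoder: base action for the current state and chosen skill."""
    return [math.tanh(a) for a in _matvec(W_DECODER, state + skill)]


def residual_action(state, skill, base, max_residual=0.2):
    """Residual policy: a small, bounded correction to the base action."""
    pre = _matvec(W_RESIDUAL, state + skill + base)
    return [max_residual * math.tanh(a) for a in pre]


def act(state, noise):
    """Compose prior, decoder, and residual into one clipped action."""
    skill = sample_skill(state, noise)
    base = decode_skill(state, skill)
    delta = residual_action(state, skill, base)
    action = [max(-1.0, min(1.0, b + d)) for b, d in zip(base, delta)]
    return action, base, delta


if __name__ == "__main__":
    state = [0.1, -0.3, 0.5, 0.0]
    noise = [0.5, -0.5]  # in an RL setting the high-level agent would choose this
    action, base, delta = act(state, noise)
    print("base:", base)
    print("residual:", delta)
    print("action:", action)
```

In this sketch the residual is bounded so that it can only refine, not replace, the skill's base action; whether and how the original method bounds its residual is not stated in the abstract.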