ACDER: Augmented Curiosity-Driven Experience Replay

2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 4218-4224. Pub Date: 2020-05-01. DOI: 10.1109/ICRA40945.2020.9197421

Boyao Li, Tao Lu, Jiayi Li, N. Lu, Yinghao Cai, Shuo Wang

Citations: 11

Abstract

Exploration in environments with sparse feedback remains a challenging research problem in reinforcement learning (RL). When an RL agent explores the environment randomly, exploration efficiency is low, especially in robotic manipulation tasks with high-dimensional continuous state and action spaces. In this paper, we propose a novel method, called Augmented Curiosity-Driven Experience Replay (ACDER), which leverages (i) a new goal-oriented curiosity-driven exploration to encourage the agent to pursue novel and task-relevant states more purposefully and (ii) dynamic initial-state selection as an automatic exploratory curriculum to further improve sample efficiency. Our approach complements Hindsight Experience Replay (HER) by introducing a new way to pursue valuable states. Experiments are conducted on four challenging robotic manipulation tasks with binary rewards: Reach, Push, Pick&Place and Multi-step Push. The empirical results show that our proposed method significantly outperforms existing methods on the first three basic tasks and also achieves satisfactory performance in multi-step robotic task learning.
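For readers unfamiliar with the setting, the following is a minimal sketch of hindsight goal relabeling under a binary reward, the mechanism that HER contributes and that ACDER is said to complement. It is not the authors' implementation and does not include the curiosity bonus or the initial-state curriculum, whose details the abstract does not give. The function names, the transition dictionary layout, the 0/-1 reward convention, and the tolerance value are all assumptions made for illustration.

```python
import math

def binary_reward(achieved, goal, tol=0.05):
    """Assumed sparse reward: 0.0 when within tol of the goal, otherwise -1.0."""
    return 0.0 if math.dist(achieved, goal) <= tol else -1.0

def relabel_with_final_goal(episode, tol=0.05):
    """Hindsight relabeling ('final' strategy): treat the last achieved
    position of the episode as if it had been the goal all along."""
    new_goal = episode[-1]["achieved_next"]
    relabeled = []
    for step in episode:
        relabeled.append({
            "obs": step["obs"],
            "action": step["action"],
            "goal": new_goal,
            "reward": binary_reward(step["achieved_next"], new_goal, tol),
        })
    return relabeled

# Hypothetical 2-D episode: the end effector moves right but never reaches (1.0, 0.0).
episode = [
    {"obs": (0.0, 0.0), "action": (0.1, 0.0), "achieved_next": (0.1, 0.0)},
    {"obs": (0.1, 0.0), "action": (0.1, 0.0), "achieved_next": (0.2, 0.0)},
    {"obs": (0.2, 0.0), "action": (0.1, 0.0), "achieved_next": (0.3, 0.0)},
]
original_goal = (1.0, 0.0)
original_rewards = [binary_reward(s["achieved_next"], original_goal) for s in episode]
hindsight_rewards = [s["reward"] for s in relabel_with_final_goal(episode)]
```

Under the original goal every step is a failure, so the agent receives no learning signal; after relabeling, the final step counts as a success. This is why sparse-reward methods of this family can learn from episodes that never reach the intended goal.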
