Fast Adaptation of Deep Reinforcement Learning-Based Navigation Skills to Human Preference

Jinyoung Choi, C. Dance, Jung-Eun Kim, Kyungsik Park, Jaehun Han, Joonho Seo, Minsu Kim
{"title":"基于深度强化学习的导航技能对人类偏好的快速适应","authors":"Jinyoung Choi, C. Dance, Jung-Eun Kim, Kyungsik Park, Jaehun Han, Joonho Seo, Minsu Kim","doi":"10.1109/ICRA40945.2020.9197159","DOIUrl":null,"url":null,"abstract":"Deep reinforcement learning (RL) is being actively studied for robot navigation due to its promise of superior performance and robustness. However, most existing deep RL navigation agents are trained using fixed parameters, such as maximum velocities and weightings of reward components. Since the optimal choice of parameters depends on the use-case, it can be difficult to deploy such existing methods in a variety of real-world service scenarios. In this paper, we propose a novel deep RL navigation method that can adapt its policy to a wide range of parameters and reward functions without expensive retraining. Additionally, we explore a Bayesian deep learning method to optimize these parameters that requires only a small amount of preference data. We empirically show that our method can learn diverse navigation skills and quickly adapt its policy to a given performance metric or to human preference. We also demonstrate our method in real-world scenarios.","PeriodicalId":6859,"journal":{"name":"2020 IEEE International Conference on Robotics and Automation (ICRA)","volume":"30 1","pages":"3363-3370"},"PeriodicalIF":0.0000,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"Fast Adaptation of Deep Reinforcement Learning-Based Navigation Skills to Human Preference\",\"authors\":\"Jinyoung Choi, C. Dance, Jung-Eun Kim, Kyungsik Park, Jaehun Han, Joonho Seo, Minsu Kim\",\"doi\":\"10.1109/ICRA40945.2020.9197159\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep reinforcement learning (RL) is being actively studied for robot navigation due to its promise of superior performance and robustness. However, most existing deep RL navigation agents are trained using fixed parameters, such as maximum velocities and weightings of reward components. Since the optimal choice of parameters depends on the use-case, it can be difficult to deploy such existing methods in a variety of real-world service scenarios. In this paper, we propose a novel deep RL navigation method that can adapt its policy to a wide range of parameters and reward functions without expensive retraining. Additionally, we explore a Bayesian deep learning method to optimize these parameters that requires only a small amount of preference data. We empirically show that our method can learn diverse navigation skills and quickly adapt its policy to a given performance metric or to human preference. 
We also demonstrate our method in real-world scenarios.\",\"PeriodicalId\":6859,\"journal\":{\"name\":\"2020 IEEE International Conference on Robotics and Automation (ICRA)\",\"volume\":\"30 1\",\"pages\":\"3363-3370\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE International Conference on Robotics and Automation (ICRA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICRA40945.2020.9197159\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRA40945.2020.9197159","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 15

Abstract

Deep reinforcement learning (RL) is being actively studied for robot navigation due to its promise of superior performance and robustness. However, most existing deep RL navigation agents are trained using fixed parameters, such as maximum velocities and weightings of reward components. Since the optimal choice of parameters depends on the use-case, it can be difficult to deploy such existing methods in a variety of real-world service scenarios. In this paper, we propose a novel deep RL navigation method that can adapt its policy to a wide range of parameters and reward functions without expensive retraining. Additionally, we explore a Bayesian deep learning method to optimize these parameters that requires only a small amount of preference data. We empirically show that our method can learn diverse navigation skills and quickly adapt its policy to a given performance metric or to human preference. We also demonstrate our method in real-world scenarios.
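
The abstract's two key ideas lend themselves to a short sketch: a policy network that takes the behavior parameters (maximum velocities, reward-component weights) as an extra input, so a single trained network covers a range of behaviors without retraining, and a utility model fit to a handful of pairwise human preferences that is used to choose those parameters. The PyTorch sketch below is illustrative only, not the authors' implementation: all class and function names are hypothetical, and the simple Bradley-Terry utility fit stands in for the Bayesian deep learning method described in the paper.

```python
# Minimal sketch of (a) a parameter-conditioned navigation policy and
# (b) choosing behavior parameters from a few pairwise human preferences.
# Hypothetical names; a stand-in for the paper's Bayesian method.
import torch
import torch.nn as nn


class ConditionedPolicy(nn.Module):
    """Maps (observation, behavior parameters) -> action, so one trained
    network realizes many behaviors: change the parameters, not the weights."""

    def __init__(self, obs_dim: int, param_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + param_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, act_dim), nn.Tanh(),  # continuous actions in [-1, 1]
        )

    def forward(self, obs: torch.Tensor, params: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, params], dim=-1))


class Utility(nn.Module):
    """Scalar utility of a parameter vector, fit from pairwise preferences."""

    def __init__(self, param_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(param_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, params: torch.Tensor) -> torch.Tensor:
        return self.net(params).squeeze(-1)


def fit_preferences(utility, params_a, params_b, prefs, steps=500, lr=1e-2):
    """Bradley-Terry fit: prefs[i] = 1.0 if the user preferred the behavior
    generated with params_a[i] over the one generated with params_b[i]."""
    opt = torch.optim.Adam(utility.parameters(), lr=lr)
    for _ in range(steps):
        logits = utility(params_a) - utility(params_b)
        loss = nn.functional.binary_cross_entropy_with_logits(logits, prefs)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return utility


if __name__ == "__main__":
    param_dim = 4                                # e.g. max speed + 3 reward weights
    utility = Utility(param_dim)
    params_a = torch.rand(8, param_dim)          # 8 preference queries
    params_b = torch.rand(8, param_dim)
    prefs = torch.randint(0, 2, (8,)).float()    # stand-in human answers
    fit_preferences(utility, params_a, params_b, prefs)

    candidates = torch.rand(1024, param_dim)     # random search over parameters
    best = candidates[utility(candidates).argmax()]

    # Hand the preferred parameters to the (already trained) policy: no retraining.
    policy = ConditionedPolicy(obs_dim=16, param_dim=param_dim, act_dim=2)
    action = policy(torch.rand(16), best)
    print("chosen parameters:", best, "\naction under them:", action)
```

At deployment, only the small utility model is refit as new preference answers arrive; the navigation policy itself stays fixed, which is what makes the adaptation fast.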