Fast Adaptation of Deep Reinforcement Learning-Based Navigation Skills to Human Preference

Jinyoung Choi, C. Dance, Jung-Eun Kim, Kyungsik Park, Jaehun Han, Joonho Seo, Minsu Kim
{"title":"基于深度强化学习的导航技能对人类偏好的快速适应","authors":"Jinyoung Choi, C. Dance, Jung-Eun Kim, Kyungsik Park, Jaehun Han, Joonho Seo, Minsu Kim","doi":"10.1109/ICRA40945.2020.9197159","DOIUrl":null,"url":null,"abstract":"Deep reinforcement learning (RL) is being actively studied for robot navigation due to its promise of superior performance and robustness. However, most existing deep RL navigation agents are trained using fixed parameters, such as maximum velocities and weightings of reward components. Since the optimal choice of parameters depends on the use-case, it can be difficult to deploy such existing methods in a variety of real-world service scenarios. In this paper, we propose a novel deep RL navigation method that can adapt its policy to a wide range of parameters and reward functions without expensive retraining. Additionally, we explore a Bayesian deep learning method to optimize these parameters that requires only a small amount of preference data. We empirically show that our method can learn diverse navigation skills and quickly adapt its policy to a given performance metric or to human preference. We also demonstrate our method in real-world scenarios.","PeriodicalId":6859,"journal":{"name":"2020 IEEE International Conference on Robotics and Automation (ICRA)","volume":"30 1","pages":"3363-3370"},"PeriodicalIF":0.0000,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"Fast Adaptation of Deep Reinforcement Learning-Based Navigation Skills to Human Preference\",\"authors\":\"Jinyoung Choi, C. Dance, Jung-Eun Kim, Kyungsik Park, Jaehun Han, Joonho Seo, Minsu Kim\",\"doi\":\"10.1109/ICRA40945.2020.9197159\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep reinforcement learning (RL) is being actively studied for robot navigation due to its promise of superior performance and robustness. However, most existing deep RL navigation agents are trained using fixed parameters, such as maximum velocities and weightings of reward components. Since the optimal choice of parameters depends on the use-case, it can be difficult to deploy such existing methods in a variety of real-world service scenarios. In this paper, we propose a novel deep RL navigation method that can adapt its policy to a wide range of parameters and reward functions without expensive retraining. Additionally, we explore a Bayesian deep learning method to optimize these parameters that requires only a small amount of preference data. We empirically show that our method can learn diverse navigation skills and quickly adapt its policy to a given performance metric or to human preference. 
We also demonstrate our method in real-world scenarios.\",\"PeriodicalId\":6859,\"journal\":{\"name\":\"2020 IEEE International Conference on Robotics and Automation (ICRA)\",\"volume\":\"30 1\",\"pages\":\"3363-3370\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE International Conference on Robotics and Automation (ICRA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICRA40945.2020.9197159\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRA40945.2020.9197159","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 15

Abstract

Deep reinforcement learning (RL) is being actively studied for robot navigation due to its promise of superior performance and robustness. However, most existing deep RL navigation agents are trained using fixed parameters, such as maximum velocities and weightings of reward components. Since the optimal choice of parameters depends on the use-case, it can be difficult to deploy such existing methods in a variety of real-world service scenarios. In this paper, we propose a novel deep RL navigation method that can adapt its policy to a wide range of parameters and reward functions without expensive retraining. Additionally, we explore a Bayesian deep learning method to optimize these parameters that requires only a small amount of preference data. We empirically show that our method can learn diverse navigation skills and quickly adapt its policy to a given performance metric or to human preference. We also demonstrate our method in real-world scenarios.
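
The abstract's two key ideas lend themselves to a short sketch: a policy network that takes the behavior parameters (maximum velocities, reward-component weights) as an extra input, so a single trained network covers a range of behaviors without retraining, and a utility model fit to a handful of pairwise human preferences that is used to choose those parameters. The PyTorch sketch below is illustrative only, not the authors' implementation: all class and function names are hypothetical, and the simple Bradley-Terry utility fit stands in for the Bayesian deep learning method described in the paper.

```python
# Minimal sketch of (a) a parameter-conditioned navigation policy and
# (b) choosing behavior parameters from a few pairwise human preferences.
# Hypothetical names; a stand-in for the paper's Bayesian method.
import torch
import torch.nn as nn


class ConditionedPolicy(nn.Module):
    """Maps (observation, behavior parameters) -> action, so one trained
    network realizes many behaviors: change the parameters, not the weights."""

    def __init__(self, obs_dim: int, param_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + param_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, act_dim), nn.Tanh(),  # continuous actions in [-1, 1]
        )

    def forward(self, obs: torch.Tensor, params: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, params], dim=-1))


class Utility(nn.Module):
    """Scalar utility of a parameter vector, fit from pairwise preferences."""

    def __init__(self, param_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(param_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, params: torch.Tensor) -> torch.Tensor:
        return self.net(params).squeeze(-1)


def fit_preferences(utility, params_a, params_b, prefs, steps=500, lr=1e-2):
    """Bradley-Terry fit: prefs[i] = 1.0 if the user preferred the behavior
    generated with params_a[i] over the one generated with params_b[i]."""
    opt = torch.optim.Adam(utility.parameters(), lr=lr)
    for _ in range(steps):
        logits = utility(params_a) - utility(params_b)
        loss = nn.functional.binary_cross_entropy_with_logits(logits, prefs)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return utility


if __name__ == "__main__":
    param_dim = 4                                # e.g. max speed + 3 reward weights
    utility = Utility(param_dim)
    params_a = torch.rand(8, param_dim)          # 8 preference queries
    params_b = torch.rand(8, param_dim)
    prefs = torch.randint(0, 2, (8,)).float()    # stand-in human answers
    fit_preferences(utility, params_a, params_b, prefs)

    candidates = torch.rand(1024, param_dim)     # random search over parameters
    best = candidates[utility(candidates).argmax()]

    # Hand the preferred parameters to the (already trained) policy: no retraining.
    policy = ConditionedPolicy(obs_dim=16, param_dim=param_dim, act_dim=2)
    action = policy(torch.rand(16), best)
    print("chosen parameters:", best, "\naction under them:", action)
```

At deployment, only the small utility model is refit as new preference answers arrive; the navigation policy itself stays fixed, which is what makes the adaptation fast.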