Hierarchical Reinforcement Learning-based Mapless Navigation with Predictive Exploration Worthiness

Yan Gao, Ze Ji, Jing Wu, Changyun Wei, R. Grech
DOI: 10.1109/ICMA57826.2023.10215569
Published in: 2023 IEEE International Conference on Mechatronics and Automation (ICMA)
Publication date: 2023-08-06
Citations: 0

Abstract

Hierarchical reinforcement learning (HRL) is a promising approach for complex mapless navigation tasks by decomposing the task into a hierarchy of subtasks. However, selecting appropriate subgoals is challenging. Existing methods predominantly rely on sensory inputs, which may contain inadequate information or excessive redundancy. Inspired by the cognitive processes underpinning human navigation, our aim is to enable the robot to leverage both ‘intrinsic and extrinsic factors’ to make informed decisions regarding subgoal selection. In this work, we propose a novel HRL-based mapless navigation framework. Specifically, we introduce a predictive module, named Predictive Exploration Worthiness (PEW), into the high-level (HL) decision-making policy. The hypothesis is that the worthiness of an area for further exploration is related to obstacle spatial distribution, such as the area of free space and the distribution of obstacles. The PEW is introduced as a compact representation for obstacle spatial distribution. Additionally, to incorporate ‘intrinsic factors’ in the subgoal selection process, a penalty element is introduced in the HL reward function, allowing the robot to take into account the capabilities of the low-level policy when selecting subgoals. Our method exhibits significant improvements in success rate when tested in unseen environments.
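The abstract gives no equations, but the two ideas it describes — a worthiness score driven by obstacle spatial distribution, and a penalty in the high-level (HL) reward when the chosen subgoal exceeds the low-level policy's capabilities — can be illustrated with a minimal sketch. All function names, weights, and the logistic squashing below are hypothetical assumptions for illustration, not the paper's actual PEW module or reward definition.

```python
import math

def pew_score(free_space_area, obstacle_density, w_area=1.0, w_density=1.0):
    """Hypothetical exploration-worthiness score: larger free space raises
    worthiness, denser obstacles lower it; a logistic maps the raw score
    into (0, 1). The paper's PEW is a learned predictive module, not this
    hand-written formula."""
    raw = w_area * free_space_area - w_density * obstacle_density
    return 1.0 / (1.0 + math.exp(-raw))

def hl_reward(extrinsic_reward, subgoal_reached, penalty=0.5):
    """Hypothetical HL reward with a penalty element: subtracting a cost
    when the low-level policy fails to reach the chosen subgoal pushes the
    HL policy toward subgoals the low-level policy can actually achieve."""
    return extrinsic_reward - (0.0 if subgoal_reached else penalty)
```

Under these assumptions, an open area with few obstacles scores higher worthiness than a cluttered one, and an unreachable subgoal yields a lower HL return than a reachable one — the qualitative behavior the abstract attributes to PEW and the penalty element.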