{"title":"基于分层强化学习的无地图导航预测探索价值","authors":"Yan Gao, Ze Ji, Jing Wu, Changyun Wei, R. Grech","doi":"10.1109/ICMA57826.2023.10215569","DOIUrl":null,"url":null,"abstract":"Hierarchical reinforcement learning (HRL) is a promising approach for complex mapless navigation tasks by decomposing the task into a hierarchy of subtasks. However, selecting appropriate subgoals is challenging. Existing methods predominantly rely on sensory inputs, which may contain inadequate information or excessive redundancy. Inspired by the cognitive processes underpinning human navigation, our aim is to enable the robot to leverage both ‘intrinsic and extrinsic factors’ to make informed decisions regarding subgoal selection. In this work, we propose a novel HRL-based mapless navigation framework. Specifically, we introduce a predictive module, named Predictive Exploration Worthiness (PEW), into the high-level (HL) decision-making policy. The hypothesis is that the worthiness of an area for further exploration is related to obstacle spatial distribution, such as the area of free space and the distribution of obstacles. The PEW is introduced as a compact representation for obstacle spatial distribution. Additionally, to incorporate ‘intrinsic factors’ in the subgoal selection process, a penalty element is introduced in the HL reward function, allowing the robot to take into account the capabilities of the low-level policy when selecting subgoals. Our method exhibits significant improvements in success rate when tested in unseen environments.","PeriodicalId":151364,"journal":{"name":"2023 IEEE International Conference on Mechatronics and Automation (ICMA)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Hierarchical Reinforcement Learning-based Mapless Navigation with Predictive Exploration Worthiness\",\"authors\":\"Yan Gao, Ze Ji, Jing Wu, Changyun Wei, R. Grech\",\"doi\":\"10.1109/ICMA57826.2023.10215569\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Hierarchical reinforcement learning (HRL) is a promising approach for complex mapless navigation tasks by decomposing the task into a hierarchy of subtasks. However, selecting appropriate subgoals is challenging. Existing methods predominantly rely on sensory inputs, which may contain inadequate information or excessive redundancy. Inspired by the cognitive processes underpinning human navigation, our aim is to enable the robot to leverage both ‘intrinsic and extrinsic factors’ to make informed decisions regarding subgoal selection. In this work, we propose a novel HRL-based mapless navigation framework. Specifically, we introduce a predictive module, named Predictive Exploration Worthiness (PEW), into the high-level (HL) decision-making policy. The hypothesis is that the worthiness of an area for further exploration is related to obstacle spatial distribution, such as the area of free space and the distribution of obstacles. The PEW is introduced as a compact representation for obstacle spatial distribution. Additionally, to incorporate ‘intrinsic factors’ in the subgoal selection process, a penalty element is introduced in the HL reward function, allowing the robot to take into account the capabilities of the low-level policy when selecting subgoals. 
Our method exhibits significant improvements in success rate when tested in unseen environments.\",\"PeriodicalId\":151364,\"journal\":{\"name\":\"2023 IEEE International Conference on Mechatronics and Automation (ICMA)\",\"volume\":\"35 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-08-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE International Conference on Mechatronics and Automation (ICMA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMA57826.2023.10215569\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Mechatronics and Automation (ICMA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMA57826.2023.10215569","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Hierarchical Reinforcement Learning-based Mapless Navigation with Predictive Exploration Worthiness
Hierarchical reinforcement learning (HRL) is a promising approach to complex mapless navigation tasks, as it decomposes the task into a hierarchy of subtasks. However, selecting appropriate subgoals is challenging: existing methods rely predominantly on sensory inputs, which may carry too little information or too much redundancy. Inspired by the cognitive processes underpinning human navigation, we aim to enable the robot to leverage both ‘intrinsic and extrinsic factors’ to make informed decisions when selecting subgoals. In this work, we propose a novel HRL-based mapless navigation framework. Specifically, we introduce a predictive module, named Predictive Exploration Worthiness (PEW), into the high-level (HL) decision-making policy. The hypothesis is that how worthwhile an area is for further exploration depends on the spatial distribution of obstacles, such as the amount of free space and how the obstacles are arranged; PEW serves as a compact representation of this distribution. Additionally, to incorporate ‘intrinsic factors’ into the subgoal selection process, a penalty element is introduced in the HL reward function, allowing the robot to take into account the capabilities of the low-level policy when selecting subgoals. Our method exhibits significant improvements in success rate when tested in unseen environments.
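The abstract does not give implementation details, but its two key ingredients lend themselves to a short sketch: a learned predictor that scores candidate areas for exploration worthiness from obstacle layout, and an HL reward carrying a penalty tied to low-level failures. The PyTorch sketch below is one hypothetical reading of both; PEWNet, hl_reward, the occupancy-patch input encoding, and the constant lambda_pen are all illustrative assumptions, not the authors' implementation.

# Minimal sketch (not the paper's code): a plausible PEW-style scorer
# and a penalized HL reward, under the assumptions stated above.
import torch
import torch.nn as nn

class PEWNet(nn.Module):
    """Predicts a scalar 'exploration worthiness' in [0, 1] for a
    candidate area, encoded here as a local occupancy patch."""
    def __init__(self, patch_size: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        feat = 32 * (patch_size // 4) ** 2  # after two stride-2 convs
        self.head = nn.Sequential(
            nn.Linear(feat, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, occupancy_patch: torch.Tensor) -> torch.Tensor:
        # occupancy_patch: (B, 1, H, W), cell values in [0, 1]
        return self.head(self.encoder(occupancy_patch))

def hl_reward(task_reward: float, subgoal_reached: bool,
              lambda_pen: float = 0.5) -> float:
    """HL reward with a penalty element: if the low-level policy fails
    to reach the chosen subgoal, the HL policy is penalized, nudging it
    toward subgoals within the low-level policy's capability."""
    penalty = 0.0 if subgoal_reached else -lambda_pen
    return task_reward + penalty

if __name__ == "__main__":
    # Score a batch of candidate areas and pick the most worthwhile.
    pew = PEWNet()
    patches = torch.rand(4, 1, 32, 32)   # 4 hypothetical candidate areas
    scores = pew(patches).squeeze(-1)    # shape (4,)
    best = int(scores.argmax())
    print(f"candidate worthiness: {scores.tolist()}, pick #{best}")

In this reading, the HL policy would consume the PEW scores alongside its other inputs when proposing a subgoal, and hl_reward would shape its training signal so that subgoal ambition stays matched to what the low-level controller can actually execute.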