Reinforcement learning path planning method incorporating multi-step Hindsight Experience Replay for lightweight robots

Displays | IF 3.7 | CAS Zone 2 (Engineering & Technology) | JCR Q1, Computer Science, Hardware & Architecture | Pub Date: 2024-07-14 | DOI: 10.1016/j.displa.2024.102796
Jiaqi Wang, Huiyan Han, Xie Han, Liqun Kuang, Xiaowen Yang
Displays, volume 84, Article 102796. Journal Article. https://www.sciencedirect.com/science/article/pii/S0141938224001604

Abstract

Home service robots prioritize cost-effectiveness and convenience over the precision required for industrial tasks like autonomous driving, which makes their tasks easier to execute. Meanwhile, path planning with deep reinforcement learning (DRL) is commonly a sparse-reward problem with limited data utilization: meaningful rewards are hard to obtain during training, so training is slow or difficult. In response to these challenges, our paper introduces a lightweight end-to-end path planning algorithm that employs hindsight experience replay (HER). First, we optimize the reinforcement learning training process from scratch and map the complex high-dimensional action space and state space to a representative low-dimensional action space. At the same time, we improve the network structure by decoupling the model's navigation and obstacle avoidance modules to meet the lightweight requirement. Subsequently, we integrate HER and curriculum learning (CL) to tackle inefficient training. Additionally, we propose a multi-step hindsight experience replay (MS-HER) specifically for the path planning task, which markedly enhances both training efficiency and model generalization across diverse environments. To substantiate the improved training efficiency of the refined algorithm, we conducted tests in diverse Gazebo simulation environments. The experimental results show noteworthy improvements in critical metrics, including success rate and training efficiency. To further assess the generalization capability of the enhanced algorithm, we evaluate its performance in "never-before-seen" simulation environments. Finally, we deploy the trained model on a real lightweight robot for validation. The experimental outcomes indicate that the model can successfully execute the path planning task, even on a small robot with constrained computational resources.
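The abstract does not spell out how MS-HER is constructed, so the following is only a minimal sketch of the two generic ingredients it names: hindsight goal relabeling (here the common "future" strategy) and a multi-step return recomputed against the relabeled goal. The 2D point robot, the -1/0 sparse reward, the tolerance `tol`, and every name in the code (`Transition`, `sparse_reward`, `her_relabel`, `n_step_return`) are assumptions made for illustration and are not taken from the paper.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Transition:
    state: Point       # robot position before the step (assumed 2D)
    action: int        # index into a small discrete action set (assumed)
    next_state: Point  # robot position after the step
    goal: Point        # goal the agent was pursuing on this step


def sparse_reward(position: Point, goal: Point, tol: float = 0.2) -> float:
    """Assumed sparse reward: 0 within `tol` of the goal, otherwise -1."""
    return 0.0 if math.dist(position, goal) <= tol else -1.0


def her_relabel(episode: List[Transition], k: int = 4,
                rng: Optional[random.Random] = None) -> List[Transition]:
    """'Future' relabeling: for each step, draw k goals from positions
    actually reached at that step or later in the same episode."""
    rng = rng or random.Random(0)
    relabeled = []
    for t, tr in enumerate(episode):
        for _ in range(k):
            future = rng.randint(t, len(episode) - 1)  # inclusive bounds
            new_goal = episode[future].next_state
            relabeled.append(
                Transition(tr.state, tr.action, tr.next_state, new_goal))
    return relabeled


def n_step_return(episode: List[Transition], start: int, goal: Point,
                  n: int = 3, gamma: float = 0.99) -> Tuple[float, bool]:
    """Discounted sum of up to n sparse rewards from step `start`,
    recomputed against `goal`. Also reports whether the goal was reached,
    in which case no bootstrapped value should be added."""
    total = 0.0
    for i in range(n):
        idx = start + i
        if idx >= len(episode):
            break
        reward = sparse_reward(episode[idx].next_state, goal)
        total += (gamma ** i) * reward
        if reward == 0.0:
            return total, True
    return total, False
```

For a failed episode, every reward computed against the original goal is -1, whereas relabeling with a position the robot actually reached yields at least one transition with reward 0, which is what makes the replayed data informative. Multi-step targets built from relabeled, off-policy data are generally biased, so a practical implementation would need some correction or a small `n`; how the paper handles this is not stated in the abstract.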

Source journal: Displays (Engineering & Technology; Engineering, Electrical & Electronic)
CiteScore: 4.60
Self-citation rate: 25.60%
Articles published: 138
Review time: 92 days
Journal description: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface. Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals intended for display technologies and human factor engineers new to the field will also occasionally be featured.