Path Planning for Robotic Arm Based on Reinforcement Learning under the Train

Guanhao Xie, Duo Zhao, Qichao Tang, Muhua Zhang, Wenjie Zhao, Yewen Wang
2023 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 1-8
Published: 2023-12-04 · DOI: 10.1109/ROBIO58561.2023.10354783
Citations: 0

Abstract

Robotic arms are now widely deployed, so path planning for them has long been an active research topic. Traditional path planning algorithms, however, struggle to keep the variation between paths low, making them unsuitable for scenarios with high safety requirements, such as the undercarriage environment of a train. This article proposes a Reinforcement Learning (RL) framework to address this challenge. The Proximal Policy Optimization (PPO) algorithm is enhanced into a variant referred to as Randomized PPO (RPPO), which converges slightly faster. Additionally, a reward model is proposed to help the agent escape local optima. To model the application environment, lidar is used to capture obstacle point clouds, which are then converted into an octree grid map used to maneuver the robotic arm around obstacles. Experimental results show that the paths planned by our system outperform those of RRT* in both average length and standard deviation, and that RPPO achieves better convergence speed and path standard deviation than PPO.
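The abstract describes converting lidar point clouds into an octree grid map for collision checking. The paper's implementation details are not given here, so the following is only a minimal sketch of the general idea: leaf voxels at a fixed resolution record lidar hits, and coarser ancestor cells let the planner query larger regions cheaply. All class and parameter names (`OctreeGrid`, `resolution`, `depth`) are illustrative assumptions, not taken from the paper.

```python
class OctreeGrid:
    """Sketch of an octree-style occupancy map built from lidar points.

    Leaf voxels have edge length `resolution`; each coarser level merges
    2x2x2 children, so an occupied leaf also marks all its ancestors.
    """

    def __init__(self, resolution=0.05, depth=4):
        self.resolution = resolution  # leaf voxel edge length (metres)
        self.depth = depth            # number of coarser levels above leaves
        self.occupied = set()         # keys (level, ix, iy, iz) of occupied cells

    def insert_point(self, x, y, z):
        # Integer index of the leaf voxel containing the point.
        ix = int(x // self.resolution)
        iy = int(y // self.resolution)
        iz = int(z // self.resolution)
        # Mark the leaf and every coarser ancestor; right-shifting the index
        # by one per level maps a child cell to its parent.
        for level in range(self.depth + 1):
            self.occupied.add((level, ix >> level, iy >> level, iz >> level))

    def is_occupied(self, x, y, z, level=0):
        # level=0 queries a single leaf; larger levels query coarser cells,
        # which is a cheap conservative check for obstacle proximity.
        ix = int(x // self.resolution) >> level
        iy = int(y // self.resolution) >> level
        iz = int(z // self.resolution) >> level
        return (level, ix, iy, iz) in self.occupied


grid = OctreeGrid(resolution=0.05)
for pt in [(0.12, 0.33, 0.70), (0.13, 0.34, 0.71)]:  # toy "point cloud"
    grid.insert_point(*pt)

grid.is_occupied(0.12, 0.33, 0.70)           # leaf-level hit near an obstacle
grid.is_occupied(1.50, 1.50, 1.50)           # free region, no lidar returns
grid.is_occupied(0.10, 0.30, 0.68, level=2)  # coarse cell containing obstacles
```

A planner checking a candidate arm configuration would test its swept volume against coarse cells first and descend to leaves only where a coarse cell is occupied; real systems typically use a dedicated library such as OctoMap for this.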