Guanhao Xie, Duo Zhao, Qichao Tang, Muhua Zhang, Wenjie Zhao, Yewen Wang
{"title":"基于列车下强化学习的机械臂路径规划","authors":"Guanhao Xie, Duo Zhao, Qichao Tang, Muhua Zhang, Wenjie Zhao, Yewen Wang","doi":"10.1109/ROBIO58561.2023.10354783","DOIUrl":null,"url":null,"abstract":"Due to the widespread use of robotic arms, path planning for them has always been a hot research topic. However, traditional path planning algorithms struggle to ensure low disparity in each path, making them unsuitable for operation scenarios with high safety requirements, such as the undercarriage environment of train. A Reinforcement Learning (RL) framework is proposed in this article to address this challenge. The Proximal Policy Optimization (PPO) algorithm has been enhanced, resulting in a variant referred to as Randomized PPO (RPPO), which demonstrates slightly accelerated convergence speed. Additionally, a reward model is proposed to assist the agent in escaping local optima. For modeling application environment, lidar is employed for obtaining obstacle point cloud information, which is then transformed into an octree grid map for maneuvering the robotic arm to avoid obstacles. 
According to the experimental results, the paths planned by our system are superior to those of RRT* in terms of both average length and standard deviation, and RPPO exhibits better convergence speed and path standard deviation compared to PPO.","PeriodicalId":505134,"journal":{"name":"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)","volume":"34 12","pages":"1-8"},"PeriodicalIF":0.0000,"publicationDate":"2023-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Path Planning for Robotic Arm Based on Reinforcement Learning under the Train\",\"authors\":\"Guanhao Xie, Duo Zhao, Qichao Tang, Muhua Zhang, Wenjie Zhao, Yewen Wang\",\"doi\":\"10.1109/ROBIO58561.2023.10354783\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Due to the widespread use of robotic arms, path planning for them has always been a hot research topic. However, traditional path planning algorithms struggle to ensure low disparity in each path, making them unsuitable for operation scenarios with high safety requirements, such as the undercarriage environment of train. A Reinforcement Learning (RL) framework is proposed in this article to address this challenge. The Proximal Policy Optimization (PPO) algorithm has been enhanced, resulting in a variant referred to as Randomized PPO (RPPO), which demonstrates slightly accelerated convergence speed. Additionally, a reward model is proposed to assist the agent in escaping local optima. For modeling application environment, lidar is employed for obtaining obstacle point cloud information, which is then transformed into an octree grid map for maneuvering the robotic arm to avoid obstacles. 
According to the experimental results, the paths planned by our system are superior to those of RRT* in terms of both average length and standard deviation, and RPPO exhibits better convergence speed and path standard deviation compared to PPO.\",\"PeriodicalId\":505134,\"journal\":{\"name\":\"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)\",\"volume\":\"34 12\",\"pages\":\"1-8\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-12-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ROBIO58561.2023.10354783\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROBIO58561.2023.10354783","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Path Planning for Robotic Arm Based on Reinforcement Learning under the Train
Due to the widespread use of robotic arms, path planning for them has long been an active research topic. However, traditional path planning algorithms struggle to keep the variation among planned paths low, making them unsuitable for scenarios with high safety requirements, such as the undercarriage environment of a train. This article proposes a Reinforcement Learning (RL) framework to address this challenge. The Proximal Policy Optimization (PPO) algorithm is enhanced into a variant referred to as Randomized PPO (RPPO), which converges slightly faster. Additionally, a reward model is proposed to help the agent escape local optima. To model the application environment, lidar is used to obtain obstacle point cloud information, which is then converted into an octree grid map that guides the robotic arm around obstacles. According to the experimental results, the paths planned by our system outperform those of RRT* in both average length and standard deviation, and RPPO achieves better convergence speed and lower path standard deviation than PPO.
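The mapping step in the abstract (lidar point cloud converted into an occupancy map for collision avoidance) can be illustrated with a minimal flat-grid voxelization sketch. This is a generic illustration, not the paper's implementation: the cell size, function names, and flat-grid simplification (in place of the octree the authors use) are all assumptions.

```python
# Minimal sketch: quantize lidar points into occupied voxel cells,
# a flat-grid analogue of the octree grid map described in the abstract.
# The 0.1 m cell size and all names here are illustrative assumptions.

def voxelize(points, cell=0.1):
    """Map 3-D points to the set of occupied (i, j, k) voxel indices."""
    occupied = set()
    for x, y, z in points:
        occupied.add((int(x // cell), int(y // cell), int(z // cell)))
    return occupied

def is_free(point, occupied, cell=0.1):
    """Collision check: True if the point's voxel contains no obstacle."""
    x, y, z = point
    return (int(x // cell), int(y // cell), int(z // cell)) not in occupied

# Two nearby points fall into the same 0.1 m cell, so only two cells are occupied.
grid = voxelize([(0.05, 0.02, 0.0), (0.07, 0.03, 0.01), (1.0, 1.0, 1.0)])
print(len(grid))                      # 2
print(is_free((0.5, 0.5, 0.5), grid)) # True
```

An octree refines this idea by storing occupancy hierarchically, so large free regions are represented by a single coarse node instead of many fine cells, which keeps collision queries cheap in mostly empty workspaces such as the space under a train.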