Guanhao Xie, Duo Zhao, Qichao Tang, Muhua Zhang, Wenjie Zhao, Yewen Wang
{"title":"基于列车下强化学习的机械臂路径规划","authors":"Guanhao Xie, Duo Zhao, Qichao Tang, Muhua Zhang, Wenjie Zhao, Yewen Wang","doi":"10.1109/ROBIO58561.2023.10354783","DOIUrl":null,"url":null,"abstract":"Due to the widespread use of robotic arms, path planning for them has always been a hot research topic. However, traditional path planning algorithms struggle to ensure low disparity in each path, making them unsuitable for operation scenarios with high safety requirements, such as the undercarriage environment of train. A Reinforcement Learning (RL) framework is proposed in this article to address this challenge. The Proximal Policy Optimization (PPO) algorithm has been enhanced, resulting in a variant referred to as Randomized PPO (RPPO), which demonstrates slightly accelerated convergence speed. Additionally, a reward model is proposed to assist the agent in escaping local optima. For modeling application environment, lidar is employed for obtaining obstacle point cloud information, which is then transformed into an octree grid map for maneuvering the robotic arm to avoid obstacles. 
According to the experimental results, the paths planned by our system are superior to those of RRT* in terms of both average length and standard deviation, and RPPO exhibits better convergence speed and path standard deviation compared to PPO.","PeriodicalId":505134,"journal":{"name":"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)","volume":"34 12","pages":"1-8"},"PeriodicalIF":0.0000,"publicationDate":"2023-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Path Planning for Robotic Arm Based on Reinforcement Learning under the Train\",\"authors\":\"Guanhao Xie, Duo Zhao, Qichao Tang, Muhua Zhang, Wenjie Zhao, Yewen Wang\",\"doi\":\"10.1109/ROBIO58561.2023.10354783\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Due to the widespread use of robotic arms, path planning for them has always been a hot research topic. However, traditional path planning algorithms struggle to ensure low disparity in each path, making them unsuitable for operation scenarios with high safety requirements, such as the undercarriage environment of train. A Reinforcement Learning (RL) framework is proposed in this article to address this challenge. The Proximal Policy Optimization (PPO) algorithm has been enhanced, resulting in a variant referred to as Randomized PPO (RPPO), which demonstrates slightly accelerated convergence speed. Additionally, a reward model is proposed to assist the agent in escaping local optima. For modeling application environment, lidar is employed for obtaining obstacle point cloud information, which is then transformed into an octree grid map for maneuvering the robotic arm to avoid obstacles. 
According to the experimental results, the paths planned by our system are superior to those of RRT* in terms of both average length and standard deviation, and RPPO exhibits better convergence speed and path standard deviation compared to PPO.\",\"PeriodicalId\":505134,\"journal\":{\"name\":\"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)\",\"volume\":\"34 12\",\"pages\":\"1-8\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-12-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ROBIO58561.2023.10354783\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROBIO58561.2023.10354783","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Path Planning for Robotic Arm Based on Reinforcement Learning under the Train
Due to the widespread use of robotic arms, path planning for them has long been an active research topic. However, traditional path planning algorithms struggle to keep the variation among planned paths low, making them unsuitable for scenarios with high safety requirements, such as the undercarriage environment of a train. This article proposes a Reinforcement Learning (RL) framework to address this challenge. The Proximal Policy Optimization (PPO) algorithm is enhanced into a variant referred to as Randomized PPO (RPPO), which converges slightly faster. Additionally, a reward model is proposed to help the agent escape local optima. To model the application environment, lidar is used to obtain obstacle point cloud information, which is then converted into an octree grid map that guides the robotic arm around obstacles. According to the experimental results, the paths planned by our system outperform those of RRT* in both average length and standard deviation, and RPPO achieves better convergence speed and lower path standard deviation than PPO.
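The mapping step in the abstract (lidar point cloud converted into an occupancy map for collision avoidance) can be illustrated with a minimal flat-grid voxelization sketch. This is a generic illustration, not the paper's implementation: the cell size, function names, and flat-grid simplification (in place of the octree the authors use) are all assumptions.

```python
# Minimal sketch: quantize lidar points into occupied voxel cells,
# a flat-grid analogue of the octree grid map described in the abstract.
# The 0.1 m cell size and all names here are illustrative assumptions.

def voxelize(points, cell=0.1):
    """Map 3-D points to the set of occupied (i, j, k) voxel indices."""
    occupied = set()
    for x, y, z in points:
        occupied.add((int(x // cell), int(y // cell), int(z // cell)))
    return occupied

def is_free(point, occupied, cell=0.1):
    """Collision check: True if the point's voxel contains no obstacle."""
    x, y, z = point
    return (int(x // cell), int(y // cell), int(z // cell)) not in occupied

# Two nearby points fall into the same 0.1 m cell, so only two cells are occupied.
grid = voxelize([(0.05, 0.02, 0.0), (0.07, 0.03, 0.01), (1.0, 1.0, 1.0)])
print(len(grid))                      # 2
print(is_free((0.5, 0.5, 0.5), grid)) # True
```

An octree refines this idea by storing occupancy hierarchically, so large free regions are represented by a single coarse node instead of many fine cells, which keeps collision queries cheap in mostly empty workspaces such as the space under a train.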