A safe reinforcement learning approach for autonomous navigation of mobile robots in dynamic environments
Authors: Zhiqian Zhou, Junkai Ren, Zhiwen Zeng, Junhao Xiao, Xinglong Zhang, Xian Guo, Zongtan Zhou, Huimin Lu
Journal: CAAI Transactions on Intelligence Technology (published 2023-10-09)
DOI: 10.1049/cit2.12269 (https://doi.org/10.1049/cit2.12269)
Abstract: When mobile robots are deployed in real-world settings such as airports, train stations, hospitals, and schools, collisions with pedestrians are intolerable and catastrophic, so motion safety is one of the most fundamental requirements for such robots. Efficient and safe navigation in these dynamic environments nevertheless remains an open problem, chiefly because the conflict between navigation efficiency and motion safety is greatly intensified by the high dynamics and uncertainty of pedestrian motion. To address this challenge, this paper proposes a safe deep reinforcement learning algorithm, Conflict-Averse Safe Reinforcement Learning (CASRL), for autonomous robot navigation in dynamic environments. CASRL first separates the collision-avoidance sub-task from the overall navigation task and maintains a safety critic to evaluate the safety/risk of actions. It then constructs two task-specific but model-agnostic policy gradients, one for goal reaching and one for collision avoidance, to eliminate their mutual interference, and applies a conflict-averse gradient manipulation to resolve the remaining inconsistency between the two sub-tasks. Extensive experiments are performed to evaluate CASRL. In simulation, it achieves an average 8.2% performance improvement over the vanilla baseline across eight groups of dynamic environments, rising to 13.4% in the most challenging group. In addition, forty real-world experiments show that CASRL can be successfully deployed on a real robot.
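The abstract does not give the update rule behind the conflict-averse gradient manipulation, so the following is only a rough sketch of the general idea of combining two task gradients without letting one undo the other. It substitutes a simple projection-based scheme (in the spirit of PCGrad) for the paper's method; the function names `project_out` and `combine_gradients` and the toy two-dimensional gradients are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out(g: Sequence[float], other: Sequence[float]) -> List[float]:
    """Return g with its component along `other` removed, but only when
    the two gradients conflict (negative inner product).

    Assumption: this projection rule is a stand-in, not CASRL's own rule.
    """
    inner = dot(g, other)
    norm_sq = dot(other, other)
    if inner >= 0.0 or norm_sq == 0.0:
        return list(g)  # no conflict (or zero gradient): leave g unchanged
    scale = inner / norm_sq
    return [x - scale * y for x, y in zip(g, other)]


def combine_gradients(g_goal: Sequence[float], g_safe: Sequence[float]) -> List[float]:
    """Sum the goal-reaching and collision-avoidance gradients after
    de-conflicting each one against the other."""
    goal_adj = project_out(g_goal, g_safe)
    safe_adj = project_out(g_safe, g_goal)
    return [a + b for a, b in zip(goal_adj, safe_adj)]


if __name__ == "__main__":
    g_goal = [1.0, 0.0]   # toy gradient for the goal-reaching sub-task
    g_safe = [-1.0, 1.0]  # toy gradient for the collision-avoidance sub-task
    print(combine_gradients(g_goal, g_safe))  # [0.5, 1.5]
```

In this toy case the plain sum of the two gradients, [0.0, 1.0], makes no progress along the goal-reaching direction, whereas the combined update has a positive inner product with both task gradients. Whether CASRL behaves this way on real policy gradients is a question for the full paper, not something this sketch shows.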
About the journal:
CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. We are a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI) providing research which is openly accessible to read and share worldwide.