{"title":"PIPO: Policy Optimization with Permutation-Invariant Constraint for Distributed Multi-Robot Navigation","authors":"Ruiqi Zhang, Guang Chen, Jing Hou, Zhijun Li, Alois Knoll","doi":"10.1109/MFI55806.2022.9913862","DOIUrl":null,"url":null,"abstract":"For large-scale multi-agent systems (MAS), ensuring the safety and effectiveness of navigation in complicated scenarios is a challenging task. With the agent scale increasing, most existing centralized methods lose their magic for the lack of scalability, and the popular decentralized approaches are hampered by high latency and computing requirements. In this research, we offer PIPO, a novel policy optimization algorithm for decentralized MAS navigation with permutation-invariant constraints. To conduct navigation and avoid un-necessary exploration in the early episodes, we first defined a guide-policy. Then, we introduce the permutation invariant property in decentralized multi-agent systems and leverage the graph convolution network to produce the same output under shuffled observations. Our approach can be easily scaled to an arbitrary number of agents and used in large-scale systems for its decentralized training and execution. We also provide extensive experiments to demonstrate that our PIPO significantly outperforms the baselines of multi-agent reinforcement learning algorithms and other leading methods in variant scenarios.","PeriodicalId":344737,"journal":{"name":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MFI55806.2022.9913862","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
For large-scale multi-agent systems (MAS), ensuring safe and effective navigation in complicated scenarios is a challenging task. As the number of agents grows, most existing centralized methods lose their effectiveness due to poor scalability, while popular decentralized approaches are hampered by high latency and heavy computing requirements. In this work, we propose PIPO, a novel policy optimization algorithm for decentralized MAS navigation with permutation-invariant constraints. To guide navigation and avoid unnecessary exploration in early episodes, we first define a guide-policy. We then introduce the permutation-invariance property into decentralized multi-agent systems and leverage a graph convolutional network to produce the same output under shuffled observations. Thanks to its decentralized training and execution, our approach scales easily to an arbitrary number of agents and can be deployed in large-scale systems. Extensive experiments demonstrate that PIPO significantly outperforms multi-agent reinforcement learning baselines and other leading methods across a variety of scenarios.
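To illustrate the permutation-invariance property mentioned above, the sketch below shows how a one-layer graph-convolution-style feature extractor with shared per-neighbor weights and a symmetric aggregation (here, a mean) yields the same output for an ego agent no matter how its neighbors' observations are ordered. This is a minimal, self-contained toy example; the shapes, weight matrices, and the `policy_features` function are illustrative assumptions and do not reproduce the paper's actual PIPO architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: one ego agent observes K neighbors, each described by a
# d-dimensional feature vector; h is the embedding size. All names/shapes are
# illustrative only, not taken from the paper.
K, d, h = 4, 6, 16
ego_obs = rng.normal(size=(d,))
neighbor_obs = rng.normal(size=(K, d))

W_node = rng.normal(size=(d, h)) * 0.1   # shared per-neighbor transform
W_ego = rng.normal(size=(d, h)) * 0.1    # transform for the ego observation

def policy_features(ego, neighbors):
    """Permutation-invariant feature extractor in the spirit of a one-layer
    graph convolution: every neighbor is embedded with the same weights and
    then aggregated with a symmetric operator (mean), so the result does not
    depend on the order of the neighbor list."""
    neighbor_embed = np.tanh(neighbors @ W_node)   # (K, h), shared weights
    aggregated = neighbor_embed.mean(axis=0)       # symmetric aggregation
    return np.tanh(ego @ W_ego) + aggregated

out = policy_features(ego_obs, neighbor_obs)

# Shuffling the neighbor ordering leaves the ego agent's features unchanged.
perm = rng.permutation(K)
assert np.allclose(out, policy_features(ego_obs, neighbor_obs[perm]))
```

Because the aggregation is symmetric, swapping any two neighbors cannot change the ego agent's policy input, which is the property the paper enforces as a constraint during policy optimization.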