{"title":"A Deep Reinforcement Learning Approach Using Asymmetric Self-Play for Robust Multirobot Flocking","authors":"Yunjie Jia;Yong Song;Jiyu Cheng;Jiong Jin;Wei Zhang;Simon X. Yang;Sam Kwong","doi":"10.1109/TII.2024.3523576","DOIUrl":null,"url":null,"abstract":"Flocking control, as an essential approach for survivable navigation of multirobot systems, has been widely applied in fields, such as logistics, service delivery, and search and rescue. However, realistic environments are typically complex, dynamic, and even aggressive, posing considerable threats to the safety of flocking robots. In this article, based on deep reinforcement learning, an <italic>A</i>symmetric <italic>S</i>elf-play-empowered <italic>F</i>locking <italic>C</i>ontrol framework is proposed to address this concern. Specifically, the flocking robots are trained concurrently with learnable adversarial interferers to stimulate the intelligence of the flocking strategy. A two-stage self-play training paradigm is developed to improve the robustness and generalization of the model. Furthermore, an auxiliary training module regarding the learning of transition dynamics is designed, dramatically enhancing the adaptability to environmental uncertainties. Feature-level and agent-level attention are implemented for action and value generation, respectively. 
Both extensive comparative experiments and real-world deployment demonstrate the superiority and practicality of the proposed framework.","PeriodicalId":13301,"journal":{"name":"IEEE Transactions on Industrial Informatics","volume":"21 4","pages":"3266-3275"},"PeriodicalIF":9.9000,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Industrial Informatics","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10851371/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Citations: 0
Abstract
Flocking control, an essential approach for survivable navigation of multirobot systems, has been widely applied in fields such as logistics, service delivery, and search and rescue. However, realistic environments are typically complex, dynamic, and even aggressive, posing considerable threats to the safety of flocking robots. In this article, an Asymmetric Self-play-empowered Flocking Control framework based on deep reinforcement learning is proposed to address this concern. Specifically, the flocking robots are trained concurrently with learnable adversarial interferers to stimulate the intelligence of the flocking strategy. A two-stage self-play training paradigm is developed to improve the robustness and generalization of the model. Furthermore, an auxiliary training module for learning the transition dynamics is designed, dramatically enhancing adaptability to environmental uncertainties. Feature-level and agent-level attention are implemented for action and value generation, respectively. Both extensive comparative experiments and real-world deployment demonstrate the superiority and practicality of the proposed framework.
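To make the two ideas in the abstract concrete, the two-stage asymmetric self-play paradigm and the auxiliary transition-dynamics objective can be sketched in a toy form. Everything below is an illustrative assumption, not the paper's method: linear point-mass dynamics stand in for multirobot flocking, a random-search `improve` step stands in for the deep RL optimizer, and a least-squares fit stands in for the learned dynamics module.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout(flock_w, adv_w, steps=50):
    """Toy episode: a 2-D point-mass 'flock' agent tries to stay near the
    origin while a learnable interferer pushes it around (linear policies)."""
    s = rng.normal(size=2)
    ret, transitions = 0.0, []
    for _ in range(steps):
        a_flock = flock_w @ s
        a_adv = adv_w @ s
        s_next = s + 0.1 * (a_flock + 0.5 * a_adv) + 0.01 * rng.normal(size=2)
        ret += -np.linalg.norm(s_next)  # flock reward: proximity to origin
        transitions.append((s, a_flock, s_next))
        s = s_next
    return ret, transitions

def improve(w, objective, sigma=0.1, iters=30):
    """Random-search policy update, a stand-in for the paper's DRL optimizer."""
    best, best_val = w, objective(w)
    for _ in range(iters):
        cand = best + sigma * rng.normal(size=w.shape)
        val = objective(cand)
        if val > best_val:
            best, best_val = cand, val
    return best

def dynamics_loss(A, transitions):
    """Auxiliary objective: mean squared error of predicting s' from (s, a)."""
    errs = [np.sum((A @ np.concatenate([s, a]) - s_next) ** 2)
            for s, a, s_next in transitions]
    return float(np.mean(errs))

flock_w = np.zeros((2, 2))
adv_w = np.zeros((2, 2))

# Stage 1: train the flock policy against the current, frozen interferer.
flock_w = improve(flock_w, lambda w: rollout(w, adv_w)[0])
# Stage 2: train the interferer to minimise the flock's return (asymmetric roles).
adv_w = improve(adv_w, lambda w: -rollout(flock_w, w)[0])
# Back to stage 1: the flock adapts to the now-stronger adversary.
flock_w = improve(flock_w, lambda w: rollout(w, adv_w)[0])

# Auxiliary module: fit a linear transition model on collected experience.
ret, transitions = rollout(flock_w, adv_w)
X = np.array([np.concatenate([s, a]) for s, a, _ in transitions])
Y = np.array([s_next for _, _, s_next in transitions])
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T  # least-squares dynamics fit
```

The asymmetry lies in the opposing objectives: the interferer is rewarded exactly for reducing the flock's return, so each alternation exposes the flock policy to a harder distribution of disturbances, which is the robustness mechanism the abstract describes.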
Journal introduction:
The IEEE Transactions on Industrial Informatics is a multidisciplinary journal dedicated to publishing technical papers that connect theory with practical applications of informatics in industrial settings. It focuses on the utilization of information in intelligent, distributed, and agile industrial automation and control systems. The scope includes topics such as knowledge-based and AI-enhanced automation, intelligent computer control systems, flexible and collaborative manufacturing, industrial informatics in software-defined vehicles and robotics, computer vision, industrial cyber-physical and industrial IoT systems, real-time and networked embedded systems, security in industrial processes, industrial communications, systems interoperability, and human-machine interaction.