{"title":"GCMA:一种具有群组交流功能的自适应多代理强化学习框架,用于协调复杂和类似任务","authors":"Kexing Peng;Tinghuai Ma;Xin Yu;Huan Rong;Yurong Qian;Najla Al-Nabhan","doi":"10.1109/TG.2023.3346394","DOIUrl":null,"url":null,"abstract":"Coordinating multiple agents with diverse tasks and changing goals without interference is a challenge. Multiagent reinforcement learning (MARL) aims to develop effective communication and joint policies using group learning. Some of the previous approaches required each agent to maintain a set of networks independently, resulting in no consideration of interactions. Joint communication work causes agents receiving information unrelated to their own tasks. Currently, agents with different task divisions are often grouped by action tendency, but this can lead to poor dynamic grouping. This article presents a two-phase solution for multiple agents, addressing these issues. The first phase develops heterogeneous agent communication joint policies using a group communication MARL framework (GCMA). The framework employs a periodic grouping strategy, reducing exploration and communication redundancy by dynamically assigning agent group hidden features through hypernetwork and graph communication. The scheme efficiently utilizes resources for adapting to multiple similar tasks. In the second phase, each agent's policy network is distilled into a generalized simple network, adapting to similar tasks with varying quantities and sizes. GCMA is tested in complex environments, such as \n<italic>StarCraft II</i>\n and unmanned aerial vehicle (UAV) take-off, showing its well-performing for large-scale, coordinated tasks. It shows GCMA's effectiveness for solid generalization in multitask tests with simulated pedestrians.","PeriodicalId":55977,"journal":{"name":"IEEE Transactions on Games","volume":"16 3","pages":"670-682"},"PeriodicalIF":1.7000,"publicationDate":"2023-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"GCMA: An Adaptive Multiagent Reinforcement Learning Framework With Group Communication for Complex and Similar Tasks Coordination\",\"authors\":\"Kexing Peng;Tinghuai Ma;Xin Yu;Huan Rong;Yurong Qian;Najla Al-Nabhan\",\"doi\":\"10.1109/TG.2023.3346394\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Coordinating multiple agents with diverse tasks and changing goals without interference is a challenge. Multiagent reinforcement learning (MARL) aims to develop effective communication and joint policies using group learning. Some of the previous approaches required each agent to maintain a set of networks independently, resulting in no consideration of interactions. Joint communication work causes agents receiving information unrelated to their own tasks. Currently, agents with different task divisions are often grouped by action tendency, but this can lead to poor dynamic grouping. This article presents a two-phase solution for multiple agents, addressing these issues. The first phase develops heterogeneous agent communication joint policies using a group communication MARL framework (GCMA). The framework employs a periodic grouping strategy, reducing exploration and communication redundancy by dynamically assigning agent group hidden features through hypernetwork and graph communication. The scheme efficiently utilizes resources for adapting to multiple similar tasks. 
In the second phase, each agent's policy network is distilled into a generalized simple network, adapting to similar tasks with varying quantities and sizes. GCMA is tested in complex environments, such as \\n<italic>StarCraft II</i>\\n and unmanned aerial vehicle (UAV) take-off, showing its well-performing for large-scale, coordinated tasks. It shows GCMA's effectiveness for solid generalization in multitask tests with simulated pedestrians.\",\"PeriodicalId\":55977,\"journal\":{\"name\":\"IEEE Transactions on Games\",\"volume\":\"16 3\",\"pages\":\"670-682\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2023-12-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Games\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10373072/\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Games","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10373072/","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
GCMA: An Adaptive Multiagent Reinforcement Learning Framework With Group Communication for Complex and Similar Tasks Coordination
Coordinating multiple agents with diverse tasks and changing goals, without mutual interference, is a challenge. Multiagent reinforcement learning (MARL) aims to develop effective communication and joint policies through group learning. Some previous approaches required each agent to maintain its own set of networks independently, leaving interactions among agents unmodeled. Fully joint communication, on the other hand, floods agents with information unrelated to their own tasks. Currently, agents with different task divisions are often grouped by action tendency, but this can lead to poor dynamic grouping. This article presents a two-phase solution for multiple agents that addresses these issues. The first phase learns joint communication policies for heterogeneous agents using a group communication MARL framework (GCMA). The framework employs a periodic grouping strategy, reducing exploration and communication redundancy by dynamically assigning group hidden features to agents through a hypernetwork and graph-based communication. The scheme uses resources efficiently, adapting to multiple similar tasks. In the second phase, each agent's policy network is distilled into a simpler generalized network that adapts to similar tasks of varying quantity and size. GCMA is tested in complex environments, such as StarCraft II and unmanned aerial vehicle (UAV) take-off, demonstrating strong performance on large-scale coordination tasks. Multitask tests with simulated pedestrians further show GCMA's solid generalization.
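The abstract describes the second phase only at a high level. As a rough illustration of what distilling a per-agent policy network into a smaller generalized network typically involves, below is a minimal, hypothetical PyTorch sketch of standard policy distillation: a compact student network is trained to match the teacher's softened action distribution via KL divergence. All class names, dimensions, and the temperature value are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes; the paper does not specify these dimensions.
OBS_DIM, N_ACTIONS, HIDDEN = 64, 10, 32

class StudentPolicy(nn.Module):
    """A deliberately small network meant to absorb a larger teacher policy."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, HIDDEN), nn.ReLU(),
            nn.Linear(HIDDEN, N_ACTIONS),
        )

    def forward(self, obs):
        return self.net(obs)  # action logits

def distill_step(teacher, student, optimizer, obs_batch, temperature=2.0):
    """One distillation update: move the student's action distribution
    toward the teacher's temperature-softened distribution (KL divergence)."""
    with torch.no_grad():
        teacher_logits = teacher(obs_batch)  # teacher is frozen
    student_logits = student(obs_batch)
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical usage:
# teacher = ...  # a trained (larger) per-agent policy network
# student = StudentPolicy()
# opt = torch.optim.Adam(student.parameters(), lr=1e-3)
# loss = distill_step(teacher, student, opt, obs_batch)
```

In a setting like GCMA's, one could presumably train such a student on observation batches gathered across agents and task instances, so that a single compact network generalizes to similar tasks of varying quantity and size; the precise loss and data-collection scheme the authors use would have to be taken from the full paper.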