{"title":"针对大规模多代理环境的注意力自适应分散政策","authors":"Youness Boutyour;Abdellah Idrissi","doi":"10.1109/TAI.2024.3415550","DOIUrl":null,"url":null,"abstract":"Multiagent reinforcement learning (MARL) poses unique challenges in real-world applications, demanding the adaptation of reinforcement learning principles to scenarios where agents interact in dynamically changing environments. This article presents a novel approach, “decentralized policy with attention” (ADPA), designed to address these challenges in large-scale multiagent environments. ADPA leverages an attention mechanism to dynamically select relevant information for estimating critics while training decentralized policies. This enables effective and scalable learning, supporting both cooperative and competitive settings, and scenarios with nonglobal states. In this work, we conduct a comprehensive evaluation of ADPA across a range of multiagent environments, including cooperative treasure collection and rover-tower communication. We compare ADPA with existing centralized training methods and ablated variants to showcase its advantages in terms of scalability, adaptability to various environments, and robustness. Our results demonstrate that ADPA offers a promising solution for addressing the complexities of large-scale MARL, providing the flexibility to handle diverse multiagent scenarios. By combining decentralized policies with attention mechanisms, we contribute to the advancement of MARL techniques, offering a powerful tool for real-world applications in dynamic and interactive multiagent systems.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 10","pages":"4905-4914"},"PeriodicalIF":0.0000,"publicationDate":"2024-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Adaptive Decentralized Policies With Attention for Large-Scale Multiagent Environments\",\"authors\":\"Youness Boutyour;Abdellah Idrissi\",\"doi\":\"10.1109/TAI.2024.3415550\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multiagent reinforcement learning (MARL) poses unique challenges in real-world applications, demanding the adaptation of reinforcement learning principles to scenarios where agents interact in dynamically changing environments. This article presents a novel approach, “decentralized policy with attention” (ADPA), designed to address these challenges in large-scale multiagent environments. ADPA leverages an attention mechanism to dynamically select relevant information for estimating critics while training decentralized policies. This enables effective and scalable learning, supporting both cooperative and competitive settings, and scenarios with nonglobal states. In this work, we conduct a comprehensive evaluation of ADPA across a range of multiagent environments, including cooperative treasure collection and rover-tower communication. We compare ADPA with existing centralized training methods and ablated variants to showcase its advantages in terms of scalability, adaptability to various environments, and robustness. Our results demonstrate that ADPA offers a promising solution for addressing the complexities of large-scale MARL, providing the flexibility to handle diverse multiagent scenarios. 
By combining decentralized policies with attention mechanisms, we contribute to the advancement of MARL techniques, offering a powerful tool for real-world applications in dynamic and interactive multiagent systems.\",\"PeriodicalId\":73305,\"journal\":{\"name\":\"IEEE transactions on artificial intelligence\",\"volume\":\"5 10\",\"pages\":\"4905-4914\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on artificial intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10562040/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10562040/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multiagent reinforcement learning (MARL) poses unique challenges in real-world applications, demanding the adaptation of reinforcement learning principles to scenarios where agents interact in dynamically changing environments. This article presents a novel approach, “adaptive decentralized policies with attention” (ADPA), designed to address these challenges in large-scale multiagent environments. ADPA leverages an attention mechanism to dynamically select the information most relevant for estimating each agent's critic while training decentralized policies. This enables effective and scalable learning, supporting cooperative and competitive settings as well as scenarios without a global state. In this work, we conduct a comprehensive evaluation of ADPA across a range of multiagent environments, including cooperative treasure collection and rover-tower communication. We compare ADPA with existing centralized training methods and ablated variants to showcase its advantages in scalability, adaptability to various environments, and robustness. Our results demonstrate that ADPA offers a promising solution for addressing the complexities of large-scale MARL, providing the flexibility to handle diverse multiagent scenarios. By combining decentralized policies with attention mechanisms, we contribute to the advancement of MARL techniques, offering a powerful tool for real-world applications in dynamic and interactive multiagent systems.
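
To make the critic-estimation idea concrete, the sketch below shows one way an attention-based critic can be paired with decentralized actors, in the spirit of attention critics such as MAAC. It is a minimal, illustrative sketch only: the class names, hidden dimensions, single-head attention, and discrete-action policy head are assumptions for exposition, not details taken from the paper.

```python
# Illustrative sketch: an attention critic over other agents' encodings,
# paired with decentralized per-agent actors. All names and dimensions
# here are assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionCritic(nn.Module):
    """Per-agent Q-value critic that attends over the other agents."""

    def __init__(self, n_agents, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.n_agents = n_agents
        # Encode each agent's local (observation, action) pair.
        self.encoder = nn.Linear(obs_dim + act_dim, hidden)
        # Shared single-head attention projections.
        self.key = nn.Linear(hidden, hidden, bias=False)
        self.query = nn.Linear(hidden, hidden, bias=False)
        self.value = nn.Linear(hidden, hidden, bias=False)
        # Per-agent Q head: own encoding concatenated with attended context.
        self.q_heads = nn.ModuleList(
            [nn.Linear(2 * hidden, 1) for _ in range(n_agents)]
        )

    def forward(self, obs, acts):
        # obs: (batch, n_agents, obs_dim), acts: (batch, n_agents, act_dim)
        e = F.relu(self.encoder(torch.cat([obs, acts], dim=-1)))   # (B, N, H)
        q, k, v = self.query(e), self.key(e), self.value(e)
        scores = torch.matmul(q, k.transpose(1, 2)) / k.shape[-1] ** 0.5
        # Mask self-attention so each agent attends only to the others.
        mask = torch.eye(self.n_agents, dtype=torch.bool, device=obs.device)
        scores = scores.masked_fill(mask, float("-inf"))
        attn = torch.softmax(scores, dim=-1)                        # (B, N, N)
        context = torch.matmul(attn, v)                             # (B, N, H)
        q_vals = [
            self.q_heads[i](torch.cat([e[:, i], context[:, i]], dim=-1))
            for i in range(self.n_agents)
        ]
        return torch.cat(q_vals, dim=-1)                            # (B, N)


class DecentralizedPolicy(nn.Module):
    """Actor that acts from its own local observation only."""

    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, act_dim)
        )

    def forward(self, obs):
        # Returns a distribution over discrete actions.
        return torch.softmax(self.net(obs), dim=-1)
```

In a typical actor-critic training loop of this kind, the attention critic is used only during training to weight information from other agents, while execution relies solely on each agent's decentralized policy acting from its local observation.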