{"title":"QTypeMix:通过异质和同质价值分解增强多代理合作策略","authors":"Songchen Fu, Shaojing Zhao, Ta Li, YongHong Yan","doi":"arxiv-2408.07098","DOIUrl":null,"url":null,"abstract":"In multi-agent cooperative tasks, the presence of heterogeneous agents is\nfamiliar. Compared to cooperation among homogeneous agents, collaboration\nrequires considering the best-suited sub-tasks for each agent. However, the\noperation of multi-agent systems often involves a large amount of complex\ninteraction information, making it more challenging to learn heterogeneous\nstrategies. Related multi-agent reinforcement learning methods sometimes use\ngrouping mechanisms to form smaller cooperative groups or leverage prior domain\nknowledge to learn strategies for different roles. In contrast, agents should\nlearn deeper role features without relying on additional information.\nTherefore, we propose QTypeMix, which divides the value decomposition process\ninto homogeneous and heterogeneous stages. QTypeMix learns to extract type\nfeatures from local historical observations through the TE loss. In addition,\nwe introduce advanced network structures containing attention mechanisms and\nhypernets to enhance the representation capability and achieve the value\ndecomposition process. The results of testing the proposed method on 14 maps\nfrom SMAC and SMACv2 show that QTypeMix achieves state-of-the-art performance\nin tasks of varying difficulty.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"31 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"QTypeMix: Enhancing Multi-Agent Cooperative Strategies through Heterogeneous and Homogeneous Value Decomposition\",\"authors\":\"Songchen Fu, Shaojing Zhao, Ta Li, YongHong Yan\",\"doi\":\"arxiv-2408.07098\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In multi-agent cooperative tasks, the presence of heterogeneous agents is\\nfamiliar. Compared to cooperation among homogeneous agents, collaboration\\nrequires considering the best-suited sub-tasks for each agent. However, the\\noperation of multi-agent systems often involves a large amount of complex\\ninteraction information, making it more challenging to learn heterogeneous\\nstrategies. Related multi-agent reinforcement learning methods sometimes use\\ngrouping mechanisms to form smaller cooperative groups or leverage prior domain\\nknowledge to learn strategies for different roles. In contrast, agents should\\nlearn deeper role features without relying on additional information.\\nTherefore, we propose QTypeMix, which divides the value decomposition process\\ninto homogeneous and heterogeneous stages. QTypeMix learns to extract type\\nfeatures from local historical observations through the TE loss. In addition,\\nwe introduce advanced network structures containing attention mechanisms and\\nhypernets to enhance the representation capability and achieve the value\\ndecomposition process. 
The results of testing the proposed method on 14 maps\\nfrom SMAC and SMACv2 show that QTypeMix achieves state-of-the-art performance\\nin tasks of varying difficulty.\",\"PeriodicalId\":501315,\"journal\":{\"name\":\"arXiv - CS - Multiagent Systems\",\"volume\":\"31 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Multiagent Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.07098\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Multiagent Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.07098","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Heterogeneous agents are common in multi-agent cooperative tasks. Compared with cooperation among homogeneous agents, such collaboration requires assigning each agent the sub-tasks it is best suited for. However, the operation of multi-agent systems often involves a large amount of complex interaction information, which makes heterogeneous strategies harder to learn. Related multi-agent reinforcement learning methods sometimes use grouping mechanisms to form smaller cooperative groups, or rely on prior domain knowledge to learn strategies for different roles. Ideally, agents should instead learn deeper role features without relying on such additional information. We therefore propose QTypeMix, which divides the value decomposition process into a homogeneous stage and a heterogeneous stage. QTypeMix learns to extract type features from local observation histories through the TE loss. In addition, we introduce network structures containing attention mechanisms and hypernetworks to strengthen the representation capability and carry out the value decomposition. Tests of the proposed method on 14 maps from SMAC and SMACv2 show that QTypeMix achieves state-of-the-art performance on tasks of varying difficulty.
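
The abstract describes a two-stage, hypernetwork-based value decomposition but gives no implementation details. The sketch below is therefore only an illustration of the general idea, not the authors' code: a QMIX-style monotonic mixer is applied first within each agent type (homogeneous stage) and then across the per-type values (heterogeneous stage). The TE loss and the attention modules are omitted, and all names (HypernetMixer, TwoStageMixer, type_sizes, embed_dim) are hypothetical.

# Illustrative sketch only, under the assumptions stated above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HypernetMixer(nn.Module):
    """Monotonic mixer: hypernetworks map the global state to non-negative
    mixing weights, so the output is monotone in each input Q-value."""
    def __init__(self, n_inputs, state_dim, embed_dim=32):
        super().__init__()
        self.n_inputs, self.embed_dim = n_inputs, embed_dim
        self.hyper_w1 = nn.Linear(state_dim, n_inputs * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(nn.Linear(state_dim, embed_dim), nn.ReLU(),
                                      nn.Linear(embed_dim, 1))

    def forward(self, qs, state):
        # qs: (batch, n_inputs), state: (batch, state_dim)
        b = qs.size(0)
        w1 = torch.abs(self.hyper_w1(state)).view(b, self.n_inputs, self.embed_dim)
        b1 = self.hyper_b1(state).view(b, 1, self.embed_dim)
        hidden = F.elu(torch.bmm(qs.unsqueeze(1), w1) + b1)        # (b, 1, embed)
        w2 = torch.abs(self.hyper_w2(state)).view(b, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(b, 1, 1)
        return (torch.bmm(hidden, w2) + b2).view(b, 1)             # (b, 1)

class TwoStageMixer(nn.Module):
    """Stage 1 (homogeneous): mix agent Q-values within each type.
    Stage 2 (heterogeneous): mix the resulting per-type values into Q_tot."""
    def __init__(self, type_sizes, state_dim):
        super().__init__()
        self.type_sizes = type_sizes
        self.type_mixers = nn.ModuleList([HypernetMixer(n, state_dim) for n in type_sizes])
        self.top_mixer = HypernetMixer(len(type_sizes), state_dim)

    def forward(self, agent_qs, state):
        # agent_qs: (batch, n_agents) with agents grouped by type
        q_types, start = [], 0
        for mixer, n in zip(self.type_mixers, self.type_sizes):
            q_types.append(mixer(agent_qs[:, start:start + n], state))
            start += n
        return self.top_mixer(torch.cat(q_types, dim=-1), state)  # Q_tot: (batch, 1)

# Toy usage: 3 agents of one type, 2 of another, 48-dim global state.
mixer = TwoStageMixer(type_sizes=[3, 2], state_dim=48)
q_tot = mixer(torch.randn(8, 5), torch.randn(8, 48))
print(q_tot.shape)  # torch.Size([8, 1])

Taking the absolute value of the hypernetwork outputs keeps every mixing weight non-negative, so Q_tot is monotone in each per-agent and per-type value; this is the standard QMIX-style device for preserving consistency between individual and joint greedy action selection, and the sketch simply applies it twice.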