{"title":"QTypeMix:通过异质和同质价值分解增强多代理合作策略","authors":"Songchen Fu, Shaojing Zhao, Ta Li, YongHong Yan","doi":"arxiv-2408.07098","DOIUrl":null,"url":null,"abstract":"In multi-agent cooperative tasks, the presence of heterogeneous agents is\nfamiliar. Compared to cooperation among homogeneous agents, collaboration\nrequires considering the best-suited sub-tasks for each agent. However, the\noperation of multi-agent systems often involves a large amount of complex\ninteraction information, making it more challenging to learn heterogeneous\nstrategies. Related multi-agent reinforcement learning methods sometimes use\ngrouping mechanisms to form smaller cooperative groups or leverage prior domain\nknowledge to learn strategies for different roles. In contrast, agents should\nlearn deeper role features without relying on additional information.\nTherefore, we propose QTypeMix, which divides the value decomposition process\ninto homogeneous and heterogeneous stages. QTypeMix learns to extract type\nfeatures from local historical observations through the TE loss. In addition,\nwe introduce advanced network structures containing attention mechanisms and\nhypernets to enhance the representation capability and achieve the value\ndecomposition process. The results of testing the proposed method on 14 maps\nfrom SMAC and SMACv2 show that QTypeMix achieves state-of-the-art performance\nin tasks of varying difficulty.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"31 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"QTypeMix: Enhancing Multi-Agent Cooperative Strategies through Heterogeneous and Homogeneous Value Decomposition\",\"authors\":\"Songchen Fu, Shaojing Zhao, Ta Li, YongHong Yan\",\"doi\":\"arxiv-2408.07098\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In multi-agent cooperative tasks, the presence of heterogeneous agents is\\nfamiliar. Compared to cooperation among homogeneous agents, collaboration\\nrequires considering the best-suited sub-tasks for each agent. However, the\\noperation of multi-agent systems often involves a large amount of complex\\ninteraction information, making it more challenging to learn heterogeneous\\nstrategies. Related multi-agent reinforcement learning methods sometimes use\\ngrouping mechanisms to form smaller cooperative groups or leverage prior domain\\nknowledge to learn strategies for different roles. In contrast, agents should\\nlearn deeper role features without relying on additional information.\\nTherefore, we propose QTypeMix, which divides the value decomposition process\\ninto homogeneous and heterogeneous stages. QTypeMix learns to extract type\\nfeatures from local historical observations through the TE loss. In addition,\\nwe introduce advanced network structures containing attention mechanisms and\\nhypernets to enhance the representation capability and achieve the value\\ndecomposition process. 
The results of testing the proposed method on 14 maps\\nfrom SMAC and SMACv2 show that QTypeMix achieves state-of-the-art performance\\nin tasks of varying difficulty.\",\"PeriodicalId\":501315,\"journal\":{\"name\":\"arXiv - CS - Multiagent Systems\",\"volume\":\"31 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Multiagent Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.07098\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Multiagent Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.07098","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Heterogeneous agents are common in multi-agent cooperative tasks. Compared with cooperation among homogeneous agents, such collaboration requires assigning each agent the sub-tasks it is best suited for. However, the operation of multi-agent systems often involves a large amount of complex interaction information, which makes heterogeneous strategies harder to learn. Related multi-agent reinforcement learning methods sometimes use grouping mechanisms to form smaller cooperative groups, or rely on prior domain knowledge to learn strategies for different roles. Ideally, agents should instead learn deeper role features without relying on such additional information. We therefore propose QTypeMix, which divides the value decomposition process into a homogeneous stage and a heterogeneous stage. QTypeMix learns to extract type features from local observation histories through the TE loss. In addition, we introduce network structures containing attention mechanisms and hypernetworks to strengthen the representation capability and carry out the value decomposition. Tests of the proposed method on 14 maps from SMAC and SMACv2 show that QTypeMix achieves state-of-the-art performance on tasks of varying difficulty.
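
The abstract describes a two-stage, hypernetwork-based value decomposition but gives no implementation details. The sketch below is therefore only an illustration of the general idea, not the authors' code: a QMIX-style monotonic mixer is applied first within each agent type (homogeneous stage) and then across the per-type values (heterogeneous stage). The TE loss and the attention modules are omitted, and all names (HypernetMixer, TwoStageMixer, type_sizes, embed_dim) are hypothetical.

# Illustrative sketch only, under the assumptions stated above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HypernetMixer(nn.Module):
    """Monotonic mixer: hypernetworks map the global state to non-negative
    mixing weights, so the output is monotone in each input Q-value."""
    def __init__(self, n_inputs, state_dim, embed_dim=32):
        super().__init__()
        self.n_inputs, self.embed_dim = n_inputs, embed_dim
        self.hyper_w1 = nn.Linear(state_dim, n_inputs * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(nn.Linear(state_dim, embed_dim), nn.ReLU(),
                                      nn.Linear(embed_dim, 1))

    def forward(self, qs, state):
        # qs: (batch, n_inputs), state: (batch, state_dim)
        b = qs.size(0)
        w1 = torch.abs(self.hyper_w1(state)).view(b, self.n_inputs, self.embed_dim)
        b1 = self.hyper_b1(state).view(b, 1, self.embed_dim)
        hidden = F.elu(torch.bmm(qs.unsqueeze(1), w1) + b1)        # (b, 1, embed)
        w2 = torch.abs(self.hyper_w2(state)).view(b, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(b, 1, 1)
        return (torch.bmm(hidden, w2) + b2).view(b, 1)             # (b, 1)

class TwoStageMixer(nn.Module):
    """Stage 1 (homogeneous): mix agent Q-values within each type.
    Stage 2 (heterogeneous): mix the resulting per-type values into Q_tot."""
    def __init__(self, type_sizes, state_dim):
        super().__init__()
        self.type_sizes = type_sizes
        self.type_mixers = nn.ModuleList([HypernetMixer(n, state_dim) for n in type_sizes])
        self.top_mixer = HypernetMixer(len(type_sizes), state_dim)

    def forward(self, agent_qs, state):
        # agent_qs: (batch, n_agents) with agents grouped by type
        q_types, start = [], 0
        for mixer, n in zip(self.type_mixers, self.type_sizes):
            q_types.append(mixer(agent_qs[:, start:start + n], state))
            start += n
        return self.top_mixer(torch.cat(q_types, dim=-1), state)  # Q_tot: (batch, 1)

# Toy usage: 3 agents of one type, 2 of another, 48-dim global state.
mixer = TwoStageMixer(type_sizes=[3, 2], state_dim=48)
q_tot = mixer(torch.randn(8, 5), torch.randn(8, 48))
print(q_tot.shape)  # torch.Size([8, 1])

Taking the absolute value of the hypernetwork outputs keeps every mixing weight non-negative, so Q_tot is monotone in each per-agent and per-type value; this is the standard QMIX-style device for preserving consistency between individual and joint greedy action selection, and the sketch simply applies it twice.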