Multi-Agent Continuous Control with Generative Flow Networks
Shuang Luo, Yinchuan Li, Shunyu Liu, Xu Zhang, Yunfeng Shao, Chao Wu
arXiv - CS - Multiagent Systems, arXiv:2408.06920, published 2024-08-13
Generative Flow Networks (GFlowNets) aim to generate diverse trajectories
by sampling from a distribution in which the probability of a trajectory's
final state is proportional to its reward, serving as a powerful alternative to reinforcement
learning for exploratory control tasks. However, the individual-flow matching
constraint in GFlowNets limits their application to multi-agent systems,
especially continuous joint-control problems. In this paper, we propose a novel
Multi-Agent generative Continuous Flow Networks (MACFN) method to enable
multiple agents to perform cooperative exploration for various compositional
continuous objects. Technically, MACFN trains decentralized
individual-flow-based policies in a centralized global-flow-based matching
fashion. During centralized training, MACFN introduces a continuous flow
decomposition network to deduce the flow contributions of each agent in the
presence of only global rewards. Agents can then take actions based solely
on their assigned local flows in a decentralized way, forming a joint policy
distribution proportional to the rewards. To guarantee the expressiveness of
continuous flow decomposition, we theoretically derive a consistency condition
on the decomposition network. Experimental results demonstrate that the
proposed method outperforms state-of-the-art counterparts and exhibits
better exploration capability. Our code is available at
https://github.com/isluoshuang/MACFN.
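
The claim that agents acting only on local flows can form a joint policy proportional to the reward can be illustrated with a small toy sketch. It is not the MACFN implementation: it assumes a single decision step, discrete actions instead of continuous ones, and a global reward that factorizes exactly as a product of per-agent local flows. That product form is an assumption made here for illustration, not the consistency condition derived in the paper. All function names and the example flow values are hypothetical.

```python
import itertools


def local_policy(local_flow):
    """Normalize one agent's non-negative local flows into a policy."""
    total = sum(local_flow)
    return [f / total for f in local_flow]


def joint_from_local(local_flows):
    """Joint distribution when each agent samples independently
    from its own normalized local flow (decentralized execution)."""
    policies = [local_policy(f) for f in local_flows]
    joint = {}
    for actions in itertools.product(*(range(len(p)) for p in policies)):
        prob = 1.0
        for agent, action in enumerate(actions):
            prob *= policies[agent][action]
        joint[actions] = prob
    return joint


def reward_proportional_target(local_flows):
    """Target distribution: the global reward, ASSUMED to be the product
    of local flows, normalized over all joint actions."""
    rewards = {}
    for actions in itertools.product(*(range(len(f)) for f in local_flows)):
        reward = 1.0
        for agent, action in enumerate(actions):
            reward *= local_flows[agent][action]
        rewards[actions] = reward
    z = sum(rewards.values())
    return {actions: r / z for actions, r in rewards.items()}


# Hypothetical local flows: agent 0 has 2 actions, agent 1 has 3 actions.
local_flows = [[1.0, 3.0], [2.0, 2.0, 4.0]]
joint = joint_from_local(local_flows)
target = reward_proportional_target(local_flows)
max_gap = max(abs(joint[k] - target[k]) for k in joint)
print(f"max |joint - target| = {max_gap:.2e}")
```

Under the product assumption the two distributions coincide up to floating-point error, which is why each agent needs no access to the other agents' flows at execution time. When the reward does not factorize this way, the equality no longer holds in general, and that is the setting a learned decomposition network has to handle.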