Solving large-scale multi-agent tasks via transfer learning with dynamic state representation

IF 2.3 | CAS Region 4, Computer Science | JCR Q2, Computer Science | International Journal of Advanced Robotic Systems | Pub Date: 2023-03-01 | DOI: 10.1177/17298806231162440
Lintao Dou, Zhen Jia, Jian Huang
Citations: 0

Abstract

Many research results have emerged over the past decade in multi-agent reinforcement learning. These include the successful application of asynchronous advantage actor-critic, double deep Q-network, and other algorithms in multi-agent environments, as well as the representative multi-agent training method QMIX, which follows the classical centralized-training, distributed-execution paradigm. However, in large-scale multi-agent environments, training becomes a major challenge due to the exponential growth of the state-action space. In this article, we design a training scheme that proceeds from small-scale to large-scale multi-agent training. We use transfer learning so that the training of large-scale agent tasks can reuse the knowledge accumulated while training small-scale agent tasks. We achieve policy transfer between tasks with different numbers of agents by designing a new dynamic state representation network, which uses a self-attention mechanism to capture and represent the agents' local observations. The dynamic state representation network makes it possible to expand the policy model from tasks with few agents (4 or 10 agents) to tasks with many agents (16 or 50 agents). Furthermore, we conducted experiments in the well-known real-time strategy game StarCraft II and on the multi-agent research platform MAgent, and we also ran unmanned aerial vehicle trajectory planning simulations. Experimental results show that our approach not only reduces the time consumed by large-scale agent training tasks but also improves final training performance.
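The key property the abstract claims for the dynamic state representation network is that self-attention over per-agent local observations yields a state encoding whose size does not depend on the number of agents, which is what allows one policy model to be transferred from 4- or 10-agent tasks to 16- or 50-agent tasks. The paper's actual architecture is not reproduced here; the following is a minimal NumPy sketch of that idea, with all weight shapes, the mean-pooling step, and the class name `DynamicStateRepresentation` being illustrative assumptions rather than the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class DynamicStateRepresentation:
    """Hypothetical sketch: self-attention over per-agent observations,
    pooled to a fixed-size vector independent of the agent count."""

    def __init__(self, obs_dim, embed_dim, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(obs_dim)
        # Query/key/value projections shared across agents.
        self.Wq = rng.normal(0.0, scale, (obs_dim, embed_dim))
        self.Wk = rng.normal(0.0, scale, (obs_dim, embed_dim))
        self.Wv = rng.normal(0.0, scale, (obs_dim, embed_dim))
        self.embed_dim = embed_dim

    def __call__(self, obs):
        # obs: (n_agents, obs_dim) -- one local observation per agent.
        Q, K, V = obs @ self.Wq, obs @ self.Wk, obs @ self.Wv
        attn = softmax(Q @ K.T / np.sqrt(self.embed_dim), axis=-1)
        attended = attn @ V              # (n_agents, embed_dim)
        return attended.mean(axis=0)     # fixed-size summary, shape (embed_dim,)

net = DynamicStateRepresentation(obs_dim=8, embed_dim=16)
small = net(np.ones((4, 8)))   # small-scale task: 4 agents
large = net(np.ones((50, 8)))  # large-scale task: 50 agents
print(small.shape, large.shape)  # both (16,)
```

Because the attention weights and the pooling are computed over whatever number of agent rows is present, the same projection matrices (and hence the same downstream policy head) can be reused when the agent count changes, which is the mechanism the transfer scheme relies on.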
Source journal: International Journal of Advanced Robotic Systems
CiteScore: 6.50
Self-citation rate: 0.00%
Articles per year: 65
Review time: 6 months
Journal description: International Journal of Advanced Robotic Systems (IJARS) is a JCR-ranked, peer-reviewed open access journal covering the full spectrum of robotics research. The journal is addressed to both practicing professionals and researchers in the field of robotics and its specialty areas. IJARS features fourteen topic areas, each headed by a Topic Editor-in-Chief, integrating all aspects of research in robotics under the journal's domain.