Toward multi-target self-organizing pursuit in a partially observable Markov game

Inf. Sci. | Pub Date: 2022-06-24 | DOI: 10.48550/arXiv.2206.12330

Lijun Sun, Yu-Cheng Chang, Chao Lyu, Ye Shi, Yuhui Shi, Chin-Teng Lin


Citations: 3

Abstract

The multiple-target self-organizing pursuit (SOP) problem has wide applications and is considered a challenging self-organization game for distributed systems, in which intelligent agents cooperatively pursue multiple dynamic targets under partial observation. This work proposes a framework for decentralized multi-agent systems that improves implicit coordination in search and pursuit. We model a self-organizing system as a partially observable Markov game (POMG) characterized by large scale, decentralization, partial observation, and the absence of communication. The proposed distributed algorithm, fuzzy self-organizing cooperative coevolution (FSC2), is then applied to the three challenges in multi-target SOP: distributed self-organizing search (SOS), distributed task allocation, and distributed single-target pursuit. FSC2 includes a coordinated multi-agent deep reinforcement learning (MARL) method that enables homogeneous agents to learn natural SOS patterns. Additionally, we propose a fuzzy-based distributed task allocation method, which locally decomposes multi-target SOP into several single-target pursuit problems. The cooperative coevolution principle is employed to coordinate the distributed pursuers in each single-target pursuit problem. In this way, the uncertainties arising from inherent partial observation and distributed decision-making in the POMG are alleviated. Experimental results demonstrate that, by decomposing the SOP task, FSC2 achieves superior performance compared with other implicit coordination policies fully trained by general MARL algorithms. FSC2 also scales well: up to 2048 FSC2 agents perform efficient multi-target SOP with capture rates of almost 100 percent. Empirical analyses and ablation studies verify the interpretability, rationality, and effectiveness of the component algorithms in FSC2.
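The idea of locally decomposing multi-target SOP into single-target pursuit problems can be illustrated with a minimal sketch. This is not the FSC2 allocation method, whose details the abstract does not give. It assumes a hypothetical linear "near" membership function, 2-D positions, and a fixed sensing range, and every name in it (`near_membership`, `allocate_locally`, `decompose`) is invented for illustration. Each pursuer uses only the targets inside its own sensing range and commits to the one with the highest fuzzy membership. Pursuers that sense no target keep searching.

```python
import math


def near_membership(distance, sensing_range):
    """Fuzzy degree in [0, 1] to which a target is 'near'.

    Assumed linear falloff: 1 at distance 0, 0 at or beyond the sensing range.
    """
    if distance >= sensing_range:
        return 0.0
    return 1.0 - distance / sensing_range


def allocate_locally(pursuer, targets, sensing_range):
    """Return the index of the target this pursuer commits to.

    Uses only this pursuer's local view; returns None if no target is sensed.
    """
    best_index, best_degree = None, 0.0
    for index, target in enumerate(targets):
        degree = near_membership(math.dist(pursuer, target), sensing_range)
        if degree > best_degree:
            best_index, best_degree = index, degree
    return best_index


def decompose(pursuers, targets, sensing_range):
    """Group pursuers by chosen target.

    Each group corresponds to one single-target pursuit problem; pursuers
    without a sensed target are returned separately as still searching.
    """
    groups, searching = {}, []
    for pursuer in pursuers:
        choice = allocate_locally(pursuer, targets, sensing_range)
        if choice is None:
            searching.append(pursuer)
        else:
            groups.setdefault(choice, []).append(pursuer)
    return groups, searching


if __name__ == "__main__":
    pursuers = [(0, 0), (1, 0), (9, 9), (50, 50)]
    targets = [(0, 1), (10, 10)]
    groups, searching = decompose(pursuers, targets, sensing_range=5.0)
    print(groups)
    print(searching)
```

In this sketch the pursuers exchange no messages: each decision depends only on that pursuer's own distances to the targets it can sense, which mirrors the noncommunication and partial-observation setting described above.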


