arXiv - CS - Multiagent Systems: Latest Papers, Page 2

On the limits of agency in agent-based models

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-14 DOI: arxiv-2409.10568

Ayush Chopra, Shashank Kumar, Nurullah Giray-Kuru, Ramesh Raskar, Arnau Quera-Bofarull

Agent-based modeling (ABM) seeks to understand the behavior of complex systems by simulating a collection of agents that act and interact within an environment. The practical utility of ABMs requires capturing realistic environment dynamics and adaptive agent behavior while efficiently simulating million-size populations. Recent advancements in large language models (LLMs) present an opportunity to enhance ABMs by using LLMs as agents with further potential to capture adaptive behavior. However, the computational infeasibility of using LLMs for large populations has hindered their widespread adoption. In this paper, we introduce AgentTorch, a framework that scales ABMs to millions of agents while capturing high-resolution agent behavior using LLMs. We benchmark the utility of LLMs as ABM agents, exploring the trade-off between simulation scale and individual agency. Using the COVID-19 pandemic as a case study, we demonstrate how AgentTorch can simulate 8.4 million agents representing New York City, capturing the impact of isolation and employment behavior on health and economic outcomes. We compare the performance of different agent architectures based on heuristic and LLM agents in predicting disease waves and unemployment rates. Furthermore, we showcase AgentTorch's capabilities for retrospective, counterfactual, and prospective analyses, highlighting how adaptive agent behavior can help overcome the limitations of historical data in policy design. AgentTorch is an open-source project actively being used for policy-making and scientific discovery around the world. The framework is available here: github.com/AgentTorch/AgentTorch.

Citations: 0

Swarm Algorithms for Dynamic Task Allocation in Unknown Environments

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-14 DOI: arxiv-2409.09550

Adithya Balachandran, Noble Harasha, Nancy Lynch

Robot swarms, systems of many robots that operate in a distributed fashion, have many applications in areas such as search-and-rescue, natural disaster response, and self-assembly. Several of these applications can be abstracted to the general problem of task allocation in an environment, in which robots must assign themselves to and complete tasks. While several algorithms for task allocation have been proposed, most of them assume either prior knowledge of task locations or a static set of tasks. Operating under a discrete general model where tasks dynamically appear in unknown locations, we present three new swarm algorithms for task allocation. We demonstrate that when tasks appear slowly, our variant of a distributed algorithm based on propagating task information completes tasks more efficiently than a Levy random walk algorithm, which is a strategy used by many organisms in nature to efficiently search an environment. We also propose a division of labor algorithm where some agents use our algorithm based on propagating task information while the remaining agents use the Levy random walk algorithm. Finally, we introduce a hybrid algorithm where each agent dynamically switches between using propagated task information and following a Levy random walk. We show that our division of labor and hybrid algorithms can perform better than both our algorithm based on propagated task information and the Levy walk algorithm, especially at low and medium task rates. When tasks appear fast, we observe that the Levy random walk strategy performs as well as or better than these novel approaches. Our work demonstrates the relative performance of these algorithms on a variety of task rates and also provides insight into optimizing our algorithms based on environment parameters.
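
The Levy random walk used as the baseline above can be illustrated with a short sketch. This is not the authors' implementation: the grid world, the truncated Pareto step-length law, and every name and parameter (`alpha`, `min_step`, `max_step`, `levy_walk`) are assumptions chosen for illustration. The point it shows is the heavy-tailed mix of many short moves and occasional long relocations.

```python
import random

# Hypothetical 4-connected grid moves; the paper's discrete model may differ.
DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def levy_step_length(rng, alpha=1.5, min_step=1.0, max_step=50.0):
    """Sample a heavy-tailed step length by inverse-transform sampling of a Pareto law."""
    u = 1.0 - rng.random()  # u lies in (0, 1], so the power below is well defined
    length = min_step * u ** (-1.0 / alpha)
    return min(length, max_step)  # truncate so one step cannot run unboundedly long


def levy_walk(rng, start, n_steps, width, height):
    """Return the cells visited by a walker that clamps at the grid boundary."""
    x, y = start
    path = [(x, y)]
    for _ in range(n_steps):
        dx, dy = rng.choice(DIRECTIONS)       # pick a heading uniformly at random
        for _ in range(int(levy_step_length(rng))):
            x = min(max(x + dx, 0), width - 1)
            y = min(max(y + dy, 0), height - 1)
            path.append((x, y))
    return path
```

Calling `levy_walk(random.Random(0), (10, 10), 100, 20, 20)` yields one cell per unit move; a smaller `alpha` produces more frequent long excursions, which is the property that makes the strategy effective for sparse search.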

Citations: 0

Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-13 DOI: arxiv-2409.08811

Shao Zhang, Xihuai Wang, Wenhao Zhang, Yongshan Chen, Landi Gao, Dakuo Wang, Weinan Zhang, Xinbing Wang, Ying Wen

Theory of Mind (ToM) significantly impacts human collaboration and communication as a crucial capability to understand others. When AI agents with ToM capability collaborate with humans, Mutual Theory of Mind (MToM) arises in such human-AI teams (HATs). The MToM process, which involves interactive communication and ToM-based strategy adjustment, affects the team's performance and collaboration process. To explore the MToM process, we conducted a mixed-design experiment using a large language model-driven AI agent with ToM and communication modules in a real-time shared-workspace task. We find that the agent's ToM capability does not significantly impact team performance but enhances human understanding of the agent and the feeling of being understood. Most participants in our study believe verbal communication increases human burden, and the results show that bidirectional communication leads to lower HAT performance. We discuss the results' implications for designing AI agents that collaborate with humans in real-time shared workspace tasks.

Citations: 0

CollaMamba: Efficient Collaborative Perception with Cross-Agent Spatial-Temporal State Space Model

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-12 DOI: arxiv-2409.07714

Yang Li, Quan Yuan, Guiyang Luo, Xiaoyuan Fu, Xuanhan Zhu, Yujia Yang, Rui Pan, Jinglin Li

By sharing complementary perceptual information, multi-agent collaborative perception fosters a deeper understanding of the environment. Recent studies on collaborative perception mostly utilize CNNs or Transformers to learn feature representation and fusion in the spatial dimension, which struggle to handle long-range spatial-temporal features under limited computing and communication resources. Holistically modeling the dependencies over extensive spatial areas and extended temporal frames is crucial to enhancing feature quality. To this end, we propose a resource-efficient cross-agent spatial-temporal collaborative state space model (SSM), named CollaMamba. Initially, we construct a foundational backbone network based on spatial SSM. This backbone adeptly captures positional causal dependencies from both single-agent and cross-agent views, yielding compact and comprehensive intermediate features while maintaining linear complexity. Furthermore, we devise a history-aware feature boosting module based on temporal SSM, extracting contextual cues from extended historical frames to refine vague features while preserving low overhead. Extensive experiments across several datasets demonstrate that CollaMamba outperforms state-of-the-art methods, achieving higher model accuracy while reducing computational and communication overhead by up to 71.9% and 1/64, respectively. This work pioneers the exploration of Mamba's potential in collaborative perception. The source code will be made available.
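
The linear complexity attributed to state space models above comes from a recurrence that touches each sequence element once. The sketch below shows only that generic recurrence for a diagonal, time-invariant SSM; it is not CollaMamba's architecture (Mamba-style models make the parameters input-dependent), and the function name `ssm_scan` and its arguments are assumptions for illustration.

```python
def ssm_scan(xs, a, b, c):
    """Run a diagonal linear state space model over a 1-D input sequence.

    Per step, elementwise over the state dimension:
        h_t = a * h_{t-1} + b * x_t
        y_t = sum(c * h_t)
    The loop visits each input once, so cost grows linearly with len(xs).
    """
    state = [0.0] * len(a)
    ys = []
    for x in xs:
        # Update every state channel from its previous value and the new input.
        state = [ai * hi + bi * x for ai, hi, bi in zip(a, state, b)]
        # Read out a scalar output as a weighted sum of the state channels.
        ys.append(sum(ci * hi for ci, hi in zip(c, state)))
    return ys
```

With `a=[0.5]`, `b=[1.0]`, `c=[1.0]`, an impulse input `[1.0, 0.0, 0.0]` decays geometrically, which illustrates how a fixed-size state carries information from earlier positions or frames without revisiting them.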

Citations: 0

Self-Supervised Inference of Agents in Trustless Environments

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-12 DOI: arxiv-2409.08386

Vladyslav Larin, Ivan Nikitin, Alexander Firsov

In this paper, we propose a novel approach where agents can form swarms to produce high-quality responses effectively. This is accomplished by utilizing agents capable of data inference and ranking, which can be effectively implemented using LLMs as response classifiers. We assess existing approaches for trustless agent inference, define our methodology, estimate practical parameters, and model various types of malicious agent attacks. Our method leverages the collective intelligence of swarms, ensuring robust and efficient decentralized AI inference with better accuracy, security, and reliability. We show that our approach is an order of magnitude faster than other trustless inference strategies, reaching less than 125 ms validation latency.
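
One way to picture how ranking by several agents can resist a minority of malicious rankers is rank aggregation. The abstract does not specify the aggregation rule, so the Borda count below is a swapped-in, standard technique rather than the authors' protocol, and `aggregate_rankings` is a hypothetical name.

```python
from collections import defaultdict


def aggregate_rankings(rankings):
    """Combine per-agent rankings (best first) into one order using a Borda count."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, response_id in enumerate(ranking):
            # The top-ranked response earns n - 1 points, the last earns 0.
            scores[response_id] += n - 1 - position
    # Highest total first; ties are broken by id so the result is deterministic.
    return sorted(scores, key=lambda rid: (-scores[rid], rid))
```

For example, three honest rankers submitting `["a", "b", "c"]` outweigh one adversarial ranker submitting `["c", "b", "a"]`, and the aggregate order stays `["a", "b", "c"]`. A real trustless deployment would also need identity, commitment, and incentive mechanisms that this sketch leaves out.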

Citations: 0

Simultaneous Topology Estimation and Synchronization of Dynamical Networks with Time-varying Topology

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-12 DOI: arxiv-2409.08404

Nana Wang, Esteban Restrepo, Dimos V. Dimarogonas

We propose an adaptive control strategy for the simultaneous estimation of topology and synchronization in complex dynamical networks with unknown, time-varying topology. Our approach transforms the problem of time-varying topology estimation into a problem of estimating the time-varying weights of a complete graph, utilizing an edge-agreement framework. We introduce two auxiliary networks: one that satisfies the persistent excitation condition to facilitate topology estimation, while the other, a uniform-$\delta$ persistently exciting network, ensures the boundedness of both weight estimation and synchronization errors, assuming bounded time-varying weights and their derivatives. A relevant numerical example shows the efficiency of our methods.
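
As background for the persistent excitation condition mentioned above, the standard textbook definition is given here. It is stated as general context and is not taken from the paper, whose exact condition (including the uniform-$\delta$ variant) the abstract does not spell out. A bounded signal $\phi(t)$ is called persistently exciting if there exist constants $T > 0$ and $\mu > 0$ such that

$$\int_{t}^{t+T} \phi(\tau)\,\phi(\tau)^{\top}\,d\tau \;\ge\; \mu I \quad \text{for all } t \ge 0.$$

Intuitively, over every window of length $T$ the signal must excite all directions of the parameter space, which is what allows an adaptive estimator to identify unknown weights rather than only drive the tracking error to zero.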

Citations: 0

Leveraging Unstructured Text Data for Federated Instruction Tuning of Large Language Models

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-11 DOI: arxiv-2409.07136

Rui Ye, Rui Ge, Yuchi Fengting, Jingyi Chai, Yanfeng Wang, Siheng Chen

Federated instruction tuning enables multiple clients to collaboratively fine-tune a shared large language model (LLM) that can follow humans' instructions without directly sharing raw data. However, existing literature impractically requires that all the clients readily hold instruction-tuning data (i.e., structured instruction-response pairs), which necessitates massive human annotations since clients' data is usually unstructured text instead. Addressing this, we propose a novel and flexible framework, FedIT-U2S, which can automatically transform an unstructured corpus into structured data for federated instruction tuning. FedIT-U2S consists of two key steps: (1) few-shot instruction-tuning data generation, where each unstructured data piece together with several examples is combined to prompt an LLM to generate an instruction-response pair. To further enhance the flexibility, a retrieval-based example selection technique is proposed, where the examples are automatically selected based on the relatedness between the client's data piece and the example pool, bypassing the need to determine examples in advance. (2) A typical federated instruction tuning process based on the generated data. Overall, FedIT-U2S can be applied to diverse scenarios as long as the client holds a valuable text corpus, broadening the application scope of federated instruction tuning. We conduct a series of experiments on three domains (medicine, knowledge, and math), showing that our proposed FedIT-U2S consistently and significantly improves over the base LLM.
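
Step (1) above, retrieval-based example selection followed by few-shot prompting, can be sketched as follows. The abstract does not say how relatedness is measured or how the prompt is worded, so the bag-of-words cosine similarity, the prompt template, and all function names here are assumptions for illustration; the call to an actual LLM is deliberately left out.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase token counts; a stand-in for whatever embedding the method uses."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_examples(data_piece, example_pool, k=2):
    """Pick the k pool examples whose text is most related to the client's data piece."""
    query = bag_of_words(data_piece)
    ranked = sorted(example_pool, key=lambda ex: cosine(query, bag_of_words(ex["text"])), reverse=True)
    return ranked[:k]


def build_prompt(data_piece, examples):
    """Assemble a few-shot prompt that ends where the LLM should continue."""
    parts = ["Write one instruction-response pair grounded in the given text."]
    for ex in examples:
        parts.append(f"Text: {ex['text']}\nInstruction: {ex['instruction']}\nResponse: {ex['response']}")
    parts.append(f"Text: {data_piece}\nInstruction:")
    return "\n\n".join(parts)
```

Each client would run this locally over its own unstructured pieces, send the resulting prompts to a generator model, and then use the generated pairs as training data in step (2).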

Citations: 0

DCMAC: Demand-aware Customized Multi-Agent Communication via Upper Bound Training

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-11 DOI: arxiv-2409.07127

Dongkun Huo, Huateng Zhang, Yixue Hao, Yuanlin Ye, Long Hu, Rui Wang, Min Chen

Efficient communication can enhance the overall performance of collaborative multi-agent reinforcement learning. A common approach is to share observations through full communication, leading to significant communication overhead. Existing work attempts to perceive the global state by building a teammate model based on local information. However, such work ignores that the uncertainty generated by prediction may make training difficult. To address this problem, we propose a Demand-aware Customized Multi-Agent Communication (DCMAC) protocol, which uses upper bound training to obtain the ideal policy. By utilizing the demand parsing module, an agent can interpret the gain of sending a local message to a teammate and generate customized messages by computing the correlation between demands and local observations using a cross-attention mechanism. Moreover, our method can adapt to the communication resources of agents and accelerate the training progress by appropriating the ideal policy, which is trained with joint observation. Experimental results reveal that DCMAC significantly outperforms the baseline algorithms in both unconstrained and communication-constrained scenarios.
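
The cross-attention step mentioned above can be illustrated with plain scaled dot-product attention. The mapping used here (a teammate's demand vector as the query, the sender's local observation features as keys and values) is an assumption made for illustration; the abstract does not give DCMAC's actual network layout, and there are no learned projections in this sketch.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys, values):
    """Scaled dot-product attention of one query over a set of key/value vectors.

    Returns the attended message and the attention weights.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # The message is the weight-averaged value vector.
    message = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return message, weights
```

With a demand `[1.0, 0.0]` and two observation features whose keys are `[10.0, 0.0]` and `[0.0, 10.0]`, nearly all of the weight lands on the first feature, so the message is tailored to what that teammate asked for.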

Citations: 0

Can Agents Spontaneously Form a Society? Introducing a Novel Architecture for Generative Multi-Agents to Elicit Social Emergence

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-10 DOI: arxiv-2409.06750

H. Zhang, J. Yin, M. Jiang, C. Su

Generative agents have demonstrated impressive capabilities in specific tasks, but most of these frameworks focus on independent tasks and lack attention to social interactions. We introduce a generative agent architecture called ITCMA-S, which includes a basic framework for individual agents and a framework called LTRHA that supports social interactions among multiple agents. This architecture enables agents to identify and filter out behaviors that are detrimental to social interactions, guiding them to choose more favorable actions. We designed a sandbox environment to simulate the natural evolution of social relationships among multiple identity-less agents for experimental evaluation. The results showed that ITCMA-S performed well on multiple evaluation indicators, demonstrating its ability to actively explore the environment, recognize new agents, and acquire new information through continuous actions and dialogue. Observations show that as agents establish connections with each other, they spontaneously form cliques with internal hierarchies around a selected leader and organize collective activities.

Citations: 0

A Quality Diversity Approach to Automatically Generate Multi-Agent Path Finding Benchmark Maps

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-10 DOI: arxiv-2409.06888

Cheng Qian, Yulun Zhang, Varun Bhatt, Matthew Christopher Fontaine, Stefanos Nikolaidis, Jiaoyang Li

We use the Quality Diversity (QD) algorithm with Neural Cellular Automata (NCA) to generate benchmark maps for Multi-Agent Path Finding (MAPF) algorithms. Previously, MAPF algorithms are tested using fixed, human-designed benchmark maps. However, such fixed benchmark maps have several problems. First, these maps may not cover all the potential failure scenarios for the algorithms. Second, when comparing different algorithms, fixed benchmark maps may introduce bias leading to unfair comparisons between algorithms. In this work, we take advantage of the QD algorithm and NCA with different objectives and diversity measures to generate maps with patterns to comprehensively understand the performance of MAPF algorithms and be able to make fair comparisons between two MAPF algorithms to provide further information on the selection between two algorithms. Empirically, we employ this technique to generate diverse benchmark maps to evaluate and compare the behavior of different types of MAPF algorithms such as bounded-suboptimal algorithms, suboptimal algorithms, and reinforcement-learning-based algorithms. Through both single-planner experiments and comparisons between algorithms, we identify patterns where each algorithm excels and detect disparities in runtime or success rates between different algorithms.
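
The abstract above does not give the authors' objectives, diversity measures, or NCA generator, so the following is only a minimal sketch of the general Quality Diversity idea, using a MAP-Elites-style archive. Everything in it is an assumption made for illustration: maps are mutated directly by flipping cells rather than being produced by an NCA, the objective (`reachable_fraction`) is a stand-in for running a real MAPF planner, and the single diversity measure (`obstacle_bin`) is obstacle density. The names `SIZE`, `mutate`, and `map_elites` are hypothetical.

```python
import random
from collections import deque

SIZE = 8  # hypothetical side length of the square grid map


def reachable_fraction(grid):
    """Stand-in objective: share of free cells reachable from the first free cell.

    A real benchmark generator would score a map by running a MAPF planner on it.
    """
    free = [(r, c) for r in range(SIZE) for c in range(SIZE) if grid[r][c] == 0]
    if not free:
        return 0.0
    seen = {free[0]}
    queue = deque([free[0]])
    while queue:  # breadth-first search over 4-connected free cells
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < SIZE and 0 <= nc < SIZE
                    and grid[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen) / len(free)


def obstacle_bin(grid, n_bins=8):
    """Assumed diversity measure: obstacle density, discretised into n_bins."""
    density = sum(map(sum, grid)) / (SIZE * SIZE)
    return min(int(density * n_bins), n_bins - 1)


def mutate(grid, rng):
    """Return a copy of the map with one randomly chosen cell flipped."""
    child = [row[:] for row in grid]
    r, c = rng.randrange(SIZE), rng.randrange(SIZE)
    child[r][c] ^= 1
    return child


def map_elites(iterations=2000, seed=0):
    """Keep the best-scoring map found so far for each diversity bin."""
    rng = random.Random(seed)
    archive = {}  # bin index -> (score, grid)
    for _ in range(iterations):
        if archive and rng.random() < 0.9:
            # Usually mutate an elite drawn uniformly from the archive.
            _, parent = rng.choice(list(archive.values()))
            candidate = mutate(parent, rng)
        else:
            # Otherwise sample a fresh map with a random obstacle probability.
            p = rng.random()
            candidate = [[int(rng.random() < p) for _ in range(SIZE)]
                         for _ in range(SIZE)]
        key = obstacle_bin(candidate)
        score = reachable_fraction(candidate)
        if key not in archive or score > archive[key][0]:
            archive[key] = (score, candidate)
    return archive
```

Calling `map_elites()` returns a dictionary with at most one elite map per density bin, which illustrates how a QD search yields a spread of maps across a measure rather than a single optimum. Swapping in a planner-based score and richer measures would be needed to approximate the setting the abstract describes.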



Citations: 0