arXiv - CS - Multiagent Systems: Latest Papers

Responsible Blockchain: STEADI Principles and the Actor-Network Theory-based Development Methodology (ANT-RDM)

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-10 DOI: arxiv-2409.06179

Yibai Li, Ahmed Gomaa, Xiaobing Li

This paper provides a comprehensive analysis of the challenges and controversies associated with blockchain technology. It identifies technical challenges such as scalability, security, privacy, and interoperability, as well as business and adoption challenges, and the social, economic, ethical, and environmental controversies present in current blockchain systems. We argue that responsible blockchain development is key to overcoming these challenges and achieving mass adoption. This paper defines Responsible Blockchain and introduces the STEADI principles (sustainable, transparent, ethical, adaptive, decentralized, and inclusive) for responsible blockchain development. Additionally, it presents the Actor-Network Theory-based Responsible Development Methodology (ANT-RDM) for blockchains, which includes the steps of problematization, interessement, enrollment, and mobilization.

Citations: 0

Foragax: An Agent Based Modelling framework based on JAX

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-10 DOI: arxiv-2409.06345

Siddharth Chaturvedi, Ahmed El-Gazzar, Marcel van Gerven

Foraging for resources is a ubiquitous activity conducted by living organisms in a shared environment to maintain their homeostasis. Modelling multi-agent foraging in-silico allows us to study both individual and collective emergent behaviour in a tractable manner. Agent-based modelling has proven to be effective in simulating such tasks, though scaling the simulations to accommodate large numbers of agents with complex dynamics remains challenging. In this work, we present Foragax, a general-purpose, scalable, hardware-accelerated, multi-agent foraging toolkit. Leveraging the JAX library, our toolkit can simulate thousands of agents foraging in a common environment, in an end-to-end vectorized and differentiable manner. The toolkit provides agent-based modelling tools to model various foraging tasks, including options to design custom spatial and temporal agent dynamics, control policies, sensor models, and boundary conditions. Further, the number of agents during such simulations can be increased or decreased based on custom rules. The toolkit can also be used to potentially model more general multi-agent scenarios.

Citations: 0

Enhancing the Performance of Multi-Vehicle Navigation in Unstructured Environments using Hard Sample Mining

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-08 DOI: arxiv-2409.05119

Yining Ma, Ang Li, Qadeer Khan, Daniel Cremers

Contemporary research in autonomous driving has demonstrated tremendous potential in emulating the traits of human driving. However, they primarily cater to areas with well built road infrastructure and appropriate traffic management systems. Therefore, in the absence of traffic signals or in unstructured environments, these self-driving algorithms are expected to fail. This paper proposes a strategy for autonomously navigating multiple vehicles in close proximity to their desired destinations without traffic rules in unstructured environments. Graphical Neural Networks (GNNs) have demonstrated good utility for this task of multi-vehicle control. Among the different alternatives of training GNNs, supervised methods have proven to be most data-efficient, albeit require ground truth labels. However, these labels may not always be available, particularly in unstructured environments without traffic regulations. Therefore, a tedious optimization process may be required to determine them while ensuring that the vehicles reach their desired destination and do not collide with each other or any obstacles. Therefore, in order to expedite the training process, it is essential to reduce the optimization time and select only those samples for labeling that add most value to the training. In this paper, we propose a warm start method that first uses a pre-trained model trained on a simpler subset of data. Inference is then done on more complicated scenarios, to determine the hard samples wherein the model faces the greatest predicament. This is measured by the difficulty vehicles encounter in reaching their desired destination without collision. Experimental results demonstrate that mining for hard samples in this manner reduces the requirement for supervised training data by 10 fold. Videos and code can be found here: https://yininghase.github.io/multiagent-collision-mining/
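
The hard-sample selection step described in this abstract could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the `RolloutResult` fields, the `difficulty` score (remaining distance plus a fixed collision penalty), and the `budget` parameter are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RolloutResult:
    """Outcome of running the warm-start model on one scenario (hypothetical fields)."""
    scenario_id: int
    collided: bool          # did any vehicle collide during the rollout?
    final_distance: float   # mean remaining distance to the destinations


def difficulty(result: RolloutResult, collision_penalty: float = 10.0) -> float:
    """Assumed difficulty score: remaining distance plus a penalty if a collision occurred."""
    return result.final_distance + (collision_penalty if result.collided else 0.0)


def mine_hard_samples(results: List[RolloutResult], budget: int) -> List[int]:
    """Return the ids of the `budget` scenarios the model handled worst."""
    ranked = sorted(results, key=difficulty, reverse=True)
    return [r.scenario_id for r in ranked[:budget]]


results = [
    RolloutResult(0, collided=False, final_distance=0.2),
    RolloutResult(1, collided=True, final_distance=1.5),
    RolloutResult(2, collided=False, final_distance=3.0),
    RolloutResult(3, collided=True, final_distance=0.1),
]
hard_ids = mine_hard_samples(results, budget=2)
print(hard_ids)
```

Only the scenarios returned by `mine_hard_samples` would then be passed to the expensive optimization step that produces ground-truth labels.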

Citations: 0

Towards Multi-agent Policy-based Directed Hypergraph Learning for Traffic Signal Control

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-08 DOI: arxiv-2409.05037

Kang Wang, Zhishu Shen, Zhenwei Wang, Tiehua Zhang

Deep reinforcement learning (DRL) methods that incorporate graph neural networks (GNNs) have been extensively studied for intelligent traffic signal control, which aims to coordinate traffic signals effectively across multiple intersections. Despite this progress, the standard graph learning used in these methods still struggles to capture higher-order correlations in real-world traffic flow. In this paper, we propose a multi-agent proximal policy optimization framework DHG-PPO, which incorporates PPO and directed hypergraph module to extract the spatio-temporal attributes of the road networks. DHG-PPO enables multiple agents to ingeniously interact through the dynamical construction of hypergraph. The effectiveness of DHG-PPO is validated in terms of average travel time and throughput against state-of-the-art baselines through extensive experiments.

Citations: 0

Adaptation Procedure in Misinformation Games

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-07 DOI: arxiv-2409.04854

Konstantinos Varsos, Merkouris Papamichail, Giorgos Flouris, Marina Bitsaki

We study interactions between agents in multi-agent systems, in which the agents are misinformed with regards to the game that they play, essentially having a subjective and incorrect understanding of the setting, without being aware of it. For that, we introduce a new game-theoretic concept, called misinformation games, that provides the necessary toolkit to study this situation. Subsequently, we enhance this framework by developing a time-discrete procedure (called the Adaptation Procedure) that captures iterative interactions in the above context. During the Adaptation Procedure, the agents update their information and reassess their behaviour in each step. We demonstrate our ideas through an implementation, which is used to study the efficiency and characteristics of the Adaptation Procedure.

Citations: 0

PARCO: Learning Parallel Autoregressive Policies for Efficient Multi-Agent Combinatorial Optimization

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-05 DOI: arxiv-2409.03811

Federico Berto, Chuanbo Hua, Laurin Luttmann, Jiwoo Son, Junyoung Park, Kyuree Ahn, Changhyun Kwon, Lin Xie, Jinkyoo Park

Multi-agent combinatorial optimization problems such as routing and scheduling have great practical relevance but present challenges due to their NP-hard combinatorial nature, hard constraints on the number of possible agents, and hard-to-optimize objective functions. This paper introduces PARCO (Parallel AutoRegressive Combinatorial Optimization), a novel approach that learns fast surrogate solvers for multi-agent combinatorial problems with reinforcement learning by employing parallel autoregressive decoding. We propose a model with a Multiple Pointer Mechanism to efficiently decode multiple decisions simultaneously by different agents, enhanced by a Priority-based Conflict Handling scheme. Moreover, we design specialized Communication Layers that enable effective agent collaboration, thus enriching decision-making. We evaluate PARCO in representative multi-agent combinatorial problems in routing and scheduling and demonstrate that our learned solvers offer competitive results against both classical and neural baselines in terms of both solution quality and speed. We make our code openly available at https://github.com/ai4co/parco.
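
To illustrate what priority-based conflict handling can look like when several agents decode a decision in the same step, here is a minimal sketch. It is not PARCO's actual code: the function name `resolve_conflicts`, the use of `None` as a fallback for losing agents, and the first-come tie-break are assumptions made for this example.

```python
from typing import Dict, Optional


def resolve_conflicts(
    choices: Dict[str, int], priority: Dict[str, float]
) -> Dict[str, Optional[int]]:
    """Give each contested target to the highest-priority agent.

    Agents that lose a conflict receive None (assumed here to mean
    "take no new action this step"). Ties go to the agent seen first.
    """
    winner_for_target: Dict[int, str] = {}
    for agent, target in choices.items():
        current = winner_for_target.get(target)
        if current is None or priority[agent] > priority[current]:
            winner_for_target[target] = agent
    return {
        agent: target if winner_for_target[target] == agent else None
        for agent, target in choices.items()
    }


# Agents a1 and a2 both pick node 5 in the same parallel decoding step.
choices = {"a1": 5, "a2": 5, "a3": 7}
priority = {"a1": 0.3, "a2": 0.9, "a3": 0.1}
assigned = resolve_conflicts(choices, priority)
print(assigned)
```

In a learned solver the priority values would typically come from the model (for example, the probability it assigned to each chosen action); here they are hard-coded for clarity.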

Citations: 0

A Survey on Emergent Language

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-04 DOI: arxiv-2409.02645

Jannik Peters, Constantin Waubert de Puiseau, Hasan Tercan, Arya Gopikrishnan, Gustavo Adolpho Lucas De Carvalho, Christian Bitter, Tobias Meisen

The field of emergent language represents a novel area of research within the domain of artificial intelligence, particularly within the context of multi-agent reinforcement learning. Although the concept of studying language emergence is not new, early approaches were primarily concerned with explaining human language formation, with little consideration given to its potential utility for artificial agents. In contrast, studies based on reinforcement learning aim to develop communicative capabilities in agents that are comparable to or even superior to human language. Thus, they extend beyond the learned statistical representations that are common in natural language processing research. This gives rise to a number of fundamental questions, from the prerequisites for language emergence to the criteria for measuring its success. This paper addresses these questions by providing a comprehensive review of 181 scientific publications on emergent language in artificial intelligence. Its objective is to serve as a reference for researchers interested in or proficient in the field. Consequently, the main contributions are the definition and overview of the prevailing terminology, the analysis of existing evaluation methods and metrics, and the description of the identified research gaps.

Citations: 0

An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-04 DOI: arxiv-2409.03052

Christopher Amato

Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. Many approaches have been developed but they can be divided into three main types: centralized training and execution (CTE), centralized training for decentralized execution (CTDE), and decentralized training and execution (DTE). CTDE methods are the most common as they can use centralized information during training but execute in a decentralized manner, using only information available to that agent during execution. CTDE is the only paradigm that requires a separate training phase where any available information (e.g., other agent policies, underlying states) can be used. As a result, they can be more scalable than CTE methods, do not require communication during execution, and can often perform well. CTDE fits most naturally with the cooperative case, but can be potentially applied in competitive or mixed settings depending on what information is assumed to be observed. This text is an introduction to CTDE in cooperative MARL. It is meant to explain the setting, basic concepts, and common methods. It does not cover all work in CTDE MARL as the subarea is quite extensive. I have included work that I believe is important for understanding the main concepts in the subarea and apologize to those that I have omitted.
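
The information split that defines CTDE can be shown with a toy sketch. Everything here is hypothetical and chosen only for illustration: the `Actor` class, its threshold policy, and the `centralized_value` function are not taken from the text being summarized. The point is only which inputs each component is allowed to read.

```python
from typing import List, Sequence


class Actor:
    """Decentralized policy: at execution time it sees only its own observation."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def act(self, local_obs: float) -> int:
        # Toy rule: act (1) when the local observation exceeds the threshold.
        return 1 if local_obs > self.threshold else 0


def centralized_value(global_state: Sequence[float], joint_action: Sequence[int]) -> float:
    """Training-only critic (toy): may read the full state and every agent's action."""
    return sum(s * a for s, a in zip(global_state, joint_action))


actors: List[Actor] = [Actor(0.5), Actor(0.2)]
global_state = [0.7, 0.1]  # in this toy, agent i observes only global_state[i]

# Execution: each actor uses its local observation and nothing else.
joint_action = [actor.act(obs) for actor, obs in zip(actors, global_state)]

# Training: the critic scores the joint action with centralized information.
value = centralized_value(global_state, joint_action)
print(joint_action, value)
```

After training, `centralized_value` would be discarded and only the actors deployed, which is why no communication is needed during execution.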

Citations: 0

Context-Aware Agent-based Model for Smart Long Distance Transport System

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-04 DOI: arxiv-2409.02434

Muhammad Raees, Afzal Ahmed

Long-distance transport plays a vital role in the economic growth of countries. However, there is a lack of systems being developed for monitoring and support of long-route vehicles (LRV). Sustainable and context-aware transport systems with modern technologies are needed. We model for long-distance vehicle transportation monitoring and support systems in a multi-agent environment. Our model incorporates the distance vehicle transport mechanism through agent-based modeling (ABM). This model constitutes the design protocol of ABM called Overview, Design, and Details (ODD). This model constitutes that every category of agents is offering information as a service. Hence, a federation of services through protocol for the communication between sensors and software components is desired. Such integration of services supports monitoring and tracking of vehicles on the route. The model simulations provide useful results for the integration of services based on smart objects.

Citations: 0

From Grounding to Planning: Benchmarking Bottlenecks in Web Agents

arXiv - CS - Multiagent Systems

Pub Date : 2024-09-03 DOI: arxiv-2409.01927

Segev Shlomov, Ben Wiesel, Aviad Sela, Ido Levy, Liane Galanti, Roy Abitbol

General web-based agents are increasingly essential for interacting with complex web environments, yet their performance in real-world web applications remains poor, yielding extremely low accuracy even with state-of-the-art frontier models. We observe that these agents can be decomposed into two primary components: Planning and Grounding. Yet, most existing research treats these agents as black boxes, focusing on end-to-end evaluations which hinder meaningful improvements. We sharpen the distinction between the planning and grounding components and conduct a novel analysis by refining experiments on the Mind2Web dataset. Our work proposes a new benchmark for each of the components separately, identifying the bottlenecks and pain points that limit agent performance. Contrary to prevalent assumptions, our findings suggest that grounding is not a significant bottleneck and can be effectively addressed with current techniques. Instead, the primary challenge lies in the planning component, which is the main source of performance degradation. Through this analysis, we offer new insights and demonstrate practical suggestions for improving the capabilities of web agents, paving the way for more reliable agents.

Citations: 0
