Value-Enriched Population Synthesis: Integrating a Motivational Layer
Alba Aguilera, Miquel Albertí, Nardine Osman, Georgina Curto
arXiv:2408.09407 (2024-08-18)
In recent years, computational improvements have allowed for more nuanced, data-driven and geographically explicit agent-based simulations. So far, however, simulations have struggled to adequately represent the attributes that motivate the actions of the agents. In fact, existing population synthesis frameworks generate agent profiles limited to socio-demographic attributes. In this paper, we introduce a novel value-enriched population synthesis framework that integrates a motivational layer with the traditional individual and household socio-demographic layers. Our research highlights the significance of extending the profile of agents in synthetic populations by incorporating data on values, ideologies, opinions and vital priorities, which motivate the agents' behaviour. This motivational layer can help us develop a more nuanced decision-making mechanism for the agents in social simulation settings. Our methodology integrates microdata and macrodata within different Bayesian network structures. This contribution makes it possible to generate synthetic populations with integrated value systems that preserve the inherent socio-demographic distributions of the real population in any specific region.
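To make the idea concrete, here is a minimal pure-Python sketch of the core mechanism: fit the conditional probability tables of a (here hand-fixed) Bayesian network on microdata, then forward-sample synthetic agents whose socio-demographic and motivational attributes jointly follow the fitted distributions. The attributes, network structure, and data are invented for illustration; the paper's framework learns richer structures from real micro- and macrodata.

```python
import random
from collections import defaultdict

# Hypothetical microdata: each record is (age_group, education, ideology).
microdata = [
    ("young", "higher", "progressive"),
    ("young", "secondary", "progressive"),
    ("adult", "higher", "moderate"),
    ("adult", "secondary", "conservative"),
    ("senior", "secondary", "conservative"),
] * 200  # repeated to mimic a larger survey

# Fit conditional probability tables for a fixed chain structure:
# age_group -> education -> ideology (a stand-in for the learned network).
def fit_cpt(records, parent_idx, child_idx):
    counts = defaultdict(lambda: defaultdict(int))
    for rec in records:
        counts[rec[parent_idx]][rec[child_idx]] += 1
    return {p: {c: n / sum(ch.values()) for c, n in ch.items()}
            for p, ch in counts.items()}

age_counts = defaultdict(int)
for rec in microdata:
    age_counts[rec[0]] += 1
total = sum(age_counts.values())
p_age = {k: v / total for k, v in age_counts.items()}

p_edu_given_age = fit_cpt(microdata, 0, 1)
p_ideo_given_edu = fit_cpt(microdata, 1, 2)

def draw(dist):
    return random.choices(list(dist), weights=dist.values())[0]

# Forward-sample a synthetic agent: socio-demographics plus a motivational
# attribute, preserving the joint distribution encoded by the network.
def sample_agent():
    age = draw(p_age)
    edu = draw(p_edu_given_age[age])
    ideo = draw(p_ideo_given_edu[edu])
    return {"age_group": age, "education": edu, "ideology": ideo}

synthetic_population = [sample_agent() for _ in range(10_000)]
```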
{"title":"Value-Enriched Population Synthesis: Integrating a Motivational Layer","authors":"Alba Aguilera, Miquel Albertí, Nardine Osman, Georgina Curto","doi":"arxiv-2408.09407","DOIUrl":"https://doi.org/arxiv-2408.09407","url":null,"abstract":"In recent years, computational improvements have allowed for more nuanced,\u0000data-driven and geographically explicit agent-based simulations. So far,\u0000simulations have struggled to adequately represent the attributes that motivate\u0000the actions of the agents. In fact, existing population synthesis frameworks\u0000generate agent profiles limited to socio-demographic attributes. In this paper,\u0000we introduce a novel value-enriched population synthesis framework that\u0000integrates a motivational layer with the traditional individual and household\u0000socio-demographic layers. Our research highlights the significance of extending\u0000the profile of agents in synthetic populations by incorporating data on values,\u0000ideologies, opinions and vital priorities, which motivate the agents'\u0000behaviour. This motivational layer can help us develop a more nuanced\u0000decision-making mechanism for the agents in social simulation settings. Our\u0000methodology integrates microdata and macrodata within different Bayesian\u0000network structures. This contribution allows to generate synthetic populations\u0000with integrated value systems that preserve the inherent socio-demographic\u0000distributions of the real population in any specific region.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"79 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Beyond Local Views: Global State Inference with Diffusion Models for Cooperative Multi-Agent Reinforcement Learning
Zhiwei Xu, Hangyu Mao, Nianmin Zhang, Xin Xin, Pengjie Ren, Dapeng Li, Bin Zhang, Guoliang Fan, Zhumin Chen, Changwei Wang, Jiangjin Yin
arXiv:2408.09501 (2024-08-18)
In partially observable multi-agent systems, agents typically only have access to local observations. This severely hinders their ability to make precise decisions, particularly during decentralized execution. To alleviate this problem and inspired by image outpainting, we propose State Inference with Diffusion Models (SIDIFF), which uses diffusion models to reconstruct the original global state based solely on local observations. SIDIFF consists of a state generator and a state extractor, which allow agents to choose suitable actions by considering both the reconstructed global state and local observations. In addition, SIDIFF can be effortlessly incorporated into current multi-agent reinforcement learning algorithms to improve their performance. Finally, we evaluated SIDIFF on different experimental platforms, including Multi-Agent Battle City (MABC), a novel and flexible multi-agent reinforcement learning environment we developed. SIDIFF achieved desirable results and outperformed other popular algorithms.
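The reconstruction step can be pictured as a conditional denoising diffusion process. The PyTorch sketch below shows one plausible shape of such a state generator: a denoiser conditioned on the local observation, run through a simplified DDPM reverse loop. All dimensions, the network, and the noise schedule are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the paper's actual architecture differs.
OBS_DIM, STATE_DIM, T = 32, 128, 50

class ConditionalDenoiser(nn.Module):
    """Predicts the noise added to the global state, conditioned on the
    agent's local observation (the 'outpainting' analogy: local view in,
    full state out)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + OBS_DIM + 1, 256), nn.ReLU(),
            nn.Linear(256, STATE_DIM),
        )

    def forward(self, noisy_state, obs, t):
        t_feat = t.float().unsqueeze(-1) / T  # scalar timestep feature
        return self.net(torch.cat([noisy_state, obs, t_feat], dim=-1))

betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def reconstruct_state(model, obs):
    """Simplified DDPM reverse process: start from noise and iteratively
    denoise toward a global-state estimate consistent with `obs`."""
    x = torch.randn(obs.shape[0], STATE_DIM)
    for t in reversed(range(T)):
        t_batch = torch.full((obs.shape[0],), t)
        eps = model(x, obs, t_batch)
        x = (x - betas[t] / torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:
            x += torch.sqrt(betas[t]) * torch.randn_like(x)
    return x
```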
{"title":"Beyond Local Views: Global State Inference with Diffusion Models for Cooperative Multi-Agent Reinforcement Learning","authors":"Zhiwei Xu, Hangyu Mao, Nianmin Zhang, Xin Xin, Pengjie Ren, Dapeng Li, Bin Zhang, Guoliang Fan, Zhumin Chen, Changwei Wang, Jiangjin Yin","doi":"arxiv-2408.09501","DOIUrl":"https://doi.org/arxiv-2408.09501","url":null,"abstract":"In partially observable multi-agent systems, agents typically only have\u0000access to local observations. This severely hinders their ability to make\u0000precise decisions, particularly during decentralized execution. To alleviate\u0000this problem and inspired by image outpainting, we propose State Inference with\u0000Diffusion Models (SIDIFF), which uses diffusion models to reconstruct the\u0000original global state based solely on local observations. SIDIFF consists of a\u0000state generator and a state extractor, which allow agents to choose suitable\u0000actions by considering both the reconstructed global state and local\u0000observations. In addition, SIDIFF can be effortlessly incorporated into current\u0000multi-agent reinforcement learning algorithms to improve their performance.\u0000Finally, we evaluated SIDIFF on different experimental platforms, including\u0000Multi-Agent Battle City (MABC), a novel and flexible multi-agent reinforcement\u0000learning environment we developed. SIDIFF achieved desirable results and\u0000outperformed other popular algorithms.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"113 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joint-perturbation simultaneous pseudo-gradient
Carlos Martin, Tuomas Sandholm
arXiv:2408.09306 (2024-08-17)
We study the problem of computing an approximate Nash equilibrium of a game with a continuous strategy space, without access to gradients of the utility function. Such games arise, for example, when players' strategies are represented by the parameters of a neural network. Lack of access to gradients is common in reinforcement learning settings, where the environment is treated as a black box, as well as in equilibrium finding for mechanisms such as auctions, where the mechanism's payoffs are discontinuous in the players' actions. To tackle this problem, we turn to zeroth-order optimization techniques that combine pseudo-gradients with equilibrium-finding dynamics. Specifically, we introduce a new technique that requires a number of utility function evaluations per iteration that is constant rather than linear in the number of players. It achieves this by performing a single joint perturbation on all players' strategies, rather than perturbing each one individually. This yields a dramatic improvement for many-player games, especially when the utility function is expensive to compute in terms of wall time, memory, money, or other resources. We evaluate our approach on various games, including auctions, which have important real-world applications. Our approach yields a significant reduction in the run time required to reach an approximate Nash equilibrium.
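A minimal sketch of the joint-perturbation idea illustrates why the evaluation count is constant in the number of players: every sample perturbs all strategies at once and costs two utility-function calls in total, regardless of the player count. The function names and the two-point Gaussian-smoothing form of the estimator are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def joint_perturbation_pseudogradient(utilities, strategies, delta=1e-2, samples=32):
    """Estimate every player's pseudo-gradient with two utility-function
    evaluations per sample, independent of the number of players.

    utilities(strats) -> array of shape (N,): one payoff per player.
    strategies: list of N parameter vectors (possibly different sizes).
    """
    grads = [np.zeros_like(s) for s in strategies]
    for _ in range(samples):
        # One joint perturbation direction across all players' parameters.
        u = [np.random.randn(*s.shape) for s in strategies]
        up = utilities([s + delta * d for s, d in zip(strategies, u)])
        dn = utilities([s - delta * d for s, d in zip(strategies, u)])
        for i, d in enumerate(u):
            # Player i's payoff difference, projected onto their slice of
            # the joint direction (two-point SPSA-style estimator).
            grads[i] += (up[i] - dn[i]) / (2 * delta) * d
    return [g / samples for g in grads]

# Equilibrium-finding dynamics then use the estimates, e.g. simultaneous
# ascent x_i <- x_i + lr * grad_i (the paper pairs the estimator with
# more sophisticated dynamics).
```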
{"title":"Joint-perturbation simultaneous pseudo-gradient","authors":"Carlos Martin, Tuomas Sandholm","doi":"arxiv-2408.09306","DOIUrl":"https://doi.org/arxiv-2408.09306","url":null,"abstract":"We study the problem of computing an approximate Nash equilibrium of a game\u0000whose strategy space is continuous without access to gradients of the utility\u0000function. Such games arise, for example, when players' strategies are\u0000represented by the parameters of a neural network. Lack of access to gradients\u0000is common in reinforcement learning settings, where the environment is treated\u0000as a black box, as well as equilibrium finding in mechanisms such as auctions,\u0000where the mechanism's payoffs are discontinuous in the players' actions. To\u0000tackle this problem, we turn to zeroth-order optimization techniques that\u0000combine pseudo-gradients with equilibrium-finding dynamics. Specifically, we\u0000introduce a new technique that requires a number of utility function\u0000evaluations per iteration that is constant rather than linear in the number of\u0000players. It achieves this by performing a single joint perturbation on all\u0000players' strategies, rather than perturbing each one individually. This yields\u0000a dramatic improvement for many-player games, especially when the utility\u0000function is expensive to compute in terms of wall time, memory, money, or other\u0000resources. We evaluate our approach on various games, including auctions, which\u0000have important real-world applications. Our approach yields a significant\u0000reduction in the run time required to reach an approximate Nash equilibrium.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"24 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multilevel Graph Reinforcement Learning for Consistent Cognitive Decision-making in Heterogeneous Mixed Autonomy
Xin Gao, Zhaoyang Ma, Xueyuan Li, Xiaoqiang Meng, Zirui Li
arXiv:2408.08516 (2024-08-16)
In the realm of heterogeneous mixed autonomy, vehicles experience dynamic spatial correlations and nonlinear temporal interactions in a complex, non-Euclidean space. These complexities pose significant challenges to traditional decision-making frameworks. Addressing this, we propose a hierarchical reinforcement learning framework integrated with multilevel graph representations, which effectively comprehends and models the spatiotemporal interactions among vehicles navigating through uncertain traffic conditions with varying decision-making systems. Rooted in multilevel graph representation theory, our approach encapsulates spatiotemporal relationships inherent in non-Euclidean spaces. A weighted graph represents spatiotemporal features between nodes, addressing the degree imbalance inherent in dynamic graphs. We integrate asynchronous parallel hierarchical reinforcement learning with a multilevel graph representation and a multi-head attention mechanism, which enables connected autonomous vehicles (CAVs) to exhibit capabilities akin to human cognition, facilitating consistent decision-making across various critical dimensions. The proposed decision-making strategy is validated in challenging environments characterized by high density, randomness, and dynamism on highway roads. We assess the performance of our framework through ablation studies, comparative analyses, and spatiotemporal trajectory evaluations. This study presents a quantitative analysis of decision-making mechanisms mirroring human cognitive functions in the realm of heterogeneous mixed autonomy, promoting the development of multi-dimensional decision-making strategies and a sophisticated distribution of attentional resources.
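As an illustration of one ingredient, the snippet below builds a distance-based vehicle graph and applies multi-head attention restricted to connected vehicles, using PyTorch's nn.MultiheadAttention. The vehicle count, feature size, and 40 m connectivity radius are invented; the paper's multilevel graph representation is considerably richer.

```python
import torch
import torch.nn as nn

# Hypothetical setup: 8 vehicles, each with a 16-d feature vector
# aggregated from the multilevel graph representation.
num_vehicles, feat_dim, num_heads = 8, 16, 4

attn = nn.MultiheadAttention(embed_dim=feat_dim, num_heads=num_heads,
                             batch_first=True)

node_feats = torch.randn(1, num_vehicles, feat_dim)  # (batch, nodes, feat)

# Dynamic graph from spatial proximity: connect vehicles within 40 m.
pos = torch.rand(num_vehicles, 2) * 100.0  # hypothetical x-y coordinates (m)
adj = torch.cdist(pos, pos) < 40.0         # includes self-connections

# Boolean attention mask: True blocks attention between unconnected vehicles.
attn_mask = ~adj

fused, weights = attn(node_feats, node_feats, node_feats,
                      attn_mask=attn_mask)
# `fused` holds interaction-aware vehicle features; `weights` shows how
# attentional resources are distributed over neighboring vehicles.
```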
{"title":"Multilevel Graph Reinforcement Learning for Consistent Cognitive Decision-making in Heterogeneous Mixed Autonomy","authors":"Xin Gao, Zhaoyang Ma, Xueyuan Li, Xiaoqiang Meng, Zirui Li","doi":"arxiv-2408.08516","DOIUrl":"https://doi.org/arxiv-2408.08516","url":null,"abstract":"In the realm of heterogeneous mixed autonomy, vehicles experience dynamic\u0000spatial correlations and nonlinear temporal interactions in a complex,\u0000non-Euclidean space. These complexities pose significant challenges to\u0000traditional decision-making frameworks. Addressing this, we propose a\u0000hierarchical reinforcement learning framework integrated with multilevel graph\u0000representations, which effectively comprehends and models the spatiotemporal\u0000interactions among vehicles navigating through uncertain traffic conditions\u0000with varying decision-making systems. Rooted in multilevel graph representation\u0000theory, our approach encapsulates spatiotemporal relationships inherent in\u0000non-Euclidean spaces. A weighted graph represents spatiotemporal features\u0000between nodes, addressing the degree imbalance inherent in dynamic graphs. We\u0000integrate asynchronous parallel hierarchical reinforcement learning with a\u0000multilevel graph representation and a multi-head attention mechanism, which\u0000enables connected autonomous vehicles (CAVs) to exhibit capabilities akin to\u0000human cognition, facilitating consistent decision-making across various\u0000critical dimensions. The proposed decision-making strategy is validated in\u0000challenging environments characterized by high density, randomness, and\u0000dynamism on highway roads. We assess the performance of our framework through\u0000ablation studies, comparative analyses, and spatiotemporal trajectory\u0000evaluations. This study presents a quantitative analysis of decision-making\u0000mechanisms mirroring human cognitive functions in the realm of heterogeneous\u0000mixed autonomy, promoting the development of multi-dimensional decision-making\u0000strategies and a sophisticated distribution of attentional resources.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"38 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ASGM-KG: Unveiling Alluvial Gold Mining Through Knowledge Graphs
Debashis Gupta, Aditi Golder, Luis Fernendez, Miles Silman, Greg Lersen, Fan Yang, Bob Plemmons, Sarra Alqahtani, Paul Victor Pauca
arXiv:2408.08972 (2024-08-16)
Artisanal and Small-Scale Gold Mining (ASGM) is a low-cost yet highly destructive mining practice, leading to environmental disasters across the world's tropical watersheds. The topic of ASGM spans multiple domains of research and information, including natural and social systems, and knowledge is often atomized across a diversity of media and documents. We therefore introduce a knowledge graph (ASGM-KG) that consolidates and provides crucial information about ASGM practices and their environmental effects. The current version of ASGM-KG consists of 1,899 triples extracted using a large language model (LLM) from documents and reports published by both non-governmental and governmental organizations. These documents were carefully selected by a group of tropical ecologists with expertise in ASGM. The knowledge graph was validated using two methods. First, a small team of ASGM experts reviewed and labeled triples as factual or non-factual. Second, we devised and applied an automated factual reduction framework that relies on a search engine and an LLM to label triples. Our framework performs as well as five baselines on a publicly available knowledge graph and achieves over 90% accuracy on ASGM-KG, as validated by domain experts. ASGM-KG demonstrates an advancement in knowledge aggregation and representation for complex, interdisciplinary environmental crises such as ASGM.
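The following sketch shows one plausible shape of the data structures and validation loop described above: triples as simple records, and a checker that retrieves evidence via a search engine and asks an LLM to judge factuality. The example triples and the search/judge interfaces are hypothetical stand-ins, not the actual ASGM-KG content or pipeline.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str

# A few illustrative triples in the spirit of ASGM-KG (invented examples,
# not entries from the actual graph).
kg = {
    Triple("ASGM", "releases", "mercury"),
    Triple("mercury", "contaminates", "tropical_watersheds"),
    Triple("ASGM", "occurs_in", "Madre_de_Dios"),
}

def validate(triple, search, judge):
    """Sketch of the automated validation loop: retrieve evidence with a
    search engine, then ask an LLM to label the triple as factual or not.
    `search` and `judge` are caller-supplied stand-ins for real services."""
    query = f"{triple.subject} {triple.predicate} {triple.obj}"
    evidence = search(query)            # -> list of text snippets
    verdict = judge(triple, evidence)   # -> "factual" / "non-factual"
    return verdict == "factual"
```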
{"title":"ASGM-KG: Unveiling Alluvial Gold Mining Through Knowledge Graphs","authors":"Debashis Gupta, Aditi Golder, Luis Fernendez, Miles Silman, Greg Lersen, Fan Yang, Bob Plemmons, Sarra Alqahtani, Paul Victor Pauca","doi":"arxiv-2408.08972","DOIUrl":"https://doi.org/arxiv-2408.08972","url":null,"abstract":"Artisanal and Small-Scale Gold Mining (ASGM) is a low-cost yet highly\u0000destructive mining practice, leading to environmental disasters across the\u0000world's tropical watersheds. The topic of ASGM spans multiple domains of\u0000research and information, including natural and social systems, and knowledge\u0000is often atomized across a diversity of media and documents. We therefore\u0000introduce a knowledge graph (ASGM-KG) that consolidates and provides crucial\u0000information about ASGM practices and their environmental effects. The current\u0000version of ASGM-KG consists of 1,899 triples extracted using a large language\u0000model (LLM) from documents and reports published by both non-governmental and\u0000governmental organizations. These documents were carefully selected by a group\u0000of tropical ecologists with expertise in ASGM. This knowledge graph was\u0000validated using two methods. First, a small team of ASGM experts reviewed and\u0000labeled triples as factual or non-factual. Second, we devised and applied an\u0000automated factual reduction framework that relies on a search engine and an LLM\u0000for labeling triples. Our framework performs as well as five baselines on a\u0000publicly available knowledge graph and achieves over 90 accuracy on our ASGM-KG\u0000validated by domain experts. ASGM-KG demonstrates an advancement in knowledge\u0000aggregation and representation for complex, interdisciplinary environmental\u0000crises such as ASGM.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"10 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AgentSimulator: An Agent-based Approach for Data-driven Business Process Simulation
Lukas Kirchdorfer, Robert Blümel, Timotheus Kampik, Han van der Aa, Heiner Stuckenschmidt
arXiv:2408.08571 (2024-08-16)
Business process simulation (BPS) is a versatile technique for estimating process performance across various scenarios. Traditionally, BPS approaches employ a control-flow-first perspective by enriching a process model with simulation parameters. Although such approaches can mimic the behavior of centrally orchestrated processes, such as those supported by workflow systems, current control-flow-first approaches cannot faithfully capture the dynamics of real-world processes that involve distinct resource behavior and decentralized decision-making. Recognizing this issue, this paper introduces AgentSimulator, a resource-first BPS approach that discovers a multi-agent system from an event log, modeling distinct resource behaviors and interaction patterns to simulate the underlying process. Our experiments show that AgentSimulator achieves state-of-the-art simulation accuracy with significantly lower computation times than existing approaches while providing high interpretability and adaptability to different types of process-execution scenarios.
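A resource-first discovery step might look like the pandas sketch below: derive each resource's activity profile and the handover-of-work pattern between resources from a standard event log. The log columns and toy data are illustrative; AgentSimulator's actual discovery procedure is more involved.

```python
import pandas as pd

# Hypothetical event log with the usual case/activity/resource/timestamp
# columns from process mining.
log = pd.DataFrame({
    "case_id":   [1, 1, 1, 2, 2],
    "activity":  ["register", "check", "approve", "register", "approve"],
    "resource":  ["alice", "bob", "carol", "alice", "carol"],
    "timestamp": pd.to_datetime([
        "2024-01-01 09:00", "2024-01-01 10:00", "2024-01-01 11:30",
        "2024-01-02 09:15", "2024-01-02 12:00"]),
})

log = log.sort_values(["case_id", "timestamp"])

# Each resource becomes an agent; its activity distribution is one
# ingredient of the agent's behavior model.
activity_profile = log.groupby("resource")["activity"].value_counts(normalize=True)

# Interaction patterns: who hands work over to whom within a case.
log["next_resource"] = log.groupby("case_id")["resource"].shift(-1)
handovers = (log.dropna(subset=["next_resource"])
                .groupby(["resource", "next_resource"]).size())
```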
{"title":"AgentSimulator: An Agent-based Approach for Data-driven Business Process Simulation","authors":"Lukas Kirchdorfer, Robert Blümel, Timotheus Kampik, Han van der Aa, Heiner Stuckenschmidt","doi":"arxiv-2408.08571","DOIUrl":"https://doi.org/arxiv-2408.08571","url":null,"abstract":"Business process simulation (BPS) is a versatile technique for estimating\u0000process performance across various scenarios. Traditionally, BPS approaches\u0000employ a control-flow-first perspective by enriching a process model with\u0000simulation parameters. Although such approaches can mimic the behavior of\u0000centrally orchestrated processes, such as those supported by workflow systems,\u0000current control-flow-first approaches cannot faithfully capture the dynamics of\u0000real-world processes that involve distinct resource behavior and decentralized\u0000decision-making. Recognizing this issue, this paper introduces AgentSimulator,\u0000a resource-first BPS approach that discovers a multi-agent system from an event\u0000log, modeling distinct resource behaviors and interaction patterns to simulate\u0000the underlying process. Our experiments show that AgentSimulator achieves\u0000state-of-the-art simulation accuracy with significantly lower computation times\u0000than existing approaches while providing high interpretability and adaptability\u0000to different types of process-execution scenarios.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"34 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Independent Policy Mirror Descent for Markov Potential Games: Scaling to Large Number of Players
Pragnya Alatur, Anas Barakat, Niao He
arXiv:2408.08075 (2024-08-15)
Markov Potential Games (MPGs) form an important sub-class of Markov games, which are a common framework for modeling multi-agent reinforcement learning problems. In particular, MPGs include as a special case the identical-interest setting where all agents share the same reward function. Scaling the performance of Nash equilibrium learning algorithms to a large number of agents is crucial for multi-agent systems. To address this important challenge, we focus on the independent learning setting, where agents only have access to their local information when updating their own policy. In prior work on MPGs, the iteration complexity for obtaining $\epsilon$-Nash regret scales linearly with the number of agents $N$. In this work, we investigate the iteration complexity of an independent policy mirror descent (PMD) algorithm for MPGs. We show that PMD with KL regularization, also known as natural policy gradient, enjoys a better $\sqrt{N}$ dependence on the number of agents, improving over PMD with Euclidean regularization and prior work. Furthermore, the iteration complexity is also independent of the sizes of the agents' action spaces.
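For intuition, one KL-regularized PMD step in the tabular softmax case reduces to the multiplicative-weights update pi'(a|s) proportional to pi(a|s) exp(eta * Q_i(s,a)), which each agent can apply independently from its own local Q-estimates. A minimal NumPy sketch of that update, as a stand-in for the paper's algorithm:

```python
import numpy as np

def pmd_kl_update(policy, q_values, eta=0.1):
    """One KL-regularized policy mirror descent step for a single agent:
    pi'(a|s) ∝ pi(a|s) * exp(eta * Q_i(s,a)). Under independent learning,
    each agent runs this using only its own local Q-estimates.

    policy:   (S, A) array of per-state action probabilities.
    q_values: (S, A) array of the agent's Q-estimates.
    """
    logits = np.log(policy) + eta * q_values
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    new_policy = np.exp(logits)
    return new_policy / new_policy.sum(axis=1, keepdims=True)
```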
{"title":"Independent Policy Mirror Descent for Markov Potential Games: Scaling to Large Number of Players","authors":"Pragnya Alatur, Anas Barakat, Niao He","doi":"arxiv-2408.08075","DOIUrl":"https://doi.org/arxiv-2408.08075","url":null,"abstract":"Markov Potential Games (MPGs) form an important sub-class of Markov games,\u0000which are a common framework to model multi-agent reinforcement learning\u0000problems. In particular, MPGs include as a special case the identical-interest\u0000setting where all the agents share the same reward function. Scaling the\u0000performance of Nash equilibrium learning algorithms to a large number of agents\u0000is crucial for multi-agent systems. To address this important challenge, we\u0000focus on the independent learning setting where agents can only have access to\u0000their local information to update their own policy. In prior work on MPGs, the\u0000iteration complexity for obtaining $epsilon$-Nash regret scales linearly with\u0000the number of agents $N$. In this work, we investigate the iteration complexity\u0000of an independent policy mirror descent (PMD) algorithm for MPGs. We show that\u0000PMD with KL regularization, also known as natural policy gradient, enjoys a\u0000better $sqrt{N}$ dependence on the number of agents, improving over PMD with\u0000Euclidean regularization and prior work. Furthermore, the iteration complexity\u0000is also independent of the sizes of the agents' action spaces.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"36 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EmBARDiment: an Embodied AI Agent for Productivity in XR
Riccardo Bovo, Steven Abreu, Karan Ahuja, Eric J Gonzalez, Li-Te Cheng, Mar Gonzalez-Franco
arXiv:2408.08158 (2024-08-15)
XR devices running chat-bots powered by Large Language Models (LLMs) have tremendous potential as always-on agents that can enable much better productivity scenarios. However, screen-based chat-bots do not take advantage of the full suite of natural inputs available in XR, including inward-facing sensor data; instead, they over-rely on explicit voice or text prompts, sometimes paired with multi-modal data included as part of the query. We propose a solution that leverages an attention framework to derive context implicitly from user actions, eye-gaze, and contextual memory within the XR environment. This minimizes the need for engineered explicit prompts, fostering grounded and intuitive interactions that glean user insights for the chat-bot. Our user studies demonstrate the immediate feasibility and transformative potential of our approach to streamline user interaction in XR with chat-bots, while offering insights for the design of future XR-embodied LLM agents.
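One way to realize such implicit context is sketched below: a short contextual memory of gaze targets that is folded into the chat-bot prompt, so the user's attention grounds the query without explicit prompting. All class names, window sizes, and the prompt format are invented for illustration and are not the paper's implementation.

```python
from collections import deque
from dataclasses import dataclass
import time

@dataclass
class GazeEvent:
    target: str      # label of the XR object the user looked at
    timestamp: float

class ContextualMemory:
    """Minimal sketch: keep a short window of gaze targets and fold them
    into the chat-bot prompt, so the user never has to spell out what
    'this' or 'that' refers to."""
    def __init__(self, window_s=30.0, maxlen=20):
        self.window_s = window_s
        self.events = deque(maxlen=maxlen)

    def record(self, target):
        self.events.append(GazeEvent(target, time.time()))

    def implicit_context(self):
        cutoff = time.time() - self.window_s
        recent = [e.target for e in self.events if e.timestamp >= cutoff]
        return "Recently attended objects: " + ", ".join(recent)

    def build_prompt(self, user_utterance):
        # The implicit context replaces much of the engineered prompt.
        return f"{self.implicit_context()}\nUser: {user_utterance}"
```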
{"title":"EmBARDiment: an Embodied AI Agent for Productivity in XR","authors":"Riccardo Bovo, Steven Abreu, Karan Ahuja, Eric J Gonzalez, Li-Te Cheng, Mar Gonzalez-Franco","doi":"arxiv-2408.08158","DOIUrl":"https://doi.org/arxiv-2408.08158","url":null,"abstract":"XR devices running chat-bots powered by Large Language Models (LLMs) have\u0000tremendous potential as always-on agents that can enable much better\u0000productivity scenarios. However, screen based chat-bots do not take advantage\u0000of the the full-suite of natural inputs available in XR, including inward\u0000facing sensor data, instead they over-rely on explicit voice or text prompts,\u0000sometimes paired with multi-modal data dropped as part of the query. We propose\u0000a solution that leverages an attention framework that derives context\u0000implicitly from user actions, eye-gaze, and contextual memory within the XR\u0000environment. This minimizes the need for engineered explicit prompts, fostering\u0000grounded and intuitive interactions that glean user insights for the chat-bot.\u0000Our user studies demonstrate the imminent feasibility and transformative\u0000potential of our approach to streamline user interaction in XR with chat-bots,\u0000while offering insights for the design of future XR-embodied LLM agents.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"43 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bridging Training and Execution via Dynamic Directed Graph-Based Communication in Cooperative Multi-Agent Systems
Zhuohui Zhang, Bin He, Bin Cheng, Gang Li
arXiv:2408.07397 (2024-08-14)
Multi-agent systems must learn to communicate and to understand interactions between agents in order to achieve cooperative goals in partially observed tasks. However, existing approaches lack a dynamic directed communication mechanism and rely on global states, thus diminishing the role of communication in centralized training. We therefore propose the transformer-based graph coarsening network (TGCNet), a novel multi-agent reinforcement learning (MARL) algorithm. TGCNet learns the topological structure of a dynamic directed graph to represent the communication policy and integrates graph coarsening networks to approximate the representation of the global state during training. It also utilizes a transformer decoder for feature extraction during execution. Experiments on multiple cooperative MARL benchmarks demonstrate state-of-the-art performance compared to popular MARL algorithms. Further ablation studies validate the effectiveness of our dynamic directed graph communication mechanism and graph coarsening networks.
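For the graph-coarsening component, a DiffPool-style soft pooling step (used here as a generic stand-in, since the abstract does not specify TGCNet's exact coarsening operator) can be written in a few lines of PyTorch:

```python
import torch

def coarsen(node_feats, adj, assign):
    """One graph-coarsening step: soft-assign N nodes to K clusters,
    then pool both features and adjacency onto the coarse graph.

    node_feats: (N, F) node features
    adj:        (N, N) adjacency matrix
    assign:     (N, K) soft cluster assignments (rows sum to 1)
    """
    coarse_feats = assign.T @ node_feats   # (K, F) pooled features
    coarse_adj = assign.T @ adj @ assign   # (K, K) pooled connectivity
    return coarse_feats, coarse_adj

# Toy usage with random data.
N, F, K = 6, 8, 2
x = torch.randn(N, F)
a = (torch.rand(N, N) < 0.5).float()
s = torch.softmax(torch.randn(N, K), dim=1)
cx, ca = coarsen(x, a, s)
```

Stacking such steps compresses agent-local information toward an approximate global-state representation, which is the role the coarsening networks play during centralized training.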
{"title":"Bridging Training and Execution via Dynamic Directed Graph-Based Communication in Cooperative Multi-Agent Systems","authors":"Zhuohui Zhang, Bin He, Bin Cheng, Gang Li","doi":"arxiv-2408.07397","DOIUrl":"https://doi.org/arxiv-2408.07397","url":null,"abstract":"Multi-agent systems must learn to communicate and understand interactions\u0000between agents to achieve cooperative goals in partially observed tasks.\u0000However, existing approaches lack a dynamic directed communication mechanism\u0000and rely on global states, thus diminishing the role of communication in\u0000centralized training. Thus, we propose the transformer-based graph coarsening\u0000network (TGCNet), a novel multi-agent reinforcement learning (MARL) algorithm.\u0000TGCNet learns the topological structure of a dynamic directed graph to\u0000represent the communication policy and integrates graph coarsening networks to\u0000approximate the representation of global state during training. It also\u0000utilizes the transformer decoder for feature extraction during execution.\u0000Experiments on multiple cooperative MARL benchmarks demonstrate\u0000state-of-the-art performance compared to popular MARL algorithms. Further\u0000ablation studies validate the effectiveness of our dynamic directed graph\u0000communication mechanism and graph coarsening networks.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Nested Graph Reinforcement Learning-based Decision-making Strategy for Eco-platooning
Xin Gao, Xueyuan Li, Hao Liu, Ao Li, Zhaoyang Ma, Zirui Li
arXiv:2408.07578 (2024-08-14)
Platooning technology is renowned for its precise vehicle control, traffic flow optimization, and energy efficiency enhancement. However, in large-scale mixed platoons, vehicle heterogeneity and unpredictable traffic conditions lead to virtual bottlenecks, which reduce traffic throughput and increase energy consumption within the platoon. To address these challenges, we introduce a decision-making strategy based on nested graph reinforcement learning. This strategy improves collaborative decision-making, ensuring energy efficiency and alleviating congestion. We propose a theory of nested traffic graph representation that maps dynamic interactions between vehicles and platoons in non-Euclidean spaces. By incorporating a spatio-temporal weighted graph into a multi-head attention mechanism, we further enhance the model's capacity to process both local and global data. Additionally, we have developed a nested graph reinforcement learning framework to enhance the self-iterative learning capabilities of platooning. Using the I-24 dataset, we designed and conducted comparative algorithm experiments, generalizability testing, and penetration-rate ablation experiments, thereby validating the proposed strategy's effectiveness. Compared to the baseline, our strategy increases throughput by 10% and decreases energy use by 9%. Notably, increasing the penetration rate of CAVs significantly enhances traffic throughput, though it also increases energy consumption.
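As an illustration of a spatio-temporal weighted graph over vehicles, the sketch below assigns edge weights that decay with both spatial gap and speed difference, so tightly coupled vehicles (close together and moving alike) interact strongly. The Gaussian kernels and scale parameters are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def weighted_adjacency(positions, speeds, sigma_d=25.0, sigma_v=5.0):
    """Edge weight between vehicles i and j combines a spatial kernel
    (longitudinal gap) with a temporal-dynamics kernel (speed difference).

    positions, speeds: (N,) arrays of position (m) and speed (m/s).
    """
    dpos = np.abs(positions[:, None] - positions[None, :])
    dvel = np.abs(speeds[:, None] - speeds[None, :])
    w = np.exp(-(dpos / sigma_d) ** 2) * np.exp(-(dvel / sigma_v) ** 2)
    np.fill_diagonal(w, 0.0)  # no self-loops
    return w

# Example: five vehicles on a highway segment; the resulting weights could
# feed a multi-head attention mechanism as in the proposed strategy.
pos = np.array([0.0, 30.0, 55.0, 200.0, 240.0])
vel = np.array([28.0, 27.5, 29.0, 22.0, 22.5])
A = weighted_adjacency(pos, vel)
```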
{"title":"A Nested Graph Reinforcement Learning-based Decision-making Strategy for Eco-platooning","authors":"Xin Gao, Xueyuan Li, Hao Liu, Ao Li, Zhaoyang Ma, Zirui Li","doi":"arxiv-2408.07578","DOIUrl":"https://doi.org/arxiv-2408.07578","url":null,"abstract":"Platooning technology is renowned for its precise vehicle control, traffic\u0000flow optimization, and energy efficiency enhancement. However, in large-scale\u0000mixed platoons, vehicle heterogeneity and unpredictable traffic conditions lead\u0000to virtual bottlenecks. These bottlenecks result in reduced traffic throughput\u0000and increased energy consumption within the platoon. To address these\u0000challenges, we introduce a decision-making strategy based on nested graph\u0000reinforcement learning. This strategy improves collaborative decision-making,\u0000ensuring energy efficiency and alleviating congestion. We propose a theory of\u0000nested traffic graph representation that maps dynamic interactions between\u0000vehicles and platoons in non-Euclidean spaces. By incorporating spatio-temporal\u0000weighted graph into a multi-head attention mechanism, we further enhance the\u0000model's capacity to process both local and global data. Additionally, we have\u0000developed a nested graph reinforcement learning framework to enhance the\u0000self-iterative learning capabilities of platooning. Using the I-24 dataset, we\u0000designed and conducted comparative algorithm experiments, generalizability\u0000testing, and permeability ablation experiments, thereby validating the proposed\u0000strategy's effectiveness. Compared to the baseline, our strategy increases\u0000throughput by 10% and decreases energy use by 9%. Specifically, increasing the\u0000penetration rate of CAVs significantly enhances traffic throughput, though it\u0000also increases energy consumption.","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"70 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142190431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}