Adaptive Agents and Multi-Agent Systems: Latest Publications


Establishing Shared Query Understanding in an Open Multi-Agent System

Adaptive Agents and Multi-Agent Systems

Pub Date : 2023-05-16 DOI: 10.5555/3545946.3598649

Nikolaos Kondylidis, Ilaria Tiddi, A. T. Teije

We propose a method for developing shared understanding between two agents for the purpose of performing a task that requires cooperation. Our method focuses on efficiently establishing successful task-oriented communication in an open multi-agent system, where the agents do not know anything about each other and can only communicate via grounded interaction. The method aims to assist researchers who work on human-machine interaction or scenarios that require a human-in-the-loop, by defining interaction restrictions and efficiency metrics. To that end, we point out the challenges and limitations of such a (diverse) setup, as well as restrictions and requirements that aim to ensure that high task performance truthfully reflects the extent to which the agents correctly understand each other. Furthermore, we demonstrate a use case where our method can be applied to the task of cooperative query answering. We design the experiments by modifying an established ontology alignment benchmark. In this example, the agents want to query each other while representing different databases, defined in their own ontologies, that contain different and incomplete knowledge. Grounded interaction here takes the form of examples that consist of common instances, for which the agents are expected to have similar knowledge. Our experiments demonstrate successful communication establishment under the required restrictions, and compare different agent policies that aim to solve the task in an efficient manner.


Citations: 0

SocialLight: Distributed Cooperation Learning towards Network-Wide Traffic Signal Control

Adaptive Agents and Multi-Agent Systems

Pub Date : 2023-04-20 DOI: 10.5555/3545946.3598809

Harsh Goel, Yifeng Zhang, Mehul Damani, Guillaume Sartoretti

Many recent works have turned to multi-agent reinforcement learning (MARL) for adaptive traffic signal control to optimize the travel time of vehicles over large urban networks. However, achieving effective and scalable cooperation among junctions (agents) remains an open challenge, as existing methods often rely on extensive, non-generalizable reward shaping or on non-scalable centralized learning. To address these problems, we propose a new MARL method for traffic signal control, SocialLight, which learns cooperative traffic control policies by estimating, in a distributed manner, the individual marginal contribution of agents to their local neighborhood. SocialLight relies on the Asynchronous Advantage Actor-Critic (A3C) framework, and makes learning scalable by learning a locally-centralized critic conditioned on the states and actions of neighboring agents, which agents use to estimate individual contributions by counterfactual reasoning. We further introduce important modifications to the advantage calculation that help stabilize policy updates. These modifications decouple the impact of the neighbors' actions on the computed advantages, thereby reducing the variance in the gradient updates. We benchmark our trained network against state-of-the-art traffic signal control methods on standard benchmarks in two traffic simulators, SUMO and CityFlow. Our results show that SocialLight exhibits improved scalability to larger road networks and better performance across usual traffic metrics.
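
The counterfactual reasoning mentioned in the abstract can be illustrated with a COMA-style baseline, in which an agent's advantage is the value of the action it took minus the expected value over its own alternative actions, with the neighbors' actions held fixed. The following is a minimal sketch under that assumption, not SocialLight's actual advantage calculation; the function name and the toy numbers are hypothetical.

```python
def counterfactual_advantage(q_values, policy, action):
    """COMA-style counterfactual advantage for a single agent.

    q_values[a] is the critic's estimate for the agent taking action a while
    its neighbors' actions stay fixed (assumed to come from a learned critic).
    policy[a] is the agent's probability of choosing action a.
    """
    if len(q_values) != len(policy):
        raise ValueError("q_values and policy must have the same length")
    # Baseline: expected critic value under the agent's own policy.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    # Advantage: how much better the chosen action is than that baseline.
    return q_values[action] - baseline


# Toy example: two signal phases, uniform policy.
advantage = counterfactual_advantage(q_values=[1.0, 3.0], policy=[0.5, 0.5], action=1)
print(advantage)
```

Because the baseline marginalizes only the agent's own action, the resulting advantage does not credit or blame the agent for what its neighbors did.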


Citations: 0

Proportional Representation in Matching Markets: Selecting Multiple Matchings under Dichotomous Preferences

Adaptive Agents and Multi-Agent Systems

Pub Date : 2023-04-05 DOI: 10.5555/3535850.3535867

Niclas Boehmer, Markus Brill, Ulrike Schmidt-Kraepelin

Given a set of agents with approval preferences over each other, we study the task of finding k matchings fairly representing everyone’s preferences. To formalize fairness, we apply the concept of proportional representation as studied in approval-based multiwinner elections. To this end, we model the problem as a multiwinner election where the set of candidates consists of matchings of the agents, and agents’ preferences over each other are lifted to preferences over matchings. Due to the exponential number of candidates in such elections, standard algorithms for classical sequential voting rules (such as those proposed by Thiele and Phragmén) are rendered inefficient. We show that the computational tractability of these rules can be regained by exploiting the structure of the approval preferences. Moreover, we establish algorithmic results and axiomatic guarantees that go beyond those obtainable in the classical approval-based multiwinner setting: Assuming that approvals are symmetric, we show that Proportional Approval Voting (PAV), a well-established but computationally intractable voting rule, becomes polynomial-time computable, and that its sequential variant, which does not provide any proportionality guarantees in general, fulfills a rather strong guarantee known as extended justified representation. Some of our algorithmic results extend to other types of compactly representable elections with an exponential candidate space.
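
For readers unfamiliar with Proportional Approval Voting, the rule can be sketched in the classical setting with an explicit candidate list: a committee's score is the sum, over voters, of the harmonic number of how many committee members that voter approves. The brute-force search below is only an illustration of the scoring rule and of why it is expensive in general. It is not the polynomial-time algorithm of the paper, where the candidates are matchings, and the function names and toy ballots are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def harmonic(j):
    """Return 1 + 1/2 + ... + 1/j as an exact fraction (0 when j == 0)."""
    return sum(Fraction(1, t) for t in range(1, j + 1))


def pav_score(committee, approvals):
    """PAV score: each voter contributes H(number of approved committee members)."""
    chosen = set(committee)
    return sum(harmonic(len(chosen & ballot)) for ballot in approvals)


def pav_winner(candidates, approvals, k):
    """Exhaustively search all size-k committees (exponential in general)."""
    return max(
        combinations(sorted(candidates), k),
        key=lambda committee: pav_score(committee, approvals),
    )


# Toy election: three voters approve {a, b}, one approves {a}, two approve {c}.
approvals = [{"a", "b"}] * 3 + [{"a"}] + [{"c"}] * 2
best = pav_winner({"a", "b", "c"}, approvals, k=2)
print(best, pav_score(best, approvals))
```

In this toy election the committee containing "a" and "c" scores higher than the one containing "a" and "b", because the diminishing harmonic weights reward giving the two "c" voters a representative.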


Citations: 3

Causality Detection for Efficient Multi-Agent Reinforcement Learning

Adaptive Agents and Multi-Agent Systems

Pub Date : 2023-03-24 DOI: 10.48550/arXiv.2303.14227

Rafael Pina, V. D. Silva, Corentin Artaud

When learning a task as a team, some agents in Multi-Agent Reinforcement Learning (MARL) may fail to understand their true impact on the performance of the team. Such agents end up learning sub-optimal policies, demonstrating undesired lazy behaviours. To investigate this problem, we start by formalising the use of temporal causality applied to MARL problems. We then show how causality can be used to penalise such lazy agents and improve their behaviours. By understanding how their local observations are causally related to the team reward, each agent in the team can adjust its individual credit based on whether or not it helped to cause the reward. We show empirically that using causality estimations in MARL improves not only the holistic performance of the team, but also the individual capabilities of each agent. We observe that the improvements are consistent across a set of different environments.


Citations: 0

Presenting Multiagent Challenges in Team Sports Analytics

Adaptive Agents and Multi-Agent Systems

Pub Date : 2023-03-23 DOI: 10.48550/arXiv.2303.13660

D. Radke, Alexi Orchard

This paper draws correlations between several challenges and opportunities within the area of team sports analytics and key research areas within multiagent systems (MAS). We specifically consider invasion games, defined as sports where players invade the opposing team's territory and can interact anywhere on a playing surface, such as ice hockey, soccer, and basketball. We argue that MAS is well-equipped to study invasion games and that doing so will benefit both the MAS and sports analytics fields. Our discussion highlights areas for MAS implementation and further development along two axes: short-term in-game strategy (coaching) and long-term team planning (management).


Citations: 2

Attention! Dynamic Epistemic Logic Models of (In)attentive Agents

Adaptive Agents and Multi-Agent Systems

Pub Date : 2023-03-23 DOI: 10.48550/arXiv.2303.13494

Gaia Belardinelli, Thomas Bolander

Attention is the crucial cognitive ability that limits and selects what information we observe. Previous work by Bolander et al. (2016) proposes a model of attention based on dynamic epistemic logic (DEL) where agents are either fully attentive or not attentive at all. While introducing the realistic feature that inattentive agents believe nothing happens, the model does not represent the most essential aspect of attention: its selectivity. Here, we propose a generalization that allows for paying attention to subsets of atomic formulas. We introduce the corresponding logic for propositional attention, and show its axiomatization to be sound and complete. We then extend the framework to account for inattentive agents that, instead of assuming nothing happens, may default to a specific truth-value of what they failed to attend to (a sort of prior concerning the unattended atoms). This feature allows for a more cognitively plausible representation of the inattentional blindness phenomenon, where agents end up with false beliefs due to their failure to attend to conspicuous but unexpected events. Both versions of the model define attention-based learning through appropriate DEL event models based on a few clear edge principles. While the size of such event models grows exponentially with both the number of agents and the number of atoms, we introduce a new logical language for describing event models syntactically and show that, using this language, our event models can be represented linearly in the number of agents and atoms. Furthermore, representing our event models using this language is achieved by a straightforward formalisation of the aforementioned edge principles.


Citations: 0

Demonstrating Performance Benefits of Human-Swarm Teaming

Adaptive Agents and Multi-Agent Systems

Pub Date : 2023-03-22 DOI: 10.48550/arXiv.2303.12390

William Hunt, Jack Ryan, A. Abioye, S. Ramchurn, Mohammad Divband Soorati

Autonomous swarms of robots can bring robustness, scalability and adaptability to safety-critical tasks such as search and rescue but their application is still very limited. Using semi-autonomous swarms with human control can bring robot swarms to real-world applications. Human operators can define goals for the swarm, monitor their performance and interfere with, or overrule, the decisions and behaviour. We present the "Human And Robot Interactive Swarm" simulator (HARIS) that allows multi-user interaction with a robot swarm and facilitates qualitative and quantitative user studies through simulation of robot swarms completing tasks, from package delivery to search and rescue, with varying levels of human control. In this demonstration, we showcase the simulator by using it to study the performance gain offered by maintaining a "human-in-the-loop" over a fully autonomous system as an example. This is illustrated in the context of search and rescue, with an autonomous allocation of resources to those in need.


Citations: 2

Revealed Multi-Objective Utility Aggregation in Human Driving

Adaptive Agents and Multi-Agent Systems

Pub Date : 2023-03-13 DOI: 10.48550/arXiv.2303.07435

Atrisha Sarkar, K. Larson, K. Czarnecki

A central design problem in game theoretic analysis is the estimation of the players' utilities. In many real-world interactive situations of human decision making, including human driving, the utilities are multi-objective in nature; therefore, estimating the parameters of aggregation, i.e., mapping of multi-objective utilities to a scalar value, becomes an essential part of game construction. However, estimating this parameter from observational data introduces several challenges due to a host of unobservable factors, including the underlying modality of aggregation and the possibly boundedly rational behaviour model that generated the observation. Based on the concept of rationalisability, we develop algorithms for estimating multi-objective aggregation parameters for two common aggregation methods, weighted and satisficing aggregation, and for both strategic and non-strategic reasoning models. Based on three different datasets, we provide insights into how human drivers aggregate the utilities of safety and progress, as well as the situational dependence of the aggregation process. Additionally, we show that irrespective of the specific solution concept used for solving the games, a data-driven estimation of utility aggregation significantly improves the predictive accuracy of behaviour models with respect to observed human behaviour.
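
The two aggregation methods named in the abstract can be pictured with a small sketch that maps a safety utility and a progress utility to one scalar. The weighted form is the usual convex combination. The satisficing form shown here is only one plausible reading (pursue progress once safety clears a threshold, otherwise fall back to the safety utility) and is an assumption, not the paper's definition; all names, parameters, and numbers are hypothetical.

```python
def weighted_aggregate(u_safety, u_progress, weight):
    """Convex combination of the two objectives; weight is the share given to safety."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * u_safety + (1.0 - weight) * u_progress


def satisficing_aggregate(u_safety, u_progress, threshold):
    """Assumed satisficing rule: once safety is 'good enough', only progress counts."""
    if u_safety >= threshold:
        return u_progress
    # Safety is not yet satisfied, so it alone determines the utility.
    return u_safety


# Toy manoeuvre with moderate progress and low safety.
print(weighted_aggregate(0.25, 0.75, weight=0.5))
print(satisficing_aggregate(0.25, 0.75, threshold=0.5))
```

Estimating the aggregation parameter from data then amounts to finding the weight or threshold under which the observed driving choices look most consistent with the chosen behaviour model.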


Citations: 0

Strategic Planning for Flexible Agent Availability in Large Taxi Fleets

Adaptive Agents and Multi-Agent Systems

Pub Date : 2023-03-08 DOI: 10.48550/arXiv.2303.04337

Rajiv Ranjan Kumar, Pradeep Varakantham, Shih-Fen Cheng

In large-scale multi-agent systems like taxi fleets, individual agents (taxi drivers) are self-interested (maximizing their own profits) and this can introduce inefficiencies in the system. One such inefficiency is with regard to the "required" availability of taxis at different time periods during the day. Since a taxi driver can work for a limited number of hours in a day (e.g., 8-10 hours in a city like Singapore), there is a need to optimize the specific hours, so as to maximize individual as well as social welfare. Technically, this corresponds to solving a large-scale multi-stage selfish routing game with transition uncertainty. Existing work in addressing this problem is either unable to handle "driver" constraints (e.g., breaks during work hours) or not scalable. To that end, we provide a novel mechanism that builds on replicator dynamics through ideas from behavior cloning. We demonstrate that our methods provide significantly better policies than the existing approach in terms of improving individual agent revenue and overall agent availability.


Citations: 0

A Redistribution Framework for Diffusion Auctions

Adaptive Agents and Multi-Agent Systems

Pub Date : 2023-03-06 DOI: 10.48550/arXiv.2303.03075

Sizhe Gu, Yao Zhang, Yi-Zhou Zhao, Dengji Zhao

Redistribution mechanism design aims to redistribute the revenue collected by a truthful auction back to its participants without affecting the truthfulness. We study redistribution mechanisms for diffusion auctions, which is a new trend in mechanism design [19]. The key property of a diffusion auction is that the existing participants are incentivized to invite new participants to join the auctions. Hence, when we design redistributions, we also need to maintain this incentive. Existing redistribution mechanisms in the traditional setting are targeted at modifying the payment design of a truthful mechanism, such as the Vickrey auction. In this paper, we do not focus on one specific mechanism. Instead, we propose a general framework to redistribute the revenue back for all truthful diffusion auctions for selling a single item. The framework treats the original truthful diffusion auction as a black box, and it does not affect its truthfulness. The framework can also distribute back almost all the revenue.
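
As background for the traditional setting the abstract refers to, the sketch below runs a Vickrey (second-price) auction and then hands revenue back using the rebate rule usually attributed to Bailey and Cavallo: each bidder receives 1/n of the revenue the auction would have raised without that bidder, so a bidder's own bid never affects their own rebate. This illustrates the classical, non-diffusion case only and is not the framework proposed in the paper; the function names and bids are hypothetical, and integer bids are assumed so the arithmetic stays exact.

```python
from fractions import Fraction


def vickrey(bids):
    """Second-price auction. bids maps bidder -> integer bid; returns (winner, price)."""
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    # Highest bid first; ties broken alphabetically by bidder name.
    ranked = sorted(bids, key=lambda bidder: (-bids[bidder], bidder))
    return ranked[0], bids[ranked[1]]


def cavallo_rebates(bids):
    """Each bidder gets 1/n of the Vickrey revenue computed without that bidder."""
    n = len(bids)
    if n < 3:
        raise ValueError("need at least three bidders")
    rebates = {}
    for bidder in bids:
        others = {b: v for b, v in bids.items() if b != bidder}
        _, price_without = vickrey(others)
        rebates[bidder] = Fraction(price_without, n)
    return rebates


bids = {"ann": 10, "bob": 8, "cat": 5, "dan": 2}
winner, price = vickrey(bids)
rebates = cavallo_rebates(bids)
print(winner, price, rebates, sum(rebates.values()))
```

In this toy run the total rebate stays below the price paid, so the auctioneer never runs a deficit, but part of the revenue is left undistributed.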


Citations: 1
