Nicolò Dal Fabbro, Arman Adibi, Aritra Mitra, George J. Pappas
Recent work has shown theoretically that cooperation is beneficial in multi-agent reinforcement learning (MARL). In a setting involving $N$ agents, this benefit usually takes the form of an $N$-fold linear convergence speedup, i.e., a reduction, proportional to $N$, in the number of iterations required to reach a given convergence precision. In this paper, we show for the first time that this speedup property also holds for a MARL framework subject to asynchronous delays in the local agents' updates. In particular, we consider a policy evaluation problem in which multiple agents cooperate to evaluate a common policy by communicating with a central aggregator. In this setting, we study the finite-time convergence of \texttt{AsyncMATD}, an asynchronous multi-agent temporal difference (TD) learning algorithm in which agents' local TD update directions are subject to asynchronous bounded delays. Our main contribution is a finite-time analysis of \texttt{AsyncMATD}, for which we establish a linear convergence speedup while highlighting the effect of time-varying asynchronous delays on the resulting convergence rate.
Finite-Time Analysis of Asynchronous Multi-Agent TD Learning. arXiv - CS - Multiagent Systems, 2024-07-29. https://doi.org/arxiv-2407.20441
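To make the setting in the abstract above concrete, the following is a minimal sketch of one aggregation step in which each agent's TD(0) direction is evaluated at a stale iterate. It is not the paper's \texttt{AsyncMATD} algorithm: linear value-function approximation, the helper names `td_direction` and `delayed_average_step`, and the convention that agent $i$ uses the iterate from `delays[i]` steps ago are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
# One observed transition: (features of s, reward, features of s').
Sample = Tuple[Vector, float, Vector]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def td_direction(theta: Vector, sample: Sample, gamma: float) -> Vector:
    """TD(0) update direction under linear function approximation."""
    phi, reward, phi_next = sample
    td_error = reward + gamma * dot(phi_next, theta) - dot(phi, theta)
    return [td_error * x for x in phi]


def delayed_average_step(
    history: List[Vector],
    samples: List[Sample],
    delays: List[int],
    alpha: float,
    gamma: float,
) -> Vector:
    """One aggregator step (hypothetical sketch).

    history[-1] is the current iterate. Agent i computes its direction at the
    iterate from delays[i] steps ago (clipped to the oldest stored iterate),
    and the aggregator averages the N directions.
    """
    current = history[-1]
    n_agents = len(samples)
    average = [0.0] * len(current)
    for sample, delay in zip(samples, delays):
        stale = history[max(0, len(history) - 1 - delay)]
        direction = td_direction(stale, sample, gamma)
        average = [a + d / n_agents for a, d in zip(average, direction)]
    return [c + alpha * a for c, a in zip(current, average)]
```

Averaging over the $N$ agents is what reduces the variance of the update, which is the intuition behind a speedup proportional to $N$; the delays enter only through which stored iterate each agent reads.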
This paper investigates the use of Quantum Computing and Neuromorphic Computing for Safe, Reliable, and Explainable Multi-Agent Reinforcement Learning (MARL) in the context of optimal control in autonomous robotics. The objective was to address the challenges of optimizing the behavior of autonomous agents while ensuring safety, reliability, and explainability. Quantum Computing techniques, including the Quantum Approximate Optimization Algorithm (QAOA), were employed to efficiently explore large solution spaces and find approximate solutions to complex MARL problems. Neuromorphic Computing, inspired by the architecture of the human brain, provided parallel and distributed processing capabilities, which were leveraged to develop intelligent and adaptive systems. The combination of these technologies held the potential to enhance the safety, reliability, and explainability of MARL in autonomous robotics. This research contributed to the advancement of autonomous robotics by exploring cutting-edge technologies and their applications in multi-agent systems. Code and data are available.
Quantum Computing and Neuromorphic Computing for Safe, Reliable, and explainable Multi-Agent Reinforcement Learning: Optimal Control in Autonomous Robotics. Mazyar Taghavi. arXiv - CS - Multiagent Systems, 2024-07-29. https://doi.org/arxiv-2408.03884
Jack Dippel, Max Dupré la Tour, April Niu, Sanjukta Roy, Adrian Vetta
Majority Illusion is a phenomenon in social networks wherein the decision by the majority of the network is not the same as one's personal social circle's majority, leading to an incorrect perception of the majority in a large network. In this paper, we present polynomial-time algorithms which can eliminate majority illusion in a network by altering as few connections as possible. Additionally, we prove that the more general problem of ensuring all neighbourhoods in the network are at least a $p$-fraction of the majority is NP-hard for most values of $p$.
Eliminating Majority Illusion is Easy. arXiv - CS - Multiagent Systems, 2024-07-29. https://doi.org/arxiv-2407.20187
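The definition of majority illusion given in the abstract above can be checked directly on a small graph. The sketch below is only a detector, not the paper's polynomial-time elimination algorithm. It assumes two opinions, an adjacency-set representation, and that ties (globally or within a neighbourhood) count as "no majority"; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Optional, Set


def strict_majority(labels: Iterable[Hashable]) -> Optional[Hashable]:
    """Return the most common label, or None if the top two are tied or there are no labels."""
    ranked = Counter(labels).most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def nodes_under_illusion(
    adjacency: Dict[int, Set[int]], opinion: Dict[int, str]
) -> Set[int]:
    """Nodes whose neighbourhood majority differs from the network-wide majority."""
    winner = strict_majority(opinion.values())
    deceived = set()
    for node, neighbours in adjacency.items():
        local = strict_majority(opinion[n] for n in neighbours)
        if winner is not None and local is not None and local != winner:
            deceived.add(node)
    return deceived


# Three "blue" nodes form the global majority, but each is connected only to
# the two "red" nodes, so every blue node sees a red majority.
adjacency = {0: {2, 3, 4}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {0, 1}}
opinion = {0: "red", 1: "red", 2: "blue", 3: "blue", 4: "blue"}
print(nodes_under_illusion(adjacency, opinion))
```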
The proliferation of human-AI ecosystems involving human interaction with algorithms, such as assistants and recommenders, raises concerns about large-scale social behaviour. Despite evidence of such phenomena across several contexts, the collective impact of GPS navigation services remains unclear: while beneficial to the user, they can also cause chaos if too many vehicles are driven through the same few roads. Our study employs a simulation framework to assess navigation services' influence on road network usage and CO2 emissions. The results demonstrate a universal pattern of amplified conformity: increasing adoption rates of navigation services cause a reduction of route diversity of mobile travellers and increased concentration of traffic and emissions on fewer roads, thus exacerbating an unequal distribution of negative externalities on selected neighbourhoods. Although navigation services recommendations can help reduce CO2 emissions when their adoption rate is low, these benefits diminish or even disappear when the adoption rate is high and exceeds a certain city- and service-dependent threshold. We summarize these discoveries in a non-linear function that connects the marginal increase of conformity with the marginal reduction in CO2 emissions. Our simulation approach addresses the challenges posed by the complexity of transportation systems and the lack of data and algorithmic transparency.
Navigation services amplify concentration of traffic and emissions in our cities. Giuliano Cornacchia, Mirco Nanni, Dino Pedreschi, Luca Pappalardo. arXiv - CS - Multiagent Systems, 2024-07-29. https://doi.org/arxiv-2407.20004
This paper explores the Mechanism Design aspects of the $m$-Capacitated Facility Location Problem where the total facility capacity is less than the number of agents. Following the framework outlined by Aziz et al., the Social Welfare of the facility location is determined through a First-Come-First-Served (FCFS) game, in which agents compete once the facility positions are established. When the number of facilities is $m > 1$, the Nash Equilibrium (NE) of the FCFS game is not unique, making the utility of the agents and the concept of truthfulness unclear. To tackle these issues, we consider absolutely truthful mechanisms, i.e., mechanisms that prevent agents from misreporting regardless of the strategies used during the FCFS game. We combine this stricter truthfulness requirement with the notion of Equilibrium Stable (ES) mechanisms, which are mechanisms whose Social Welfare does not depend on the NE of the FCFS game. We demonstrate that the class of percentile mechanisms is absolutely truthful and identify the conditions under which they are ES. We also show that the approximation ratio of each ES percentile mechanism is bounded and determine its value. Notably, when all the facilities have the same capacity and the number of agents is sufficiently large, it is possible to achieve an approximation ratio smaller than $1+\frac{1}{2m-1}$. Finally, we extend our study to encompass higher-dimensional problems. Within this framework, we demonstrate that the class of ES percentile mechanisms is even more restricted and characterize the mechanisms that are both ES and absolutely truthful. We further support our findings by empirically evaluating the performance of the mechanisms when the agents are the samples of a distribution.
Mechanism Design for Locating Facilities with Capacities with Insufficient Resources. Gennaro Auricchio, Harry J. Clough, Jie Zhang. arXiv - CS - Multiagent Systems, 2024-07-26. https://doi.org/arxiv-2407.18547
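For readers unfamiliar with percentile mechanisms, here is a minimal one-dimensional sketch. It assumes the common convention that facility $j$ is placed at the report of rank $\lfloor p_j (n-1) \rfloor$ (0-based) among the $n$ sorted reports; the paper's exact indexing, and its capacity and FCFS components, are not reproduced here. The function name is hypothetical.

```python
import math
from typing import List, Sequence


def percentile_mechanism(
    reports: Sequence[float], percentiles: Sequence[float]
) -> List[float]:
    """Place one facility per percentile at an order statistic of the reports.

    Assumed convention: percentile p in [0, 1] selects the sorted report with
    0-based index floor(p * (n - 1)).
    """
    if not reports:
        raise ValueError("at least one report is required")
    ordered = sorted(reports)
    n = len(ordered)
    positions = []
    for p in percentiles:
        if not 0.0 <= p <= 1.0:
            raise ValueError("percentiles must lie in [0, 1]")
        positions.append(ordered[math.floor(p * (n - 1))])
    return positions


# Three facilities at the minimum, median, and maximum report.
print(percentile_mechanism([0.9, 0.1, 0.5, 0.3, 0.7], [0.0, 0.5, 1.0]))
```

Because each facility sits at an order statistic, a single agent can move a facility only by reporting on the other side of it, which is the usual intuition for why such mechanisms resist misreporting.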
In social decision-making among strategic agents, a universal focus lies on the balance between social and individual interests. Socially efficient mechanisms should therefore be designed not only to maximize the social welfare but also to incentivize the agents for their own profit. Under a generalized model that includes applications such as double auctions and trading networks, this study establishes a socially efficient (SE), dominant-strategy incentive compatible (DSIC), and individually rational (IR) mechanism with the minimum total budget paid out to the agents. The present method exploits discrete and known type domains to reduce a set of constraints to the shortest path problem in a weighted graph. In addition to theoretical derivation, we substantiate the optimality of the proposed mechanism through numerical experiments, where it achieves a strictly lower budget than Vickrey-Clarke-Groves (VCG) mechanisms for a wide class of instances.
Socially efficient mechanism on the minimum budget. Hirota Kinoshita, Takayuki Osogami, Kohei Miyaguchi. arXiv - CS - Multiagent Systems, 2024-07-26. https://doi.org/arxiv-2407.18515
Dima Ivanov, Paul Dütting, Inbal Talgam-Cohen, Tonghan Wang, David C. Parkes
Contracts are the economic framework which allows a principal to delegate a task to an agent -- despite misaligned interests, and even without directly observing the agent's actions. In many modern reinforcement learning settings, self-interested agents learn to perform a multi-stage task delegated to them by a principal. We explore the significant potential of utilizing contracts to incentivize the agents. We model the delegated task as an MDP, and study a stochastic game between the principal and agent where the principal learns what contracts to use, and the agent learns an MDP policy in response. We present a learning-based algorithm for optimizing the principal's contracts, which provably converges to the subgame-perfect equilibrium of the principal-agent game. A deep RL implementation allows us to apply our method to very large MDPs with unknown transition dynamics. We extend our approach to multiple agents, and demonstrate its relevance to resolving a canonical sequential social dilemma with minimal intervention to agent rewards.
Principal-Agent Reinforcement Learning. arXiv - CS - Multiagent Systems, 2024-07-25. https://doi.org/arxiv-2407.18074
Recent advances in large language models (LLMs) have opened new avenues for applying multi-agent systems in very large-scale simulations. However, there remain several challenges when conducting multi-agent simulations with existing platforms, such as limited scalability and low efficiency, insufficient agent diversity, and effort-intensive management processes. To address these challenges, we develop several new features and components for AgentScope, a user-friendly multi-agent platform, enhancing its convenience and flexibility for supporting very large-scale multi-agent simulations. Specifically, we propose an actor-based distributed mechanism as the underlying technological infrastructure for scalability and efficiency, and provide flexible environment support for simulating various real-world scenarios, which enables parallel execution of multiple agents, centralized workflow orchestration, and both inter-agent and agent-environment interactions. Moreover, we integrate an easy-to-use configurable tool and an automatic background generation pipeline in AgentScope, simplifying the process of creating agents with diverse yet detailed background settings. Finally, we provide a web-based interface for conveniently monitoring and managing a large number of agents that might be deployed across multiple devices. We conduct a comprehensive simulation to demonstrate the effectiveness of the proposed enhancements in AgentScope, and provide detailed observations and discussions to highlight the great potential of applying multi-agent systems in large-scale simulations. The source code is released on GitHub at https://github.com/modelscope/agentscope to inspire further research and development in large-scale multi-agent simulations.
Very Large-Scale Multi-Agent Simulation in AgentScope. Xuchen Pan, Dawei Gao, Yuexiang Xie, Zhewei Wei, Yaliang Li, Bolin Ding, Ji-Rong Wen, Jingren Zhou. arXiv - CS - Multiagent Systems, 2024-07-25. https://doi.org/arxiv-2407.17789
Piotr Faliszewski, Łukasz Janeczko, Andrzej Kaczmarczyk, Grzegorz Lisowski, Piotr Skowron, Stanisław Szufa
We study strategic behavior of project proposers in the context of approval-based participatory budgeting (PB). In our model we assume that the votes are fixed and known and the proposers want to set as high project prices as possible, provided that their projects get selected and the prices are not below the minimum costs of their delivery. We study the existence of pure Nash equilibria (NE) in such games, focusing on the AV/Cost, Phragmén, and Method of Equal Shares rules. Furthermore, we report an experimental study of strategic cost selection on real-life PB election data.
Strategic Cost Selection in Participatory Budgeting. arXiv - CS - Multiagent Systems, 2024-07-25. https://doi.org/arxiv-2407.18092
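The strategic question in the abstract above is easiest to see with a concrete rule. The sketch below assumes that AV/Cost is the greedy rule that ranks projects by approval count divided by price and funds each project that still fits in the budget; the tie-breaking by name, the data layout, and the function name are assumptions for illustration, not taken from the paper.

```python
from typing import Dict, List, Tuple

# Each project maps to (price, number of approvals); prices are assumed positive.
Projects = Dict[str, Tuple[float, int]]


def av_cost_rule(projects: Projects, budget: float) -> List[str]:
    """Greedy approvals-per-cost selection (assumed form of AV/Cost).

    Projects are considered in decreasing order of approvals / price, with
    ties broken alphabetically; a project is funded if it still fits.
    """
    ranked = sorted(
        projects,
        key=lambda name: (-projects[name][1] / projects[name][0], name),
    )
    selected: List[str] = []
    spent = 0.0
    for name in ranked:
        price = projects[name][0]
        if spent + price <= budget:
            selected.append(name)
            spent += price
    return selected


honest = {"A": (6, 6), "B": (5, 10), "C": (4, 6), "D": (2, 1)}
print(av_cost_rule(honest, budget=10))

# The proposer of C raises the price from 4 to 5 and is still selected.
inflated = {**honest, "C": (5, 6)}
print(av_cost_rule(inflated, budget=10))
```

In this toy instance the proposer of C can raise the price without losing the project's place, which is the kind of deviation a pure Nash equilibrium analysis has to rule out.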
In many multi-player interactions, players incur strictly positive costs each time they execute actions, e.g., 'menu costs' or transaction costs in financial systems. Since acting at each available opportunity would accumulate prohibitively large costs, the resulting decision problem is one in which players must make strategic decisions about when to execute actions in addition to their choice of action. This paper analyses a discrete-time stochastic game (SG) in which players face minimally bounded positive costs for each action and influence the system using impulse controls. We prove SGs of two-sided impulse control have a unique value and characterise the saddle point equilibrium in which the players execute actions at strategically chosen times in accordance with Markovian strategies. We prove the game respects a dynamic programming principle and that the Markov perfect equilibrium can be computed as a limit point of a sequence of Bellman operations. We then introduce a new Q-learning variant which we show converges almost surely to the value of the game, enabling solutions to be extracted in unknown settings. Lastly, we extend our results to settings with budgetary constraints.
Stochastic Games with Minimally Bounded Action Costs. David Mguni. arXiv - CS - Multiagent Systems, 2024-07-25. https://doi.org/arxiv-2407.18010