Pub Date : 2023-02-21 DOI: 10.48550/arXiv.2302.10595
Ágnes Cseh, Pascal Führlich, Pascal Lenzner
In each round of a Swiss-system tournament, players of similar score are paired against each other. An intentional early loss therefore might lead to weaker opponents in later rounds and thus to a better final tournament result - a phenomenon known as the Swiss Gambit. To the best of our knowledge it is an open question whether this strategy can actually work. This paper provides answers based on an empirical agent-based analysis for the most prominent application area of the Swiss-system format, namely chess tournaments. We simulate realistic tournaments by employing the official FIDE pairing system for computing the player pairings in each round. We show that even though gambits are widely possible in Swiss-system chess tournaments, profiting from them requires a high degree of predictability of match results. Moreover, even if a Swiss Gambit succeeds, the obtained improvement in the final ranking is limited. Our experiments prove that counting on a Swiss Gambit is indeed a lot more of a risky gambit than a reliable strategy to improve the final rank.
{"title":"The Swiss Gambit","authors":"Ágnes Cseh, Pascal Führlich, Pascal Lenzner","doi":"10.48550/arXiv.2302.10595","DOIUrl":"https://doi.org/10.48550/arXiv.2302.10595","journal":{"name":"Adaptive Agents and Multi-Agent Systems","volume":"10 1","pages":"0"},"publicationDate":"2023-02-21","publicationTypes":"Journal Article"}
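The Swiss Gambit abstract rests on players of similar score being paired against each other, so an early loss can change who a player meets next. The following is a minimal sketch of that effect only, not the official FIDE pairing system the paper uses: the function `pair_round`, the four players and their ratings are hypothetical, and rematch, colour and bye rules are ignored.

```python
from typing import Dict, List, Tuple


def pair_round(scores: Dict[str, float], ratings: Dict[str, int]) -> List[Tuple[str, str]]:
    """Toy pairing rule (an assumption, not the FIDE system):
    sort by score, then rating, both descending, and pair neighbours."""
    order = sorted(scores, key=lambda p: (-scores[p], -ratings[p]))
    return [(order[i], order[i + 1]) for i in range(0, len(order) - 1, 2)]


# Hypothetical ratings; round 1 is assumed to be A vs C and B vs D.
ratings = {"A": 2400, "B": 2300, "C": 2200, "D": 2100}

# Case 1: A wins round 1 and then meets the other winner, B.
after_win = {"A": 1.0, "B": 1.0, "C": 0.0, "D": 0.0}
print(pair_round(after_win, ratings))

# Case 2: A loses round 1 on purpose and then meets the weaker player D.
after_gambit = {"A": 0.0, "B": 1.0, "C": 1.0, "D": 0.0}
print(pair_round(after_gambit, ratings))
```

In this toy setting the gambit swaps A's round-2 opponent from B (2300) to D (2100), at the cost of a full point. Whether that trade pays off is the question the paper studies.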
Pub Date : 2023-02-20 DOI: 10.5555/3545946.3598703
Haochen Wu, Pedro Sequeira, D. Pynadath
We approach the problem of understanding how people interact with each other in collaborative settings, especially when individuals know little about their teammates, via Multiagent Inverse Reinforcement Learning (MIRL), where the goal is to infer the reward functions guiding the behavior of each individual given trajectories of a team's behavior during some task. Unlike current MIRL approaches, we do not assume that team members know each other's goals a priori; rather, that they collaborate by adapting to the goals of others perceived by observing their behavior, all while jointly performing a task. To address this problem, we propose a novel approach to MIRL via Theory of Mind (MIRL-ToM). For each agent, we first use ToM reasoning to estimate a posterior distribution over baseline reward profiles given their demonstrated behavior. We then perform MIRL via decentralized equilibrium by employing single-agent Maximum Entropy IRL to infer a reward function for each agent, where we simulate the behavior of other teammates according to the time-varying distribution over profiles. We evaluate our approach in a simulated 2-player search-and-rescue operation where the goal of the agents, playing different roles, is to search for and evacuate victims in the environment. Our results show that the choice of baseline profiles is paramount to the recovery of the ground-truth rewards, and that MIRL-ToM is able to recover the rewards used by agents interacting both with known and unknown teammates.
{"title":"Multiagent Inverse Reinforcement Learning via Theory of Mind Reasoning","authors":"Haochen Wu, Pedro Sequeira, D. Pynadath","doi":"10.5555/3545946.3598703","DOIUrl":"https://doi.org/10.5555/3545946.3598703","journal":{"name":"Adaptive Agents and Multi-Agent Systems","volume":"6 1","pages":"0"},"publicationDate":"2023-02-20","publicationTypes":"Journal Article"}
Pub Date : 2023-02-20 DOI: 10.48550/arXiv.2302.09959
David Sychrovsky, Jakub Černý, Sylvain Lichau, M. Loebl
Measures of allocation optimality differ significantly when distributing standard tradable goods in peaceful times and scarce resources in crises. While realistic markets offer asymptotic efficiency, they may not necessarily guarantee the fair allocation that is desirable when distributing critical resources. To achieve fairness, mechanisms often rely on a central authority, which may act inefficiently in times of need when swiftness and good organization are crucial. In this work, we study a hybrid trading system called Crisdis, introduced by Jedličková et al., which combines fair allocation of buying rights with a market, leveraging the best of both worlds. The frustration of a buyer in Crisdis is defined as the difference between the amount of goods they are entitled to according to the assigned buying rights and the amount of goods they are able to acquire by trading. We define a Price of Anarchy (PoA) in this system as a conceptual analogue of the original definition in the context of frustration. Our main contribution is a study of PoA in realistic complex double-sided market mechanisms for Crisdis. The performed empirical analysis suggests that, in contrast to a market free of governmental interventions, the PoA in our system decreases.
{"title":"Price of Anarchy in a Double-Sided Critical Distribution System","authors":"David Sychrovsky, Jakub Černý, Sylvain Lichau, M. Loebl","doi":"10.48550/arXiv.2302.09959","DOIUrl":"https://doi.org/10.48550/arXiv.2302.09959","journal":{"name":"Adaptive Agents and Multi-Agent Systems","volume":"141 1","pages":"0"},"publicationDate":"2023-02-20","publicationTypes":"Journal Article"}
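The Crisdis abstract defines a buyer's frustration as the entitled amount minus the amount actually acquired by trading. As a rough illustration, that definition can be written out directly. The function name, the buyer labels and the quantities below are hypothetical, and the paper may normalise or clip this quantity differently.

```python
from typing import Dict


def frustration(entitled: Dict[str, int], acquired: Dict[str, int]) -> Dict[str, int]:
    """Per-buyer frustration as stated in the abstract:
    entitled amount (from buying rights) minus amount acquired by trading."""
    return {buyer: entitled[buyer] - acquired.get(buyer, 0) for buyer in entitled}


# Hypothetical example with two buyers.
entitled = {"b1": 10, "b2": 5}
acquired = {"b1": 7, "b2": 5}
per_buyer = frustration(entitled, acquired)
print(per_buyer)
```

Here buyer `b1` obtains three units fewer than its buying rights allow, while `b2` is fully served.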
Pub Date : 2023-02-20 DOI: 10.48550/arXiv.2302.09859
Theodor Cimpeanu, L. Pereira, H. Anh
Building ethical machines may involve bestowing upon them the emotional capacity to self-evaluate and repent of their actions. While reparative measures, such as apologies, are often considered as possible strategic interactions, the explicit evolution of the emotion of guilt as a behavioural phenotype is not yet well understood. Here, we study the co-evolution of social and non-social guilt in homogeneous or heterogeneous populations, including well-mixed, lattice and scale-free networks. Socially aware guilt comes at a cost, as it requires agents to make demanding efforts to observe and understand the internal state and behaviour of others, while non-social guilt only requires awareness of the agents' own state and hence incurs no social cost. Those choosing to be non-social are, however, more sensitive to exploitation by other agents due to their social unawareness. Resorting to methods from evolutionary game theory, we study analytically, and through extensive numerical and agent-based simulations, whether and how such social and non-social guilt can evolve and deploy, depending on the underlying structure of the populations, or systems, of agents. The results show that, in both lattice and scale-free networks, guilt-prone strategies are dominant for a larger range of the guilt and social costs incurred, compared to the well-mixed population setting, leading therefore to significantly higher levels of cooperation for a wider range of the costs. In structured population settings, both social and non-social guilt can evolve and deploy through clustering with emotion-prone strategies, allowing them to be protected from exploiters, especially in the case of non-social (less costly) strategies. Overall, our findings provide important insights into the design and engineering of self-organised and distributed cooperative multi-agent systems.
{"title":"Co-evolution of Social and Non-Social Guilt","authors":"Theodor Cimpeanu, L. Pereira, H. Anh","doi":"10.48550/arXiv.2302.09859","DOIUrl":"https://doi.org/10.48550/arXiv.2302.09859","journal":{"name":"Adaptive Agents and Multi-Agent Systems","volume":"312 1","pages":"0"},"publicationDate":"2023-02-20","publicationTypes":"Journal Article"}
Pub Date : 2023-02-19 DOI: 10.48550/arXiv.2302.09449
H. Aziz, S. Chu, Zhaohong Sun
Selection under category or diversity constraints is a ubiquitous and widely applicable problem that is encountered in immigration, school choice, hiring, and healthcare rationing. These diversity constraints are typically represented by minimum and maximum quotas on various categories or types. We undertake a detailed comparative study of applicant selection algorithms with respect to the diversity goals.
{"title":"Matching Algorithms under Diversity-Based Reservations","authors":"H. Aziz, S. Chu, Zhaohong Sun","doi":"10.48550/arXiv.2302.09449","DOIUrl":"https://doi.org/10.48550/arXiv.2302.09449","journal":{"name":"Adaptive Agents and Multi-Agent Systems","volume":"21 1","pages":"0"},"publicationDate":"2023-02-19","publicationTypes":"Journal Article"}
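The diversity constraints in the abstract above are minimum and maximum quotas on categories. The sketch below only checks whether a given selection respects such quotas. It is not one of the selection algorithms the paper compares, and the function name, applicant labels and quota values are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def satisfies_quotas(
    selected: Iterable[str],
    categories: Dict[str, Set[str]],
    min_quota: Dict[str, int],
    max_quota: Dict[str, int],
) -> bool:
    """Return True if every category count lies within [min, max].
    A category with no stated minimum defaults to 0, no stated maximum to unbounded."""
    counts = Counter(c for applicant in selected for c in categories[applicant])
    constrained = set(min_quota) | set(max_quota)
    return all(
        min_quota.get(c, 0) <= counts[c] <= max_quota.get(c, float("inf"))
        for c in constrained
    )


# Hypothetical applicants; an applicant may belong to several categories.
categories = {"a1": {"rural"}, "a2": {"urban"}, "a3": {"rural", "veteran"}}
min_quota = {"rural": 1, "veteran": 1}
max_quota = {"urban": 1}

print(satisfies_quotas(["a1", "a2"], categories, min_quota, max_quota))  # veteran minimum unmet
print(satisfies_quotas(["a2", "a3"], categories, min_quota, max_quota))  # all quotas met
```

Allowing overlapping categories is a modelling choice made here for illustration. The abstract does not state whether categories overlap in the paper's setting.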
Pub Date : 2023-02-18 DOI: 10.48550/arXiv.2302.09250
Yuki Miyashita, Tomoki Yamauchi, T. Sugawara
We propose a distributed planning method with asynchronous execution for multi-agent pickup and delivery (MAPD) problems in environments with occasional delays in agents' activities and flexible endpoints. MAPD is a crucial problem framework with many applications; however, most existing studies assume ideal agent behaviors and environments, such as a fixed speed of agents, synchronized movements, and a well-designed environment with many short detours for multiple agents to perform tasks easily. Such an environment is often infeasible; for example, the moving speed of agents may be affected by weather and floor conditions and is often prone to delays. The proposed method relaxes some of these infeasible conditions so that MAPD can be applied in more realistic environments, by allowing fluctuating speeds in agents' actions and flexible working locations (endpoints). Our experiments showed that our method enables agents to perform MAPD in such an environment efficiently, compared to the baseline methods. We also analyzed the behaviors of agents using our method and discussed the limitations.
{"title":"Distributed Planning with Asynchronous Execution with Local Navigation for Multi-agent Pickup and Delivery Problem","authors":"Yuki Miyashita, Tomoki Yamauchi, T. Sugawara","doi":"10.48550/arXiv.2302.09250","DOIUrl":"https://doi.org/10.48550/arXiv.2302.09250","journal":{"name":"Adaptive Agents and Multi-Agent Systems","volume":"19 1","pages":"0"},"publicationDate":"2023-02-18","publicationTypes":"Journal Article"}
Pub Date : 2023-02-16 DOI: 10.48550/arXiv.2302.08001
Libo Zhang, Yang Chen, Toru Takisaka, B. Khoussainov, Michael Witbrock, Jiamou Liu
Correlated Equilibrium (CE) is a well-established solution concept that captures coordination among agents and enjoys good algorithmic properties. In real-world multi-agent systems, in addition to being in an equilibrium, agents' policies are often expected to meet requirements with respect to safety and fairness. Such additional requirements can often be expressed in terms of the state density, which measures the state-visitation frequencies during the course of a game. However, existing CE notions or CE-finding approaches cannot explicitly specify a CE with particular properties concerning state density; they do so implicitly by either modifying reward functions or using value functions as the selection criteria. The resulting CE may thus not fully fulfil the state-density requirements. In this paper, we propose Density-Based Correlated Equilibria (DBCE), a new notion of CE that explicitly takes state density as the selection criterion. Concretely, we instantiate DBCE by specifying different state-density requirements motivated by real-world applications. To compute DBCE, we put forward the Density Based Correlated Policy Iteration algorithm for the underlying control problem. We perform experiments on various games, where the results demonstrate the advantage of our CE-finding approach over existing methods in scenarios with state-density concerns.
{"title":"Learning Density-Based Correlated Equilibria for Markov Games","authors":"Libo Zhang, Yang Chen, Toru Takisaka, B. Khoussainov, Michael Witbrock, Jiamou Liu","doi":"10.48550/arXiv.2302.08001","DOIUrl":"https://doi.org/10.48550/arXiv.2302.08001","journal":{"name":"Adaptive Agents and Multi-Agent Systems","volume":"24 1","pages":"0"},"publicationDate":"2023-02-16","publicationTypes":"Journal Article"}
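The DBCE abstract describes state density as the state-visitation frequencies over the course of a game. A minimal sketch of an empirical, undiscounted estimate from sampled trajectories follows. The function name, state labels, trajectories and the 0.2 threshold are hypothetical. The paper may define density differently (for example with discounting), and this is not its Density Based Correlated Policy Iteration algorithm.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def empirical_state_density(trajectories: List[Sequence[Hashable]]) -> Dict[Hashable, float]:
    """Fraction of all visited time steps spent in each state (undiscounted)."""
    counts = Counter(state for traj in trajectories for state in traj)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no states observed")
    return {state: n / total for state, n in counts.items()}


# Two hypothetical trajectories over states s0, s1, s2.
trajectories = [["s0", "s1", "s1", "s2"], ["s0", "s1", "s0", "s0"]]
density = empirical_state_density(trajectories)
print(density)

# Example of a density requirement: an "unsafe" state s2 is visited at most 20% of the time.
print(density.get("s2", 0.0) <= 0.2)
```

A requirement such as the final check is the kind of condition that the abstract says is hard to specify through rewards or value functions alone.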
Pub Date : 2023-02-15 DOI: 10.48550/arXiv.2302.07515
Fanqing Lin, Shiyu Huang, Tim Pearce, Wenze Chen, Weijuan Tu
Multi-agent football poses an unsolved challenge in AI research. Existing work has focused on tackling simplified scenarios of the game, or else leveraging expert demonstrations. In this paper, we develop a multi-agent system to play the full 11 vs. 11 game mode, without demonstrations. This game mode contains aspects that present major challenges to modern reinforcement learning algorithms: multi-agent coordination, long-term planning, and non-transitivity. To address these challenges, we present TiZero, a self-evolving, multi-agent system that learns from scratch. TiZero introduces several innovations, including adaptive curriculum learning, a novel self-play strategy, and an objective that optimizes the policies of multiple agents jointly. Experimentally, it outperforms previous systems by a large margin on the Google Research Football environment, increasing win rates by over 30%. To demonstrate the generality of TiZero's innovations, they are assessed on several environments beyond football: Overcooked, Multi-agent Particle-Environment, Tic-Tac-Toe and Connect-Four.
{"title":"TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play","authors":"Fanqing Lin, Shiyu Huang, Tim Pearce, Wenze Chen, Weijuan Tu","doi":"10.48550/arXiv.2302.07515","DOIUrl":"https://doi.org/10.48550/arXiv.2302.07515","journal":{"name":"Adaptive Agents and Multi-Agent Systems","volume":"1 1","pages":"0"},"publicationDate":"2023-02-15","publicationTypes":"Journal Article"}
Pub Date : 2023-02-14 DOI: 10.48550/arXiv.2302.07072
Fengjuan Jia, Mengxiao Zhang, Jiamou Liu, B. Khoussainov
Diffusion auction refers to an emerging paradigm of online marketplace where an auctioneer utilises a social network to attract potential buyers. Diffusion auctions pose significant privacy risks: from the auction outcome, it is possible to infer hidden, and potentially sensitive, preferences of buyers. To mitigate such risks, we initiate the study of differential privacy (DP) in diffusion auction mechanisms. DP is a well-established notion of privacy that protects a system against inference attacks. Achieving DP in diffusion auctions is non-trivial, as well-designed auction rules are required to incentivise the buyers to truthfully report their neighbourhood. We study the single-unit case and design two differentially private diffusion mechanisms (DPDMs): recursive DPDM and layered DPDM. We prove that these mechanisms guarantee differential privacy, incentive compatibility and individual rationality for both valuations and neighbourhood. We then empirically compare their performance on real and synthetic datasets.
{"title":"Differentially Private Diffusion Auction: The Single-unit Case","authors":"Fengjuan Jia, Mengxiao Zhang, Jiamou Liu, B. Khoussainov","doi":"10.48550/arXiv.2302.07072","DOIUrl":"https://doi.org/10.48550/arXiv.2302.07072","journal":{"name":"Adaptive Agents and Multi-Agent Systems","volume":"79 1","pages":"0"},"publicationDate":"2023-02-14","publicationTypes":"Journal Article"}
Pub Date : 2023-02-14 DOI: 10.48550/arXiv.2302.06803
Licheng Wen, Pinlong Cai, Daocheng Fu, Song Mao, Yikang Li
With the development of autonomous driving, it is becoming increasingly common for autonomous vehicles (AVs) and human-driven vehicles (HVs) to travel on the same roads. Existing on-board single-vehicle planning algorithms struggle to handle sophisticated social interactions in the real world. Decisions made by these methods are difficult for humans to understand, raising the risk of crashes and making them unlikely to be applied in practice. Moreover, vehicle flows produced by open-source traffic simulators suffer from being overly conservative and lacking behavioral diversity. We propose a hierarchical multi-vehicle decision-making and planning framework with several advantages. The framework jointly makes decisions for all vehicles within the flow and reacts promptly to the dynamic environment through a high-frequency planning module. The decision module produces interpretable action sequences that can explicitly communicate self-intent to the surrounding HVs. We also present the cooperation factor and trajectory weight set, bringing diversity to autonomous vehicles in traffic at both the social and individual levels. The superiority of our proposed framework is validated through experiments with multiple scenarios, and the diverse behaviors in the generated vehicle trajectories are demonstrated through closed-loop simulations.
{"title":"Bringing Diversity to Autonomous Vehicles: An Interpretable Multi-vehicle Decision-making and Planning Framework","authors":"Licheng Wen, Pinlong Cai, Daocheng Fu, Song Mao, Yikang Li","doi":"10.48550/arXiv.2302.06803","DOIUrl":"https://doi.org/10.48550/arXiv.2302.06803","journal":{"name":"Adaptive Agents and Multi-Agent Systems","volume":"1 1","pages":"0"},"publicationDate":"2023-02-14","publicationTypes":"Journal Article"}