One of the widely used peak reduction methods in smart grids is demand response, where one analyzes the shift in customers' (agents') usage patterns in response to the signal from the distribution company. Often, these signals are in the form of incentives offered to agents. This work studies the effect of incentives on the probabilities of accepting such offers in a real-world smart grid simulator, PowerTAC. We first show that there exists a function that depicts the probability of an agent reducing its load as a function of the discount offered to it. We call it the reduction probability (RP). The RP function is further parametrized by the rate of reduction (RR), which can differ for each agent. We provide an optimal algorithm, MJS-ExpResponse, that outputs the discounts to each agent by maximizing the expected reduction under a budget constraint. When RRs are unknown, we propose a Multi-Armed Bandit (MAB) based online algorithm, namely MJSUCB-ExpResponse, to learn RRs. Experimentally, we show that it exhibits sublinear regret. Finally, we showcase the efficacy of the proposed algorithm in mitigating demand peaks in a real-world smart grid system using the PowerTAC simulator as a test bed.
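The budgeted allocation idea can be illustrated with a small sketch. The abstract does not state the form of the RP function, so the exponential form `1 - exp(-rr * discount)` below is an assumption, as are the function names, the unit discount step, and the simplification that every accepting agent contributes one unit of reduction. The greedy loop is a generic technique for separable concave objectives and is not the paper's algorithm.

```python
import math

def reduction_probability(discount, rr):
    # ASSUMED form: probability of reducing load saturates exponentially
    # in the discount, at a speed set by the agent's rate of reduction.
    return 1.0 - math.exp(-rr * discount)

def greedy_discounts(rrs, budget, step=1):
    """Hand out the budget in `step`-sized increments, each time to the
    agent whose reduction probability increases the most (hypothetical helper)."""
    discounts = [0] * len(rrs)
    for _ in range(int(budget // step)):
        gains = [
            reduction_probability(d + step, rr) - reduction_probability(d, rr)
            for d, rr in zip(discounts, rrs)
        ]
        # Ties go to the lowest-indexed agent.
        best = max(range(len(rrs)), key=gains.__getitem__)
        discounts[best] += step
    return discounts

def expected_reduction(discounts, rrs):
    # Expected number of agents that reduce, assuming one unit each.
    return sum(reduction_probability(d, rr) for d, rr in zip(discounts, rrs))

if __name__ == "__main__":
    rrs = [0.5, 0.5, 0.1]  # illustrative values only
    alloc = greedy_discounts(rrs, budget=6)
    print(alloc, round(expected_reduction(alloc, rrs), 3))
```

Because the assumed RP function is concave in the discount, each extra unit given to the same agent is worth less than the previous one, which is why the greedy increments spread the budget across agents with similar RRs.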
Title: A Novel Demand Response Model and Method for Peak Reduction in Smart Grids - PowerTAC. Authors: Sanjay Chandlekar, Arthik Boroju, Shweta Jain, Sujit Gujar. Pub Date: 2023-02-24. DOI: 10.48550/arXiv.2302.12520. Journal: Adaptive Agents and Multi-Agent Systems.

Pub Date: 2023-02-24. DOI: 10.48550/arXiv.2302.12676
Stelios Triantafyllou, Goran Radanovic
Responsibility attribution is a key concept of accountable multi-agent decision making. Given a sequence of actions, responsibility attribution mechanisms quantify the impact of each participating agent on the final outcome. One such popular mechanism is based on actual causality, and it assigns (causal) responsibility based on the actions that were found to be pivotal for the considered outcome. However, the inherent problem of pinpointing actual causes and consequently determining the exact responsibility assignment has been shown to be computationally intractable. In this paper, we aim to provide a practical algorithmic solution to the problem of responsibility attribution under a computational budget. We first formalize the problem in the framework of Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) augmented by a specific class of Structural Causal Models (SCMs). Under this framework, we introduce a Monte Carlo Tree Search (MCTS) type of method which efficiently approximates the agents' degrees of responsibility. This method utilizes the structure of a novel search tree and a pruning technique, both tailored to the problem of responsibility attribution. Other novel components of our method are (a) a child selection policy based on linear scalarization and (b) a backpropagation procedure that accounts for a minimality condition that is typically used to define actual causality. We experimentally evaluate the efficacy of our algorithm through a simulation-based test-bed, which includes three team-based card games.
Title: Towards Computationally Efficient Responsibility Attribution in Decentralized Partially Observable MDPs. Journal: Adaptive Agents and Multi-Agent Systems.

Pub Date: 2023-02-24. DOI: 10.48550/arXiv.2302.12689
Tobias Huber, Maximilian Demmler, Silvan Mertes, Matthew Lyle Olson, Elisabeth André
Counterfactual explanations are a common tool to explain artificial intelligence models. For Reinforcement Learning (RL) agents, they answer "Why not?" or "What if?" questions by illustrating what minimal change to a state is needed such that an agent chooses a different action. Generating counterfactual explanations for RL agents with visual input is especially challenging because of their large state spaces and because their decisions are part of an overarching policy, which includes long-term decision-making. However, research focusing on counterfactual explanations, specifically for RL agents with visual input, is scarce and does not go beyond identifying defective agents. It is unclear whether counterfactual explanations are still helpful for more complex tasks like analyzing the learned strategies of different agents or choosing a fitting agent for a specific task. We propose a novel but simple method to generate counterfactual explanations for RL agents by formulating the problem as a domain transfer problem, which allows the use of adversarial learning techniques like StarGAN. Our method is fully model-agnostic and we demonstrate that it outperforms the only previous method in several computational metrics. Furthermore, we show in a user study that our method performs best when analyzing which strategies different agents pursue.
Title: GANterfactual-RL: Understanding Reinforcement Learning Agents' Strategies through Visual Counterfactual Explanations. Journal: Adaptive Agents and Multi-Agent Systems.

Pub Date: 2023-02-23. DOI: 10.48550/arXiv.2302.12140
F. Brandt, Patrick Lederer, Sascha Tausch
One of the central economic paradigms in multi-agent systems is that agents should not be better off by acting dishonestly. In the context of collective decision-making, this axiom is known as strategyproofness and turns out to be rather prohibitive, even when allowing for randomization. In particular, Gibbard's random dictatorship theorem shows that only rather unattractive social decision schemes (SDSs) satisfy strategyproofness on the full domain of preferences. In this paper, we obtain more positive results by investigating strategyproof SDSs on the Condorcet domain, which consists of all preference profiles that admit a Condorcet winner. In more detail, we show that, if the number of voters $n$ is odd, every strategyproof and non-imposing SDS on the Condorcet domain can be represented as a mixture of dictatorial SDSs and the Condorcet rule (which chooses the Condorcet winner with probability $1$). Moreover, we prove that the Condorcet domain is a maximal connected domain that allows for attractive strategyproof SDSs if $n$ is odd as only random dictatorships are strategyproof and non-imposing on any sufficiently connected superset of it. We also derive analogous results for even $n$ by slightly extending the Condorcet domain. Finally, we also characterize the set of group-strategyproof and non-imposing SDSs on the Condorcet domain and its supersets. These characterizations strengthen Gibbard's random dictatorship theorem and establish that the Condorcet domain is essentially a maximal domain that allows for attractive strategyproof SDSs.
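The objects named in this abstract can be made concrete with a minimal sketch: a Condorcet-winner check on a profile of strict rankings, and a lottery that mixes random dictatorship with the Condorcet rule. The function names, the uniform weights over dictators, and the mixing weight `lam` are assumptions chosen for illustration; the paper's characterization allows more general mixtures.

```python
from fractions import Fraction

def condorcet_winner(profile):
    """Return the candidate that beats every other candidate in a strict
    pairwise majority comparison, or None if no such candidate exists."""
    candidates = profile[0]
    n = len(profile)
    for x in candidates:
        beats_all = all(
            2 * sum(r.index(x) < r.index(y) for r in profile) > n
            for y in candidates
            if y != x
        )
        if beats_all:
            return x
    return None

def mixture_sds(profile, lam):
    """Lottery over candidates: with probability `lam` a uniformly random
    voter's top choice wins, with probability 1 - lam the Condorcet winner wins.
    (Uniform dictator weights are an illustrative simplification.)"""
    winner = condorcet_winner(profile)
    if winner is None:
        raise ValueError("profile is outside the Condorcet domain")
    n = len(profile)
    lottery = {c: Fraction(0) for c in profile[0]}
    for ranking in profile:
        lottery[ranking[0]] += lam * Fraction(1, n)
    lottery[winner] += 1 - lam
    return lottery

if __name__ == "__main__":
    # Three voters, rankings listed from most to least preferred.
    profile = [("a", "b", "c"), ("b", "a", "c"), ("c", "a", "b")]
    print(condorcet_winner(profile))
    print(mixture_sds(profile, Fraction(1, 2)))
```

In the example profile, candidate `a` wins both pairwise comparisons two votes to one, so the profile lies in the Condorcet domain even though each voter has a different top choice.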
Title: Strategyproof Social Decision Schemes on Super Condorcet Domains. Journal: Adaptive Agents and Multi-Agent Systems.

Pub Date: 2023-02-23. DOI: 10.48550/arXiv.2302.12359
Alexandre Trudeau, Michael H. Bowling
AlphaZero is a self-play reinforcement learning algorithm that achieves superhuman play in chess, shogi, and Go via policy iteration. To be an effective policy improvement operator, AlphaZero's search requires accurate value estimates for the states appearing in its search tree. AlphaZero trains upon self-play matches beginning from the initial state of a game and only samples actions over the first few moves, limiting its exploration of states deeper in the game tree. We introduce Go-Exploit, a novel search control strategy for AlphaZero. Go-Exploit samples the start state of its self-play trajectories from an archive of states of interest. Beginning self-play trajectories from varied starting states enables Go-Exploit to more effectively explore the game tree and to learn a value function that generalizes better. Producing shorter self-play trajectories allows Go-Exploit to train upon more independent value targets, improving value training. Finally, the exploration inherent in Go-Exploit reduces its need for exploratory actions, enabling it to train under more exploitative policies. In the games of Connect Four and 9x9 Go, we show that Go-Exploit learns with a greater sample efficiency than standard AlphaZero, resulting in stronger performance against reference opponents and in head-to-head play. We also compare Go-Exploit to KataGo, a more sample efficient reimplementation of AlphaZero, and demonstrate that Go-Exploit has a more effective search control strategy. Furthermore, Go-Exploit's sample efficiency improves when KataGo's other innovations are incorporated.
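The start-state sampling idea can be sketched as follows. The abstract does not say which states count as "states of interest", how large the archive is, or how often play still starts from the initial position, so the class name, the bounded FIFO archive, and the `p_initial` parameter are all assumptions made for illustration.

```python
import random
from collections import deque

class StartStateArchive:
    """Hypothetical archive from which self-play start states are drawn."""

    def __init__(self, initial_state, capacity=1000, p_initial=0.2, seed=0):
        self.initial_state = initial_state
        # Bounded archive: the oldest states are dropped once capacity is hit.
        self.states = deque(maxlen=capacity)
        # ASSUMED: probability of starting from the game's initial state.
        self.p_initial = p_initial
        self.rng = random.Random(seed)

    def add(self, state):
        # Which states are worth archiving is left to the caller.
        self.states.append(state)

    def sample_start(self):
        # Fall back to the initial state while the archive is still empty.
        if not self.states or self.rng.random() < self.p_initial:
            return self.initial_state
        return self.rng.choice(self.states)

if __name__ == "__main__":
    archive = StartStateArchive("root", capacity=3, p_initial=0.5)
    for s in ["s1", "s2", "s3", "s4"]:
        archive.add(s)
    print([archive.sample_start() for _ in range(5)])
```

A self-play loop would call `sample_start()` before each trajectory and `add()` on states visited during play or search, so that later trajectories begin deeper in the game tree.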
Title: Targeted Search Control in AlphaZero for Effective Policy Improvement. Journal: Adaptive Agents and Multi-Agent Systems.

Pub Date: 2023-02-23. DOI: 10.48550/arXiv.2302.11890
Chris Dong, Patrick Lederer
Approval-based committee (ABC) voting rules elect a fixed size subset of the candidates, a so-called committee, based on the voters' approval ballots over the candidates. While these rules have recently attracted significant attention, axiomatic characterizations are largely missing so far. We address this problem by characterizing ABC voting rules within the broad and intuitive class of sequential valuation rules. These rules compute the winning committees by sequentially adding candidates that increase the score of the chosen committee the most. In more detail, we first characterize almost the full class of sequential valuation rules based on mild standard conditions and a new axiom called consistent committee monotonicity. This axiom postulates that the winning committees of size k can be derived from those of size k-1 by only adding candidates and that these new candidates are chosen consistently. By requiring additional conditions, we derive from this result also a characterization of the prominent class of sequential Thiele rules. Finally, we refine our results to characterize three well-known ABC voting rules, namely sequential approval voting, sequential proportional approval voting, and sequential Chamberlin-Courant approval voting.
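The sequential construction described here can be sketched for the three named rules. Each round adds the candidate with the largest marginal score, where a voter who already has `j - 1` approved candidates in the committee contributes `weight(j)` for the next one. The function name and the tie-breaking by candidate order are assumptions: the rules studied in the paper return all winning committees, whereas this sketch returns a single one.

```python
from fractions import Fraction

def sequential_thiele(ballots, candidates, k, weight):
    """Build a committee of size k by repeatedly adding the candidate
    with the largest marginal score (ties broken by candidate order)."""
    committee = []
    for _ in range(k):
        chosen = set(committee)

        def gain(c):
            # A voter approving c who already has j - 1 approved members
            # in the committee contributes weight(j).
            return sum(weight(len(b & chosen) + 1) for b in ballots if c in b)

        remaining = [c for c in candidates if c not in chosen]
        committee.append(max(remaining, key=gain))
    return committee

# Weight sequences for the three rules named in the abstract.
def av(j):   # sequential approval voting
    return Fraction(1)

def pav(j):  # sequential proportional approval voting
    return Fraction(1, j)

def cc(j):   # sequential Chamberlin-Courant approval voting
    return Fraction(1) if j == 1 else Fraction(0)

if __name__ == "__main__":
    ballots = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"c"}, {"c"}]
    for rule in (av, pav, cc):
        print(rule.__name__, sequential_thiele(ballots, ["a", "b", "c"], 2, rule))
```

On this profile, approval voting fills both seats from the majority's ballot, while the proportional and Chamberlin-Courant weights give the second seat to the candidate approved by the two remaining voters.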
Title: Characterizations of Sequential Valuation Rules. Journal: Adaptive Agents and Multi-Agent Systems.

Pub Date: 2023-02-23. DOI: 10.48550/arXiv.2302.12121
Jesse Milzman, Cody Moser
Previous investigations into creative and innovation networks have suggested that innovation often occurs at the boundary between the network's core and periphery. In this work, we investigate the effect of global core-periphery network structure on the speed and quality of cultural innovation. Drawing on differing notions of core-periphery structure from [arXiv:1808.07801] and [doi:10.1016/S0378-8733(99)00019-2], we distinguish decentralized core-periphery, centralized core-periphery, and affinity network structure. We generate networks of these three classes from stochastic block models (SBMs), and use them to run an agent-based model (ABM) of collective cultural innovation, in which agents can only directly interact with their network neighbors. In order to discover the highest-scoring innovation, agents must discover and combine the highest innovations from two completely parallel technology trees. We find that decentralized core-periphery networks outperform the others by finding the final crossover innovation more quickly on average. We hypothesize that decentralized core-periphery network structure accelerates collective problem-solving by shielding peripheral nodes from the local optima known by the core community at any given time. We then build upon the "Two Truths" hypothesis regarding community structure in spectral graph embeddings, first articulated in [arXiv:1808.07801], which suggests that the adjacency spectral embedding (ASE) captures core-periphery structure, while the Laplacian spectral embedding (LSE) captures affinity. We find that, for core-periphery networks, ASE-based resampling best recreates networks with similar performance on the innovation SBM, compared to LSE-based resampling. Since the Two Truths hypothesis suggests that ASE captures core-periphery structure, this result further supports our hypothesis.
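Generating a network from a stochastic block model can be sketched in a few lines. The abstract does not give the block probability matrices that define its three network classes, so the function name and the example matrices below are illustrative assumptions: one matrix with a dense core and sparse periphery, and one affinity-style matrix with dense within-block and sparse between-block connections.

```python
import random

def sample_sbm(block_sizes, block_probs, seed=0):
    """Sample an undirected graph from a stochastic block model.

    block_sizes[b] is the number of nodes in block b, and
    block_probs[a][b] is the edge probability between a node in
    block a and a node in block b (the matrix should be symmetric).
    Returns (labels, edges) with edges as a set of (i, j), i < j.
    """
    rng = random.Random(seed)
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    edges = set()
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            # Each pair is connected independently of all other pairs.
            if rng.random() < block_probs[labels[i]][labels[j]]:
                edges.add((i, j))
    return labels, edges

if __name__ == "__main__":
    # Illustrative values only: block 0 is the core, block 1 the periphery.
    core_periphery = [[0.8, 0.3], [0.3, 0.05]]
    affinity = [[0.6, 0.05], [0.05, 0.6]]
    for name, probs in [("core-periphery", core_periphery), ("affinity", affinity)]:
        labels, edges = sample_sbm([10, 30], probs, seed=1)
        print(name, len(edges))
```

An agent-based model like the one described would then restrict each agent's interactions to its neighbours in the sampled edge set.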
Title: Decentralized core-periphery structure in social networks accelerates cultural innovation in agent-based model. Journal: Adaptive Agents and Multi-Agent Systems.

Pub Date: 2023-02-23. DOI: 10.48550/arXiv.2302.12090
R. Galimullin, Fernando R. Velázquez-Quesada
Communication within groups of agents has lately been the focus of research in dynamic epistemic logic (DEL). This paper studies a recently introduced form of partial (more precisely, topic-based) communication. This type of communication allows for modelling scenarios of multi-agent collaboration and negotiation, and it is particularly well-suited for situations in which sharing all information is not feasible or advisable. After presenting results on invariance and complexity of model checking, the paper compares partial communication to public announcements, probably the most well-known type of communication in DEL. It is shown that the settings are, update-wise, incomparable: there are scenarios in which the effect of a public announcement cannot be replicated by partial communication, and vice versa. Then, the paper shifts its attention to strategic topic-based communication. It does so by extending the language with a modality that quantifies over the topics the agents can 'talk about'. For this new framework, it provides a complete axiomatisation, showing also that the new language's model checking problem is PSPACE-complete. The paper closes by showing that, in terms of expressivity, this new language of arbitrary partial communication is incomparable to that of arbitrary public announcements.
Title: (Arbitrary) Partial Communication. Journal: Adaptive Agents and Multi-Agent Systems.

Pub Date: 2023-02-22. DOI: 10.48550/arXiv.2302.11530
Siddharth Barman, V. V. Narayan, Paritosh Verma
We study the problem of dividing indivisible chores among agents whose costs (for the chores) are supermodular set functions with binary marginals. Such functions capture complementarity among chores, i.e., they constitute an expressive class wherein the marginal disutility of each chore is either one or zero, and the marginals increase with respect to supersets. In this setting, we study the broad landscape of finding fair and efficient chore allocations. In particular, we establish the existence of $(i)$ EF1 and Pareto efficient chore allocations, $(ii)$ MMS-fair and Pareto efficient allocations, and $(iii)$ Lorenz dominating chore allocations. Furthermore, we develop polynomial-time algorithms (in the value oracle model) for computing the chore allocations for each of these fairness and efficiency criteria. Complementing these existential and algorithmic results, we show that in this chore division setting, the aforementioned fairness notions, namely EF1, MMS, and Lorenz domination, are incomparable: an allocation that satisfies any one of these notions does not necessarily satisfy the others. Additionally, we study EFX chore division. In contrast to the above-mentioned positive results, we show that, for binary supermodular costs, Pareto efficient allocations that are even approximately EFX do not exist, for any arbitrarily small approximation constant. Focusing on EFX fairness alone, when the cost functions are identical we present an algorithm (Add-and-Fix) that computes an EFX allocation. For binary marginals, we show that Add-and-Fix runs in polynomial time.
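To make the setting concrete, the sketch below checks the standard EF1 condition for chores: an agent may envy another only if removing some single chore from its own bundle removes the envy. The function names and the example cost function are assumptions for illustration. The cost `max(0, |S| - 1)` has binary marginals (the first chore is free, every further chore costs one) that never decrease as the bundle grows, so it is one simple instance of a binary supermodular cost.

```python
def is_ef1_chores(allocation, costs):
    """Check EF1 for chores.

    allocation[i] is the frozenset of chores given to agent i and
    costs[i] maps a frozenset of chores to agent i's cost for it.
    """
    for i, own_bundle in enumerate(allocation):
        own_cost = costs[i](own_bundle)
        for j, other_bundle in enumerate(allocation):
            if i == j:
                continue
            other_cost = costs[i](other_bundle)
            if own_cost <= other_cost:
                continue  # agent i does not envy agent j
            # Envy must vanish after dropping one chore from i's bundle.
            if not any(costs[i](own_bundle - {t}) <= other_cost for t in own_bundle):
                return False
    return True

def example_cost(bundle):
    # Illustrative binary supermodular cost: marginals are 0, then 1, 1, ...
    return max(0, len(bundle) - 1)

if __name__ == "__main__":
    costs = [example_cost, example_cost]
    lopsided = [frozenset({1, 2, 3}), frozenset()]
    balanced = [frozenset({1, 2}), frozenset({3})]
    print(is_ef1_chores(lopsided, costs), is_ef1_chores(balanced, costs))
```

The check calls each cost function only on whole bundles and on bundles with one chore removed, which matches the value oracle access the abstract assumes.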
{"title":"Fair Chore Division under Binary Supermodular Costs","authors":"Siddharth Barman, V. V. Narayan, Paritosh Verma","doi":"10.48550/arXiv.2302.11530","url":"https://doi.org/10.48550/arXiv.2302.11530","journal":{"name":"Adaptive Agents and Multi-Agent Systems"},"publicationDate":"2023-02-22","publicationTypes":"Journal Article"}
Pub Date: 2023-02-21 DOI: 10.48550/arXiv.2302.10421
Ryo Nishida, Masaki Onishi, Koichi Hashimoto
Modeling and simulation approaches that express crowd movement with mathematical models are widely and actively studied to understand crowd movement and resolve crowd accidents. Existing literature on crowd modeling focuses only on the decision-making of walking behavior. However, route choice, which is a higher-level decision, should also be modeled to construct more practical simulations. Furthermore, crowd simulations that incorporate a route choice model have not been sufficiently evaluated for reproducibility against real data. Therefore, we propose a generalized crowd simulation framework that comprises measurement of actual crowd movement, estimation of a route choice model, and construction of a crowd simulator. We use a discrete choice model as the route choice model and the social force model as the walking model. In experiments, we measure crowd movements during an evacuation drill in a theater and during a fireworks event where tens of thousands of people moved, and show that the crowd simulation incorporating the route choice model reproduces the real large-scale crowd movement more accurately.
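As an illustration of what a discrete choice route model computes, here is a minimal Python sketch. It assumes a multinomial logit form, a common discrete choice model; the abstract does not state which variant the authors estimate. The utility features (distance and congestion) and the coefficient values are hypothetical placeholders, not estimates from the paper.

```python
import math


def route_utility(distance_m, congestion,
                  beta_distance=-0.05, beta_congestion=-1.0):
    """Linear utility of a route. The features and coefficients are
    hypothetical; in practice they would be estimated from measured data."""
    return beta_distance * distance_m + beta_congestion * congestion


def route_choice_probabilities(utilities):
    """Multinomial logit: P(r) = exp(V_r) / sum_k exp(V_k).

    The maximum utility is subtracted before exponentiating for
    numerical stability; this does not change the probabilities.
    """
    highest = max(utilities)
    weights = [math.exp(u - highest) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Two candidate exits: a 50 m route and an 80 m route, equally congested.
utilities = [route_utility(50, 0.2), route_utility(80, 0.2)]
probabilities = route_choice_probabilities(utilities)
print(probabilities)
```

In a combined simulator, each pedestrian would sample a route from these probabilities, and a walking model such as the social force model would then move the pedestrian along the chosen route.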
{"title":"Crowd simulation incorporating a route choice model and similarity evaluation using real large-scale data","authors":"Ryo Nishida, Masaki Onishi, Koichi Hashimoto","doi":"10.48550/arXiv.2302.10421","url":"https://doi.org/10.48550/arXiv.2302.10421","journal":{"name":"Adaptive Agents and Multi-Agent Systems"},"publicationDate":"2023-02-21","publicationTypes":"Journal Article"}