
arXiv - CS - Computer Science and Game Theory: Latest Publications

Online Test Synthesis From Requirements: Enhancing Reinforcement Learning with Game Theory
Pub Date : 2024-07-26 DOI: arxiv-2407.18994
Ocan Sankur (DEVINE, UR), Thierry Jéron (DEVINE, UR), Nicolas Markey (DEVINE, UR), David Mentré (MERCE-France), Reiya Noguchi
We consider the automatic online synthesis of black-box test cases from functional requirements specified as automata for reactive implementations. The goal of the tester is to reach some given state, so as to satisfy a coverage criterion, while monitoring the violation of the requirements. We develop an approach based on Monte Carlo Tree Search, which is a classical technique in reinforcement learning for efficiently selecting promising inputs. Seeing the automata requirements as a game between the implementation and the tester, we develop a heuristic by biasing the search towards inputs that are promising in this game. We experimentally show that our heuristic accelerates the convergence of the Monte Carlo Tree Search algorithm, thus improving the performance of testing.
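The search procedure can be illustrated with a minimal UCB-based Monte Carlo Tree Search over a toy deterministic system. The five-state system, input alphabet, and all names below are invented for illustration; this is a generic MCTS sketch, not the authors' heuristic-biased algorithm.

```python
import math
import random

# Toy deterministic reactive system: states 0..4, inputs 'a'/'b'; the
# tester wants to reach TARGET. This system is invented for illustration.
TARGET = 4
TRANS = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 2, (1, 'b'): 0,
         (2, 'a'): 3, (2, 'b'): 1, (3, 'a'): 4, (3, 'b'): 2,
         (4, 'a'): 4, (4, 'b'): 4}

class Node:
    def __init__(self, state):
        self.state, self.visits, self.value = state, 0, 0.0
        self.children = {}  # input -> Node

def rollout(state, horizon=10):
    # Random playout: reward 1 if the target is reached within the horizon.
    for _ in range(horizon):
        if state == TARGET:
            return 1.0
        state = TRANS[(state, random.choice('ab'))]
    return 1.0 if state == TARGET else 0.0

def mcts(root_state, iters=2000, c=1.4):
    root = Node(root_state)
    for _ in range(iters):
        node, path = root, [root]
        # Selection: descend by UCB1 while the node is fully expanded.
        while len(node.children) == 2 and node.state != TARGET:
            parent = node
            node = max(node.children.values(),
                       key=lambda ch: ch.value / (ch.visits or 1)
                       + c * math.sqrt(math.log(parent.visits + 1)
                                       / (ch.visits or 1)))
            path.append(node)
        # Expansion: try one input that has not been explored yet.
        if node.state != TARGET:
            inp = random.choice([i for i in 'ab' if i not in node.children])
            node.children[inp] = Node(TRANS[(node.state, inp)])
            path.append(node.children[inp])
        # Simulation and backpropagation.
        reward = rollout(path[-1].state)
        for n in path:
            n.visits += 1
            n.value += reward
    # Recommend the most-visited input at the root.
    return max(root.children, key=lambda i: root.children[i].visits)

random.seed(0)
print(mcts(0))  # from state 0, input 'a' heads toward the target
```

With 2,000 playouts the most-visited root input is the one that moves toward the target; per the abstract, the paper's contribution is biasing exactly this selection step with a game-theoretic heuristic.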
Nested replicator dynamics, nested logit choice, and similarity-based learning
Pub Date : 2024-07-25 DOI: arxiv-2407.17815
Panayotis Mertikopoulos, William H. Sandholm
We consider a model of learning and evolution in games whose action sets are endowed with a partition-based similarity structure intended to capture exogenous similarities between strategies. In this model, revising agents have a higher probability of comparing their current strategy with other strategies that they deem similar, and they switch to the observed strategy with probability proportional to its payoff excess. Because of this implicit bias toward similar strategies, the resulting dynamics - which we call the nested replicator dynamics - do not satisfy any of the standard monotonicity postulates for imitative game dynamics; nonetheless, we show that they retain the main long-run rationality properties of the replicator dynamics, albeit at quantitatively different rates. We also show that the induced dynamics can be viewed as a stimulus-response model in the spirit of Erev & Roth (1998), with choice probabilities given by the nested logit choice rule of Ben-Akiva (1973) and McFadden (1978). This result generalizes an existing relation between the replicator dynamics and the exponential weights algorithm in online learning, and provides an additional layer of interpretation to our analysis and results.
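The nested logit rule referenced here (Ben-Akiva 1973; McFadden 1978) is concrete enough to sketch. A minimal Python rendition, with payoffs, nests, and the scale parameter chosen arbitrarily for illustration:

```python
import math

def nested_logit(payoffs, nests, lam):
    """Nested logit choice probabilities: pick a nest in proportion to
    exp(lam * I_k), where I_k is the nest's inclusive value, then pick a
    strategy within the nest by a logit rule with scale 1/lam."""
    probs = {}
    # Inclusive value of each nest: I_k = log sum_{s in nest} exp(u_s / lam).
    incl = [math.log(sum(math.exp(payoffs[s] / lam) for s in nest))
            for nest in nests]
    denom = sum(math.exp(lam * I) for I in incl)
    for nest, I in zip(nests, incl):
        p_nest = math.exp(lam * I) / denom
        within = sum(math.exp(payoffs[s] / lam) for s in nest)
        for s in nest:
            probs[s] = p_nest * math.exp(payoffs[s] / lam) / within
    return probs

# Strategies 'a' and 'b' are deemed similar (same nest); 'c' stands alone.
p = nested_logit({'a': 1.0, 'b': 1.0, 'c': 0.5}, [['a', 'b'], ['c']], lam=0.5)
print(p)  # probabilities sum to 1; 'a' and 'b' get equal mass
```

With `lam = 1` the rule collapses to the ordinary logit choice rule, mirroring how the nested replicator dynamics generalize the replicator dynamics.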
On Approximately Strategy-Proof Tournament Rules for Collusions of Size at Least Three
Pub Date : 2024-07-24 DOI: arxiv-2407.17569
David Mikšaník, Ariel Schvartzman, Jan Soukup
A tournament organizer must select one of $n$ possible teams as the winner of a competition after observing all $\binom{n}{2}$ matches between them. The organizer would like to find a tournament rule that simultaneously satisfies the following desiderata. It must be Condorcet-consistent (henceforth, CC), meaning it selects as the winner the unique team that beats all other teams (if one exists). It must also be strongly non-manipulable for groups of size $k$ at probability $\alpha$ (henceforth, k-SNM-$\alpha$), meaning that no subset of $\leq k$ teams can fix the matches among themselves in order to increase the chances of any of its members being selected by more than $\alpha$. Our contributions are threefold. First, we consider a natural generalization of the Randomized Single Elimination Bracket rule from [Schneider et al. 2017] to $d$-ary trees and provide upper bounds on its manipulability. Then, we propose a novel tournament rule that is CC and 3-SNM-1/2, a strict improvement upon the concurrent work of [Dinev and Weinberg, 2022], who proposed a CC and 3-SNM-31/60 rule. Finally, we initiate the study of reductions among tournament rules.
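The two desiderata can be made concrete with a short sketch: a Condorcet-winner check plus the binary Randomized Single Elimination Bracket, which is Condorcet-consistent because a Condorcet winner wins every match it plays. The 4-team tournament below is invented for illustration; the paper generalizes the bracket to $d$-ary trees.

```python
import random

def condorcet_winner(beats, teams):
    # beats[i][j] is True iff team i beats team j head-to-head.
    for i in teams:
        if all(beats[i][j] for j in teams if j != i):
            return i
    return None  # a Condorcet winner need not exist

def random_single_elim(beats, teams):
    """Randomized Single Elimination Bracket (binary case): seed the
    teams uniformly at random, then play knockout rounds."""
    order = list(teams)
    random.shuffle(order)
    while len(order) > 1:
        order = [a if beats[a][b] else b
                 for a, b in zip(order[::2], order[1::2])]
    return order[0]

# Team 0 beats everyone; teams 1, 2, 3 form a cycle.
beats = [[False, True, True, True],
         [False, False, True, False],
         [False, False, False, True],
         [False, True, False, False]]
teams = range(4)
print(condorcet_winner(beats, teams))    # 0
print(random_single_elim(beats, teams))  # 0, regardless of the seeding
```

Manipulability concerns the other direction: when no Condorcet winner exists, a colluding group may throw matches to steer the random bracket, and the k-SNM-$\alpha$ bounds cap how much probability such a group can gain.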
Data Sharing for Mean Estimation Among Heterogeneous Strategic Agents
Pub Date : 2024-07-20 DOI: arxiv-2407.15881
Alex Clinton, Yiding Chen, Xiaojin Zhu, Kirthevasan Kandasamy
We study a collaborative learning problem where $m$ agents estimate a vector $\mu \in \mathbb{R}^d$ by collecting samples from normal distributions, with each agent $i$ incurring a cost $c_{i,k} \in (0, \infty]$ to sample from the $k^{\text{th}}$ distribution $\mathcal{N}(\mu_k, \sigma^2)$. Instead of working on their own, agents can collect data that is cheap to them, and share it with others in exchange for data that is expensive or even inaccessible to them, thereby simultaneously reducing data collection costs and estimation error. However, when agents have different collection costs, we need to first decide how to fairly divide the work of data collection so as to benefit all agents. Moreover, in naive sharing protocols, strategic agents may under-collect and/or fabricate data, leading to socially undesirable outcomes. Our mechanism addresses these challenges by combining ideas from cooperative and non-cooperative game theory. We use ideas from axiomatic bargaining to divide the cost of data collection. Given such a solution, we develop a Nash incentive-compatible (NIC) mechanism to enforce truthful reporting. We achieve a $\mathcal{O}(\sqrt{m})$ approximation to the minimum social penalty (sum of agent estimation errors and data collection costs) in the worst case, and a $\mathcal{O}(1)$ approximation under favorable conditions. We complement this with a hardness result, showing that $\Omega(\sqrt{m})$ is unavoidable in any NIC mechanism.
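The benefit of sharing is easy to quantify in the simplest symmetric case. The numbers below are arbitrary, and this back-of-the-envelope calculation ignores the costs, strategic behavior, and fairness questions the paper actually tackles:

```python
# Mean estimation from i.i.d. samples with per-sample variance sigma^2:
# the variance of the sample mean is sigma^2 / (number of samples).
sigma2 = 1.0   # per-sample variance
n = 100        # samples each agent collects per distribution
m = 8          # number of agents

var_alone = sigma2 / n          # an agent using only its own samples
var_pooled = sigma2 / (m * n)   # all m agents truthfully pool samples

print(var_alone, var_pooled)  # 0.01 0.00125
```

The factor-$m$ variance reduction is what makes sharing attractive in the first place; the paper's mechanism is about preserving it when agents can under-collect or fabricate.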
Integrated Resource Allocation and Strategy Synthesis in Safety Games on Graphs with Deception
Pub Date : 2024-07-19 DOI: arxiv-2407.14436
Abhishek N. Kulkarni, Matthew S. Cohen, Charles A. Kamhoua, Jie Fu
Deception plays a crucial role in strategic interactions with incomplete information. Motivated by security applications, we study a class of two-player turn-based deterministic games with one-sided incomplete information, in which player 1 (P1) aims to prevent player 2 (P2) from reaching a set of target states. In addition to actions, P1 can place two kinds of deception resources: "traps" and "fake targets" to disinform P2 about the transition dynamics and payoff of the game. Traps "hide the real" by making trap states appear normal, while fake targets "reveal the fiction" by advertising non-target states as targets. We are interested in jointly synthesizing optimal decoy placement and deceptive defense strategies for P1 that exploit P2's misinformation. We introduce a novel hypergame on graph model and two solution concepts: stealthy deceptive sure winning and stealthy deceptive almost-sure winning. These identify states from which P1 can prevent P2 from reaching the target in a finite number of steps or with probability one, without allowing P2 to become aware that it is being deceived. Consequently, determining the optimal decoy placement corresponds to maximizing the size of P1's deceptive winning region. Considering the combinatorial complexity of exploring all decoy allocations, we utilize compositional synthesis concepts to show that the objective function for decoy placement is monotone, non-decreasing, and, in certain cases, sub- or super-modular. This leads to a greedy algorithm for decoy placement, achieving a $(1 - 1/e)$-approximation when the objective function is sub- or super-modular. The proposed hypergame model and solution concepts contribute to understanding the optimal deception resource allocation and deception strategies in various security applications.
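The greedy step from the abstract is standard submodular maximization. Here is a generic sketch with a coverage function standing in for the deceptive-winning-region objective; the decoy names and covered state sets are invented for illustration:

```python
def greedy_max(ground, k, f):
    """Greedy maximization of a monotone set function f under |S| <= k.
    When f is submodular this is a (1 - 1/e)-approximation
    (Nemhauser, Wolsey & Fisher)."""
    S = set()
    for _ in range(k):
        best = max((x for x in ground if x not in S),
                   key=lambda x: f(S | {x}) - f(S), default=None)
        if best is None or f(S | {best}) - f(S) <= 0:
            break
        S.add(best)
    return S

# Stand-in objective: number of states "won" by the placed decoys.
covers = {'d1': {1, 2, 3}, 'd2': {3, 4}, 'd3': {5}, 'd4': {1, 5}}

def f(S):
    won = set()
    for d in S:
        won |= covers[d]
    return len(won)

best = greedy_max(list(covers), 2, f)
print(sorted(best), f(best))  # two decoys covering 4 states
```

The paper's actual objective is the size of P1's deceptive winning region computed on the hypergame; the point of the sub-/super-modularity result is precisely that this cheap greedy loop inherits the $(1 - 1/e)$ guarantee.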
Unravelling in Collaborative Learning
Pub Date : 2024-07-19 DOI: arxiv-2407.14332
Aymeric Capitaine, Etienne Boursier, Antoine Scheid, Eric Moulines, Michael I. Jordan, El-Mahdi El-Mhamdi, Alain Durmus
Collaborative learning offers a promising avenue for leveraging decentralized data. However, collaboration in groups of strategic learners is not a given. In this work, we consider strategic agents who wish to train a model together but have sampling distributions of different quality. The collaboration is organized by a benevolent aggregator who gathers samples so as to maximize total welfare, but is unaware of data quality. This setting allows us to shed light on the deleterious effect of adverse selection in collaborative learning. More precisely, we demonstrate that when data quality indices are private, the coalition may undergo a phenomenon known as unravelling, wherein it shrinks up to the point that it becomes empty or solely comprised of the worst agent. We show how this issue can be addressed without making use of external transfers, by proposing a novel method inspired by probabilistic verification. This approach makes the grand coalition a Nash equilibrium with high probability despite information asymmetry, thereby breaking unravelling.
On sibyl-proof mechanisms
Pub Date : 2024-07-19 DOI: arxiv-2407.14485
Minghao Pan, Akaki Mamageishvili, Christoph Schlegel
We show that in the single-parameter mechanism design environment, the only non-wasteful, symmetric, incentive compatible and sibyl-proof mechanism is a second price auction with symmetric tie-breaking. Thus, if there is private information, lotteries or other mechanisms that do not always allocate to a highest-value bidder are not sibyl-proof or not incentive compatible.
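A minimal sketch of the characterized mechanism; bidder names and values are illustrative, and at least two bidders are assumed:

```python
import random

def second_price(bids):
    """Second-price auction with symmetric tie-breaking: a highest bidder
    is chosen uniformly at random and pays the highest competing bid.
    Assumes at least two bidders."""
    top = max(bids.values())
    winner = random.choice([i for i, b in bids.items() if b == top])
    price = max(b for i, b in bids.items() if i != winner)
    return winner, price

print(second_price({'alice': 10, 'bob': 7}))  # ('alice', 7)
print(second_price({'alice': 10, 'bob': 7, 'carol': 10}))  # tied winner random, pays 10
```

Intuitively, splitting one bidder into several fake identities (sibyls) cannot help here: the sibyls' extra bids can only raise the price the real identity faces, never lower it, which is the sibyl-proofness property the paper pins down.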
Persuading while Learning
Pub Date : 2024-07-19 DOI: arxiv-2407.13964
Itai Arieli, Yakov Babichenko, Dimitry Shaiderman, Xianwen Shi
We propose a dynamic product adoption persuasion model involving an impatient, partially informed sender who gradually learns the state. In this model, the sender gathers information over time, and hence her posteriors' sequence forms a discrete-time martingale. The sender commits to a dynamic revelation policy to persuade the agent to adopt a product. We demonstrate that under the assumption that the sender's martingale possesses Blackwell-preserving kernels, the family of optimal strategies for the sender takes an interval form; namely, in every period the set of martingale realizations in which adoption occurs is an interval. Utilizing this, we prove that if the sender is sufficiently impatient, then under a random walk martingale, the optimal policy is fully transparent up to the moment of adoption; namely, the sender reveals the entire information she privately holds in every period.
Optimal Strategies in Ranked-Choice Voting
Pub Date : 2024-07-18 DOI: arxiv-2407.13661
Sanyukta Deshpande, Nikhil Garg, Sheldon Jacobson
Ranked Choice Voting (RCV) and Single Transferable Voting (STV) are widely valued, but are complex to understand due to intricate per-round vote transfers. Questions like determining how far a candidate is from winning or identifying effective election strategies are computationally challenging, as minor changes in voter rankings can lead to significant ripple effects - for example, lending support to a losing candidate can prevent their votes from transferring to a more competitive opponent. We study optimal strategies - persuading voters to change their ballots or adding new voters - both algorithmically and theoretically. Algorithmically, we develop efficient methods to reduce election instances while maintaining optimization accuracy, effectively circumventing the computational complexity barrier. Theoretically, we analyze the effectiveness of strategies under both perfect and imperfect polling information. Our algorithmic approach applies to the ranked-choice polling data on the US 2024 Republican Primary, finding, for example, that several candidates would have been optimally served by boosting another candidate instead of themselves.
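The per-round transfers can be sketched with a plain instant-runoff count (single-winner RCV). The five illustrative ballots below rank candidates best-first:

```python
from collections import Counter

def irv_winner(ballots):
    """Instant-runoff voting: while no candidate holds a strict majority
    of active first choices, eliminate the candidate with the fewest and
    transfer those ballots to their next surviving choice."""
    active = {c for ballot in ballots for c in ballot}
    while True:
        tally = Counter(next(c for c in ballot if c in active)
                        for ballot in ballots
                        if any(c in active for c in ballot))
        if max(tally.values()) * 2 > sum(tally.values()):
            return tally.most_common(1)[0][0]
        active.remove(min(tally, key=tally.get))

# 'C' is eliminated first; its supporter's ballot transfers to 'B',
# breaking the 2-2 tie between 'A' and 'B'.
ballots = [('A', 'B'), ('A', 'B'), ('B', 'A'), ('B', 'A'), ('C', 'B')]
print(irv_winner(ballots))  # B
```

This also shows the ripple effect the abstract describes: moving one 'A' voter to rank 'C' first would change which candidate is eliminated in round one, and hence where the transfers flow.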
排名选择投票(RCV)和单一可转移投票(STV)受到广泛重视,但由于每轮投票的转移错综复杂,因此很难理解。确定候选人离获胜还有多远或识别有效的选举策略等问题在计算上极具挑战性,因为选民排名的微小变化都可能导致巨大的连锁反应--例如,向落选候选人提供支持可能会阻止他们的选票转移到更具竞争力的对手手中。我们从算法和理论两方面研究了最优策略--说服选民更改选票或增加新选民。在算法上,我们开发了有效的方法来减少选举实例,同时保持优化的准确性,有效地规避了计算复杂性障碍。理论上,我们分析了完美和不完美投票信息下策略的有效性。我们的算法方法适用于美国 2024 年共和党初选的排序选择投票数据,例如,我们发现有几位候选人本可以通过助推另一位候选人而不是自己来达到最佳效果。
{"title":"Optimal Strategies in Ranked-Choice Voting","authors":"Sanyukta Deshpande, Nikhil Garg, Sheldon Jacobson","doi":"arxiv-2407.13661","DOIUrl":"https://doi.org/arxiv-2407.13661","url":null,"abstract":"Ranked Choice Voting (RCV) and Single Transferable Voting (STV) are widely\u0000valued; but are complex to understand due to intricate per-round vote\u0000transfers. Questions like determining how far a candidate is from winning or\u0000identifying effective election strategies are computationally challenging as\u0000minor changes in voter rankings can lead to significant ripple effects - for\u0000example, lending support to a losing candidate can prevent their votes from\u0000transferring to a more competitive opponent. We study optimal strategies -\u0000persuading voters to change their ballots or adding new voters - both\u0000algorithmically and theoretically. Algorithmically, we develop efficient\u0000methods to reduce election instances while maintaining optimization accuracy,\u0000effectively circumventing the computational complexity barrier. Theoretically,\u0000we analyze the effectiveness of strategies under both perfect and imperfect\u0000polling information. 
Our algorithmic approach applies to the ranked-choice\u0000polling data on the US 2024 Republican Primary, finding, for example, that\u0000several candidates would have been optimally served by boosting another\u0000candidate instead of themselves.","PeriodicalId":501316,"journal":{"name":"arXiv - CS - Computer Science and Game Theory","volume":"37 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 0
Truthful and Almost Envy-Free Mechanism of Allocating Indivisible Goods: the Power of Randomness
Pub Date: 2024-07-18 DOI: arxiv-2407.13634
Xiaolin Bu, Biaoshuai Tao
We study the problem of fairly and truthfully allocating $m$ indivisible items to $n$ agents with additive preferences. Specifically, we consider truthful mechanisms outputting allocations that satisfy EF$^{+u}_{-v}$, where, in an EF$^{+u}_{-v}$ allocation, for any pair of agents $i$ and $j$, agent $i$ will not envy agent $j$ if $u$ items were added to $i$'s bundle and $v$ items were removed from $j$'s bundle. Previous work easily indicates that, when restricted to deterministic mechanisms, truthfulness will lead to a poor guarantee of fairness: even with two agents, for any $u$ and $v$, EF$^{+u}_{-v}$ cannot be guaranteed by truthful mechanisms when the number of items is large enough. In this work, we focus on randomized mechanisms, where we consider ex-ante truthfulness and ex-post fairness. For two agents, we present a truthful mechanism that achieves EF$^{+0}_{-1}$ (i.e., the well-studied fairness notion EF$1$). For three agents, we present a truthful mechanism that achieves EF$^{+1}_{-1}$. For $n$ agents in general, we show that there exist truthful mechanisms that achieve EF$^{+u}_{-v}$ for some $u$ and $v$ that depend only on $n$ (not $m$). We further consider fair and truthful mechanisms that also satisfy the standard efficiency guarantee: Pareto-optimality. We provide a mechanism that simultaneously achieves truthfulness, EF$1$, and Pareto-optimality for bi-valued utilities (where agents' valuation on each item is either $p$ or $q$ for some $p>q\geq0$). For tri-valued utilities (where agents' valuations on each item belong to $\{p,q,r\}$ for some $p>q>r\geq0$) and any $u,v$, we show that truthfulness is incompatible with EF$^{+u}_{-v}$ and Pareto-optimality even for two agents.
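The EF$^{+0}_{-1}$ condition (EF1) that the two-agent mechanism targets can be checked directly for a given allocation. This is an illustrative sketch under additive valuations, not the paper's mechanism; the `is_ef1` helper and the numeric valuations are invented for the example:

```python
def is_ef1(valuations, bundles):
    """Check EF1 (EF^{+0}_{-1}): for every pair (i, j), agent i does not
    envy agent j once i's most-valued single item is removed from j's bundle.

    valuations[i][g]: additive value agent i places on item g.
    bundles[i]: set of item indices held by agent i.
    """
    n = len(bundles)
    for i in range(n):
        v_own = sum(valuations[i][g] for g in bundles[i])
        for j in range(n):
            if i == j:
                continue
            v_other = sum(valuations[i][g] for g in bundles[j])
            # Best single item i could remove from j's bundle.
            best = max((valuations[i][g] for g in bundles[j]), default=0)
            if v_own < v_other - best:
                return False
    return True

valuations = [[4, 3, 3], [4, 3, 3]]   # identical additive valuations
bundles = [{0}, {1, 2}]
print(is_ef1(valuations, bundles))    # True: EF1 holds even though agent 0 envies agent 1
```

The example illustrates the gap between EF and EF1: agent 0 values its own bundle at 4 and agent 1's at 6, so plain envy-freeness fails, but removing one item (worth 3) from agent 1's bundle eliminates the envy.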
{"title":"Truthful and Almost Envy-Free Mechanism of Allocating Indivisible Goods: the Power of Randomness","authors":"Xiaolin Bu, Biaoshuai Tao","doi":"arxiv-2407.13634","DOIUrl":"https://doi.org/arxiv-2407.13634","url":null,"abstract":"We study the problem of fairly and truthfully allocating $m$ indivisible\u0000items to $n$ agents with additive preferences. Specifically, we consider\u0000truthful mechanisms outputting allocations that satisfy EF$^{+u}_{-v}$, where,\u0000in an EF$^{+u}_{-v}$ allocation, for any pair of agents $i$ and $j$, agent $i$\u0000will not envy agent $j$ if $u$ items were added to $i$'s bundle and $v$ items\u0000were removed from $j$'s bundle. Previous work easily indicates that, when\u0000restricted to deterministic mechanisms, truthfulness will lead to a poor\u0000guarantee of fairness: even with two agents, for any $u$ and $v$,\u0000EF$^{+u}_{-v}$ cannot be guaranteed by truthful mechanisms when the number of\u0000items is large enough. In this work, we focus on randomized mechanisms, where\u0000we consider ex-ante truthfulness and ex-post fairness. For two agents, we\u0000present a truthful mechanism that achieves EF$^{+0}_{-1}$ (i.e., the\u0000well-studied fairness notion EF$1$). For three agents, we present a truthful\u0000mechanism that achieves EF$^{+1}_{-1}$. For $n$ agents in general, we show that\u0000there exist truthful mechanisms that achieve EF$^{+u}_{-v}$ for some $u$ and\u0000$v$ that depend only on $n$ (not $m$). We further consider fair and truthful mechanisms that also satisfy the\u0000standard efficiency guarantee: Pareto-optimality. We provide a mechanism that\u0000simultaneously achieves truthfulness, EF$1$, and Pareto-optimality for\u0000bi-valued utilities (where agents' valuation on each item is either $p$ or $q$\u0000for some $p>qgeq0$). 
For tri-valued utilities (where agents' valuations on\u0000each item belong to ${p,q,r}$ for some $p>q>rgeq0$) and any $u,v$, we show\u0000that truthfulness is incompatible with EF$^{+u}_{-v}$ and Pareto-optimality\u0000even for two agents.","PeriodicalId":501316,"journal":{"name":"arXiv - CS - Computer Science and Game Theory","volume":"20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 0
Journal
arXiv - CS - Computer Science and Game Theory