Stochastic Games with Synchronization Objectives

Journal of the ACM · Impact factor: 2.3 · Zone 2 (Computer Science) · JCR Q2 (Computer Science, Hardware & Architecture) · Publication date: 2023-05-23 · DOI: https://dl.acm.org/doi/10.1145/3588866
Laurent Doyen
Citations: 0

Abstract

Stochastic Games with Synchronization Objectives

We consider two-player stochastic games played on a finite graph for infinitely many rounds. Stochastic games generalize both Markov decision processes (MDPs), by adding an adversary player, and two-player deterministic games, by adding stochasticity. The outcome of the game is a sequence of distributions over the graph states, representing the evolution of a population consisting of a continuum of identical copies of a process modeled by the game graph. We consider synchronization objectives, which require the probability mass to accumulate in a set of target states, either always, once, infinitely often, or always after some point in the outcome sequence; and the winning modes of sure winning (if the accumulated probability is equal to 1) and almost-sure winning (if the accumulated probability is arbitrarily close to 1).
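The notion of an outcome as a sequence of distributions can be illustrated with a minimal sketch. It assumes both players have already fixed memoryless strategies, so the game collapses to a plain Markov chain; the three-state chain, the target set `{"c"}`, and all function names are hypothetical and not taken from the paper. Only a finite prefix of the outcome is inspected, so the sketch can check the "once" and "always" objectives on that prefix, but it cannot decide "infinitely often" or "always after some point".

```python
from fractions import Fraction


def step(dist, chain):
    """Push a distribution over states one round forward through the chain."""
    nxt = {}
    for state, mass in dist.items():
        for succ, prob in chain[state].items():
            nxt[succ] = nxt.get(succ, Fraction(0)) + mass * prob
    return nxt


def mass_in_target(dist, target):
    """Total probability mass currently sitting in the target states."""
    return sum((m for s, m in dist.items() if s in target), Fraction(0))


def target_mass_prefix(initial, chain, target, rounds):
    """Return the target mass of the distributions X_0, ..., X_rounds."""
    masses, dist = [], dict(initial)
    for _ in range(rounds + 1):
        masses.append(mass_in_target(dist, target))
        dist = step(dist, chain)
    return masses


def synchronized_once(masses, eps=Fraction(0)):
    """True if some distribution in the prefix has target mass >= 1 - eps."""
    return any(m >= 1 - eps for m in masses)


def synchronized_always(masses, eps=Fraction(0)):
    """True if every distribution in the prefix has target mass >= 1 - eps."""
    return all(m >= 1 - eps for m in masses)


# Hypothetical chain (assumption): from "a" the mass splits evenly,
# and everything ends up in the absorbing state "c".
half = Fraction(1, 2)
chain = {
    "a": {"b": half, "c": half},
    "b": {"c": Fraction(1)},
    "c": {"c": Fraction(1)},
}
masses = target_mass_prefix({"a": Fraction(1)}, chain, {"c"}, rounds=3)
print(masses)                       # target mass per round: 0, 1/2, 1, 1
print(synchronized_once(masses))    # True: all mass is in "c" at round 2
print(synchronized_always(masses))  # False: round 0 has no mass in "c"
```

Setting `eps` to 0 corresponds to the mass being exactly 1, while a small positive `eps` approximates the "arbitrarily close to 1" mode on the inspected prefix. Exact `Fraction` arithmetic is used so that the comparison with 1 is not affected by floating-point rounding.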

We present algorithms to compute the set of winning distributions for each of these synchronization modes, showing that the corresponding decision problem is PSPACE-complete for synchronizing once and infinitely often, and PTIME-complete for synchronizing always and always after some point. These bounds are remarkably in line with the special case of MDPs, while the algorithmic solution and proof technique are considerably more involved, even for deterministic games. This is because those games have a flavor of imperfect information: in particular, they are not determined, and randomized strategies need to be considered even if there is no stochastic choice in the game graph. Moreover, in combination with stochasticity in the game graph, finite-memory strategies are not sufficient in general.

Source journal: Journal of the ACM
Category: Engineering & Technology, Computer Science: Theory & Methods
CiteScore: 7.50
Self-citation rate: 0.00%
Articles published: 51
Review time: 3 months
Journal description: The best indicator of the scope of the journal is provided by the areas covered by its Editorial Board. These areas change from time to time, as the field evolves. The following areas are currently covered by a member of the Editorial Board: Algorithms and Combinatorial Optimization; Algorithms and Data Structures; Algorithms, Combinatorial Optimization, and Games; Artificial Intelligence; Complexity Theory; Computational Biology; Computational Geometry; Computer Graphics and Computer Vision; Computer-Aided Verification; Cryptography and Security; Cyber-Physical, Embedded, and Real-Time Systems; Database Systems and Theory; Distributed Computing; Economics and Computation; Information Theory; Logic and Computation; Logic, Algorithms, and Complexity; Machine Learning and Computational Learning Theory; Networking; Parallel Computing and Architecture; Programming Languages; Quantum Computing; Randomized Algorithms and Probabilistic Analysis of Algorithms; Scientific Computing and High Performance Computing; Software Engineering; Web Algorithms and Data Mining