Undiscounted two-person zero-sum communicating stochastic games

M. Baykal-Gursoy, Z. Avsar
Proceedings of the 38th IEEE Conference on Decision and Control (Cat. No.99CH36304), 1999-12-07
DOI: 10.1109/CDC.1999.832844 (https://doi.org/10.1109/CDC.1999.832844)
Citations: 2

Abstract

Consider two-person zero-sum communicating stochastic games with finite state and finite action spaces under the long-run average payoff criterion. A communicating game is irreducible on a restricted strategy space in which every pair of actions is taken with positive probability. The proposed approach applies Hoffman and Karp's (1966) algorithm for irreducible games successively over a sequence of increasingly large restricted strategy spaces until, for any given ε > 0, an ε-optimal pair of stationary policies is obtained. The algorithm converges for games that have optimal strategies and a value independent of the initial state.
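
The role of the restricted strategy space can be illustrated with a small sketch. It is not the authors' algorithm: it only shows, on a made-up two-state game, how forcing every action to have probability at least `delta` can turn a reducible induced Markov chain into an irreducible one. The mixing rule in `restrict`, the names `induced_chain` and `is_irreducible`, the transition table `P`, and the value `delta = 0.1` are all assumptions chosen for illustration.

```python
from collections import deque


def restrict(strategy, delta):
    """Mix a distribution with the uniform one so every action has
    probability >= delta (an assumed way to build a restricted space)."""
    n = len(strategy)
    if not 0 < delta <= 1.0 / n:
        raise ValueError("delta must lie in (0, 1/n]")
    return [delta + (1 - n * delta) * p for p in strategy]


def induced_chain(P, f, g):
    """Transition matrix of the Markov chain induced by stationary
    strategies f (player 1) and g (player 2).
    P[s][a][b][t] = probability of moving from s to t under actions (a, b)."""
    S = len(P)
    return [
        [
            sum(
                f[s][a] * g[s][b] * P[s][a][b][t]
                for a in range(len(f[s]))
                for b in range(len(g[s]))
            )
            for t in range(S)
        ]
        for s in range(S)
    ]


def is_irreducible(M):
    """A finite chain is irreducible iff its transition graph is strongly
    connected: state 0 reaches every state and every state reaches state 0."""
    n = len(M)

    def covers_all(weight):
        seen, queue = {0}, deque([0])
        while queue:
            s = queue.popleft()
            for t in range(n):
                if weight(s, t) > 0 and t not in seen:
                    seen.add(t)
                    queue.append(t)
        return len(seen) == n

    forward = covers_all(lambda s, t: M[s][t])   # follow edges s -> t
    backward = covers_all(lambda s, t: M[t][s])  # follow edges reversed
    return forward and backward


# Hypothetical game: in each state the action pair (0, 0) stays put,
# every other action pair moves to the other state.
P = [
    [[[1, 0], [0, 1]], [[0, 1], [0, 1]]],  # transitions from state 0
    [[[0, 1], [1, 0]], [[1, 0], [1, 0]]],  # transitions from state 1
]
pure = [[1, 0], [1, 0]]  # both players always pick action 0
delta = 0.1
mixed = [restrict(p, delta) for p in pure]

reducible_case = is_irreducible(induced_chain(P, pure, pure))
restricted_case = is_irreducible(induced_chain(P, mixed, mixed))
print(reducible_case, restricted_case)
```

Under the pure strategies each state is absorbing, so the chain is reducible. Once both players are restricted to put at least `delta` on every action, each state reaches the other with positive probability. Letting `delta` shrink toward zero enlarges the restricted space, which matches the abstract's description of a sequence of restricted strategy spaces that gets larger.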