Learning in Games with Progressive Hiding

arXiv - CS - Computer Science and Game Theory, Pub Date: 2024-09-05, DOI: arxiv-2409.03875

Heymann Benjamin, Lanctot Marc


Citations: 0


Learning in Games with progressive hiding

When learning to play an imperfect-information game, it is often easier to start with the basic mechanics of the game rules. For example, one can play several example rounds with private cards revealed to all players to better understand the basic actions and their effects. Building on this intuition, this paper introduces progressive hiding, an algorithm that learns to play imperfect-information games by first learning the basic mechanics and then progressively adding information constraints over time. Progressive hiding is inspired by methods from stochastic multistage optimization, such as scenario decomposition and progressive hedging. We prove that it enables the adaptation of counterfactual regret minimization to games where perfect recall is not satisfied. Numerical experiments illustrate that progressive hiding can achieve the optimal payoff on a benchmark emergent-communication trading game.
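The abstract names progressive hedging as one inspiration. As a rough illustration of that classical idea only, and not of the paper's progressive hiding algorithm (whose details are not given on this page), the sketch below runs progressive hedging on a toy problem: a single decision x must be chosen before a hidden scenario s is revealed, with cost (x - a_s)^2 in scenario s. Each scenario is first solved as if the hidden information were known, and a penalty then progressively forces the per-scenario decisions to agree. The function name, the quadratic cost, and the parameters `rho` and `iters` are assumptions made for this example.

```python
def progressive_hedging(targets, probs, rho=1.0, iters=200):
    """Toy progressive hedging for min_x sum_s p_s * (x - a_s)**2.

    targets: the scenario values a_s (hypothetical toy data).
    probs:   the scenario probabilities p_s, summing to 1.
    Returns the consensus decision and the per-scenario decisions.
    """
    # Start with full information: each scenario picks its own best x.
    xs = list(targets)
    # Multipliers on the constraint "x_s equals the consensus value".
    ws = [0.0] * len(targets)
    for _ in range(iters):
        # Consensus candidate: probability-weighted average decision.
        x_bar = sum(p * x for p, x in zip(probs, xs))
        # Price each scenario's deviation from the consensus.
        ws = [w + rho * (x - x_bar) for w, x in zip(ws, xs)]
        # Closed-form minimizer of
        #   (x - a)**2 + w * x + (rho / 2) * (x - x_bar)**2
        xs = [(2 * a - w + rho * x_bar) / (2 + rho)
              for a, w in zip(targets, ws)]
    x_bar = sum(p * x for p, x in zip(probs, xs))
    return x_bar, xs


if __name__ == "__main__":
    consensus, per_scenario = progressive_hedging([0.0, 4.0, 10.0],
                                                  [0.5, 0.3, 0.2])
    print(consensus, per_scenario)
```

For this quadratic toy problem the consensus value approaches the probability-weighted mean of the targets, which is the best decision when the scenario cannot be observed. The per-scenario decisions start out different and are gradually pulled together, which loosely mirrors the "learn with information revealed, then add information constraints" intuition described above.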
