Learning in Games with progressive hiding
Heymann Benjamin, Lanctot Marc
arXiv - CS - Computer Science and Game Theory, 2024-09-05
https://doi.org/arxiv-2409.03875

When learning to play an imperfect information game, it is often easier to
start with the basic mechanics of the game. For example, one can
play several example rounds with private cards revealed to all players to
better understand the basic actions and their effects. Building on this
intuition, this paper introduces progressive hiding, an algorithm that
learns to play imperfect information games by first learning the basic
mechanics and then progressively adding information constraints over time.
Progressive hiding is inspired by methods from stochastic multistage
optimization such as scenario decomposition and progressive hedging. We prove
that it enables the adaptation of counterfactual regret minimization to games
where perfect recall is not satisfied. Numerical experiments illustrate that
progressive hiding can achieve the optimal payoff in a benchmark
emergent-communication trading game.
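
The abstract does not spell out the algorithm, so the following is only a minimal sketch of regret matching, the per-decision update that counterfactual regret minimization is built on. It is not progressive hiding itself. The rock-paper-scissors payoff table, the fixed opponent, and the names `strategy_from_regrets` and `regret_matching` are all assumptions chosen for illustration.

```python
# Illustrative only: regret matching against a fixed opponent in
# rock-paper-scissors. This is NOT the paper's progressive hiding algorithm.

# Payoff to the learner: PAYOFF[a][b] for actions 0=rock, 1=paper, 2=scissors,
# where a is the learner's action and b is the opponent's action.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total > 0:
        return [p / total for p in positive]
    # No action has positive regret yet: fall back to uniform play.
    return [1.0 / n] * n


def regret_matching(opponent, iterations):
    """Return the learner's average strategy after the given iterations."""
    n = len(PAYOFF)
    cum_regret = [0.0] * n
    cum_strategy = [0.0] * n
    for _ in range(iterations):
        strategy = strategy_from_regrets(cum_regret)
        # Expected payoff of each pure action against the opponent's mix.
        action_values = [
            sum(PAYOFF[a][b] * opponent[b] for b in range(n)) for a in range(n)
        ]
        expected = sum(strategy[a] * action_values[a] for a in range(n))
        for a in range(n):
            # Regret: how much better action a would have done than the mix.
            cum_regret[a] += action_values[a] - expected
            cum_strategy[a] += strategy[a]
    return [s / iterations for s in cum_strategy]


# Hypothetical opponent that always plays rock.
average = regret_matching(opponent=[1.0, 0.0, 0.0], iterations=1000)
```

Against an opponent that always plays rock, the average strategy concentrates on paper. In a full counterfactual regret minimization implementation, an update of this kind would run at every information set, with action values weighted by counterfactual reach probabilities.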