Hierarchical Abstraction, Distributed Equilibrium Computation, and Post-Processing, with Application to a Champion No-Limit Texas Hold'em Agent

AAAI Workshop: Computer Poker and Imperfect Information. Pub Date: 2015-05-04. DOI: 10.5555/2772879.2772885

Noam Brown, Sam Ganzfried, T. Sandholm

Citations: 65

Abstract

The leading approach for solving large imperfect-information games is automated abstraction followed by an equilibrium-finding algorithm. We introduce a distributed version of the most commonly used equilibrium-finding algorithm, counterfactual regret minimization (CFR), which enables CFR to scale to dramatically larger abstractions and numbers of cores. The new algorithm imposes constraints on the abstraction so that the pieces running on different computers are disjoint. We introduce an algorithm for generating such abstractions while capitalizing on state-of-the-art abstraction ideas such as imperfect recall and earth-mover's distance. Our techniques enabled an equilibrium computation of unprecedented size on a supercomputer with high inter-blade memory latency. Prior approaches run slowly on this architecture. Our approach also leads to a significant improvement over the prior best approach on a large shared-memory server with low memory latency. Finally, we introduce a family of post-processing techniques that outperform prior ones. We applied these techniques to generate an agent for two-player no-limit Texas Hold'em, called Tartanian7, that won the 2014 Annual Computer Poker Competition, beating each opponent with statistical significance.
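
For readers unfamiliar with CFR, the sketch below shows regret matching, the standard rule CFR uses at each information set to turn accumulated regrets into a strategy. It is a minimal illustration only: the function name and the list-based representation are assumptions made here, not taken from the paper, and it does not show the distributed algorithm, the abstraction, or the post-processing described above.

```python
from typing import List


def regret_matching(cumulative_regrets: List[float]) -> List[float]:
    """Return a strategy proportional to the positive cumulative regrets.

    Hypothetical helper for illustration; one entry per available action.
    """
    # Negative regrets are clipped to zero: only actions we wish we had
    # played more often receive probability mass.
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to the uniform strategy.
    n = len(cumulative_regrets)
    return [1.0 / n] * n


if __name__ == "__main__":
    # Regrets 2, -1, 6 give positive parts 2, 0, 6, which sum to 8.
    print(regret_matching([2.0, -1.0, 6.0]))  # [0.25, 0.0, 0.75]
```

In a full CFR implementation this rule would be applied at every information set on every iteration, and the average of the resulting strategies over iterations is what approaches an equilibrium in two-player zero-sum games.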
