Automatically designing counterfactual regret minimization algorithms for solving imperfect-information games

Artificial Intelligence (IF 5.1; Zone 2, Computer Science; JCR Q1, Computer Science, Artificial Intelligence) Pub Date: 2024-10-11 DOI: 10.1016/j.artint.2024.104232
Kai Li, Hang Xu, Haobo Fu, Qiang Fu, Junliang Xing
Artificial Intelligence, Volume 337, Article 104232. https://www.sciencedirect.com/science/article/pii/S0004370224001681

Abstract

Strategic decision-making in imperfect-information games is an important problem in artificial intelligence. Counterfactual regret minimization (CFR), a family of iterative algorithms, has been the workhorse for solving these types of games since its inception. In recent years, a series of novel CFR variants have been proposed, significantly improving the convergence rate of vanilla CFR. However, most of these new variants are hand-designed by researchers through trial and error, often based on different motivations, which generally requires a tremendous amount of effort and insight. This work proposes AutoCFR, a systematic framework that meta-learns novel CFR algorithms through evolution, easing the burden of manual algorithm design. We first design a search language that is rich enough to represent various CFR variants. We then exploit a scalable regularized evolution algorithm with a set of acceleration techniques to efficiently search over the combinatorial space of algorithms defined by this language. The learned novel CFR algorithm can generalize to new imperfect-information games not seen during training and performs on par with or better than existing state-of-the-art CFR variants. In addition to superior empirical performance, we also theoretically show that the learned algorithm converges to an approximate Nash equilibrium. Extensive experiments across diverse imperfect-information games highlight the scalability, extensibility, and generalizability of AutoCFR, establishing it as a general-purpose framework for solving imperfect-information games.
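
The regularized evolution search mentioned in the abstract can be illustrated with a minimal Python sketch. This is not AutoCFR's implementation: the candidate representation (a tuple of three floats), the toy fitness function, the mutation operator, and all parameter names and values are assumptions made for illustration. In AutoCFR the candidates would be programs written in the search language, and the abstract does not specify how they are scored. The sketch only shows the generic loop of regularized (aging) evolution: sample a tournament, mutate its fittest member, add the child, and remove the oldest individual rather than the worst.

```python
import random
from collections import deque


def regularized_evolution(fitness, random_candidate, mutate,
                          population_size=20, sample_size=5, cycles=200,
                          seed=0):
    """Generic regularized (aging) evolution loop.

    All arguments are hypothetical placeholders, not AutoCFR's API.
    Returns (fitness, candidate) for the best individual ever evaluated.
    """
    rng = random.Random(seed)
    population = deque()
    best = None

    # Initialise the population with random candidates.
    for _ in range(population_size):
        cand = random_candidate(rng)
        scored = (fitness(cand), cand)
        population.append(scored)
        if best is None or scored[0] > best[0]:
            best = scored

    for _ in range(cycles):
        # Tournament selection: the fittest of a random sample is the parent.
        tournament = rng.sample(list(population), sample_size)
        parent = max(tournament, key=lambda s: s[0])
        child = mutate(parent[1], rng)
        scored = (fitness(child), child)
        population.append(scored)
        # Age-based removal: drop the oldest individual, not the worst one.
        population.popleft()
        if scored[0] > best[0]:
            best = scored
    return best


# Toy search space (assumption): find three coefficients close to TARGET.
TARGET = (1.5, 2.0, 0.0)


def toy_fitness(cand):
    # Higher is better; 0 is a perfect match.
    return -sum((a - b) ** 2 for a, b in zip(cand, TARGET))


def toy_random(rng):
    return tuple(rng.uniform(0.0, 3.0) for _ in range(3))


def toy_mutate(cand, rng):
    # Perturb one randomly chosen coordinate with Gaussian noise.
    i = rng.randrange(len(cand))
    new = list(cand)
    new[i] += rng.gauss(0.0, 0.3)
    return tuple(new)


best_fitness, best_candidate = regularized_evolution(
    toy_fitness, toy_random, toy_mutate)
print(best_fitness, best_candidate)
```

Removing the oldest individual keeps any single candidate from dominating the population indefinitely, so a candidate's lineage survives only if its descendants keep scoring well when re-evaluated as new individuals.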
Source journal
Artificial Intelligence (Engineering and Technology; Computer Science: Artificial Intelligence)
CiteScore: 11.20
Self-citation rate: 1.40%
Articles published: 118
Review time: 8 months
About the journal: The Journal of Artificial Intelligence (AIJ) welcomes papers covering a broad spectrum of AI topics, including cognition, automated reasoning, computer vision, machine learning, and more. Papers should demonstrate advancements in AI and propose innovative approaches to AI problems. The journal also accepts papers describing AI applications, provided they focus on how new methods enhance performance rather than reiterating conventional approaches. In addition to regular papers, AIJ accepts Research Notes, Research Field Reviews, Position Papers, Book Reviews, and summary papers on AI challenges and competitions.