Multi-agent Dual Level Reinforcement Learning of Strategy and Tactics in Competitive Games

Chengping Yuan, Md Abdullah Al Forhad, Ranak Bansal, Anna Sidorova, Mark V. Albert
{"title":"Multi-agent Dual Level Reinforcement Learning of Strategy and Tactics in Competitive Games","authors":"Chengping Yuan ,&nbsp;Md Abdullah Al Forhad ,&nbsp;Ranak Bansal ,&nbsp;Anna Sidorova ,&nbsp;Mark V. Albert","doi":"10.1016/j.rico.2024.100471","DOIUrl":null,"url":null,"abstract":"<div><p>Reinforcement learning has been used extensively to learn the low-level tactical choices during gameplay; however, less effort is invested in the strategic decisions governing the effective engagement of a diverse set of opponents. In this paper, a two-tier reinforcement learning model is created to play competitive games and effectively engage in matches with different opponents to maximize earnings. The multi-agent environment has four types of learners, which vary in their ability to learn gameplay directly (tactics) and their ability to learn to bet or withdraw from gameplay (strategy). The players are tested in three different competitive games: Connect 4, Dots and Boxes, and Tic-Tac-Toe. Analyzing the behavior of players as they progress from naivety to game mastery reveals some interesting features: (1) learners who optimize strategy and tactics outperform all learners, (2) learners who initially optimize their strategy to engage in matches outperform those who focus on optimizing tactical gameplay, and (3) the advantage of strategy optimization versus tactical gameplay optimization diminishes as more games are played. A reinforcement learning model with a dual learning scheme presents possible applications in adversarial scenarios where both strategic and tactical learning are critical. We present detailed results in a systematic manner, providing strong support for our claim.</p></div>","PeriodicalId":34733,"journal":{"name":"Results in Control and Optimization","volume":"16 ","pages":"Article 100471"},"PeriodicalIF":0.0000,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666720724001012/pdfft?md5=d89231a731f9cb6cfdcb9dcc91471705&pid=1-s2.0-S2666720724001012-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Results in Control and Optimization","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666720724001012","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Mathematics","Score":null,"Total":0}

Abstract

Reinforcement learning has been used extensively to learn low-level tactical choices during gameplay; however, far less effort has been invested in the strategic decisions that govern how to engage a diverse set of opponents effectively. In this paper, a two-tier reinforcement learning model is created to play competitive games and to engage in matches with different opponents so as to maximize earnings. The multi-agent environment contains four types of learners, which vary in their ability to learn gameplay directly (tactics) and in their ability to learn when to bet on or withdraw from a match (strategy). The players are tested in three competitive games: Connect 4, Dots and Boxes, and Tic-Tac-Toe. Analyzing the behavior of players as they progress from naivety to game mastery reveals several interesting features: (1) learners who optimize both strategy and tactics outperform all other learners, (2) learners who initially optimize their strategy for engaging in matches outperform those who focus on optimizing tactical gameplay, and (3) the advantage of strategy optimization over tactical gameplay optimization diminishes as more games are played. A reinforcement learning model with such a dual learning scheme has possible applications in adversarial scenarios where both strategic and tactical learning are critical. We present detailed results in a systematic manner, providing strong support for these claims.
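
The dual-level scheme described above separates a strategy learner (whether to engage or withdraw against a given opponent) from a tactics learner (which move to play once engaged). The sketch below illustrates one plausible way to compose such an agent from two tabular Q-style learners; the class names, the epsilon-greedy action selection, and the update rules are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict

# A minimal sketch of the dual-level idea from the abstract, under assumed
# tabular Q-learning at both levels. Not the paper's actual method.

class TacticsLearner:
    """Low-level learner: picks moves within a game via tabular Q-learning."""
    def __init__(self, epsilon=0.1, alpha=0.5, gamma=0.9):
        self.q = defaultdict(float)  # (state, move) -> estimated value; state must be hashable
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma

    def choose_move(self, state, legal_moves):
        # Epsilon-greedy move selection over the legal moves.
        if random.random() < self.epsilon:
            return random.choice(legal_moves)
        return max(legal_moves, key=lambda m: self.q[(state, m)])

    def update(self, state, move, reward, next_state, next_moves):
        # Standard one-step Q-learning backup.
        best_next = max((self.q[(next_state, m)] for m in next_moves), default=0.0)
        target = reward + self.gamma * best_next
        self.q[(state, move)] += self.alpha * (target - self.q[(state, move)])

class StrategyLearner:
    """High-level learner: decides whether to bet on (engage) or withdraw from
    a match against a given opponent, based on a running payoff estimate."""
    def __init__(self, epsilon=0.1, alpha=0.2):
        self.value = defaultdict(float)  # opponent id -> expected payoff
        self.epsilon, self.alpha = epsilon, alpha

    def engage(self, opponent_id):
        # Occasionally explore; otherwise engage only when expected payoff is positive.
        if random.random() < self.epsilon:
            return random.choice([True, False])
        return self.value[opponent_id] > 0.0

    def update(self, opponent_id, payoff):
        # Exponential moving average of realized match earnings.
        self.value[opponent_id] += self.alpha * (payoff - self.value[opponent_id])
```

Under this framing, the four learner types in the abstract presumably correspond to the four combinations of fixing versus learning each level: neither level learned, tactics only, strategy only, and both. That reading is an inference from the abstract, which specifies only that the learners vary along these two abilities.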

Source journal: Results in Control and Optimization (Mathematics: Control and Optimization)
CiteScore: 3.00
Self-citation rate: 0.00%
Publication volume: 51
Review time: 91 days