Creating large numbers of game AIs by learning behavior for cooperating units

2013 IEEE Conference on Computational Intelligence in Games (CIG). Pub Date: 2013-10-17. DOI: 10.1109/CIG.2013.6633608

Stephen Wiens, J. Denzinger, Sanjeev Paskaradevan

Citations: 5

Abstract

We present two improvements to the hybrid learning method for the shout-ahead architecture for units in the game Battle for Wesnoth. In the shout-ahead architecture, units make decisions in two stages: each unit first determines an action without knowledge of the intentions of other units; then, after communicating its intended action and receiving the intentions of the other units, it takes these intentions into account for the final decision on its next action. Decision making uses two rule sets. Reinforcement learning is used to learn the rule weights that influence decision making, while evolutionary learning is used to evolve good rule sets. Our improvements add knowledge about terrain to the learning and evaluate unit behaviors on several scenario maps in order to learn more general rules. Using terrain knowledge improved the win percentage of evolved teams by between 3 and 14 percentage points, depending on the map, while learning from several maps resulted in win percentages on maps not learned from that were nearly the same as on the maps learned from.
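The two-stage decision process described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Rule` class, the "highest-weight applicable rule wins" selection, the fallback to the stage-one intention, and all unit names, state keys, and weights are assumptions made for illustration only. The learning components (reinforcement learning of weights, evolution of rule sets) are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, object]          # hypothetical per-unit observations
Intentions = Dict[str, str]        # unit name -> announced action


@dataclass
class Rule:
    """A weighted condition/action rule (illustrative structure, not from the paper)."""
    condition: Callable[[State, Intentions], bool]
    action: str
    weight: float


def pick_action(rules: List[Rule], state: State, shouted: Intentions) -> Optional[str]:
    """Return the action of the highest-weight applicable rule, or None if none applies."""
    applicable = [r for r in rules if r.condition(state, shouted)]
    if not applicable:
        return None
    return max(applicable, key=lambda r: r.weight).action


def shout_ahead_turn(states: Dict[str, State],
                     solo_rules: List[Rule],
                     coop_rules: List[Rule]) -> Dict[str, str]:
    """Run one two-stage decision round for all units."""
    # Stage 1: each unit picks an intended action without knowing the others' intentions.
    intentions = {u: pick_action(solo_rules, s, {}) or "wait" for u, s in states.items()}

    # Stage 2: each unit sees the intentions shouted by the other units and
    # makes its final decision; assumed fallback is its own stage-1 intention.
    final = {}
    for unit, state in states.items():
        others = {v: a for v, a in intentions.items() if v != unit}
        final[unit] = pick_action(coop_rules, state, others) or intentions[unit]
    return final


# Hypothetical rule sets: attack when an enemy is adjacent, otherwise advance;
# a weak unit retreats if a teammate has announced an attack.
solo_rules = [
    Rule(lambda s, sh: bool(s["enemy_adjacent"]), "attack", 0.8),
    Rule(lambda s, sh: True, "advance", 0.2),
]
coop_rules = [
    Rule(lambda s, sh: s["hp"] < 10 and "attack" in sh.values(), "retreat", 0.9),
]
states = {
    "archer": {"enemy_adjacent": True, "hp": 5},
    "knight": {"enemy_adjacent": True, "hp": 30},
}
print(shout_ahead_turn(states, solo_rules, coop_rules))
```

In this toy run both units intend to attack in stage one; after hearing the knight's intention, the low-health archer switches to retreating, while the knight keeps its original choice because no cooperative rule applies to it.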
