Monte Carlo Tree Search with options for general video game playing

2016 IEEE Conference on Computational Intelligence and Games (CIG) Pub Date : 2016-09-01 DOI:10.1109/CIG.2016.7860383

M. D. Waard, Diederik M. Roijers, S. Bakkes


Citations: 13

Abstract

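General video game playing is a challenging research area in which the goal is to find one algorithm that can play many games successfully. "Monte Carlo Tree Search" (MCTS) is a popular algorithm that has often been used for this purpose. It incrementally builds a search tree based on the states observed after applying actions. However, MCTS always plans over primitive actions and does not incorporate any higher-level planning of the kind one would expect from a human player. Furthermore, although many games have similar dynamics, general video game playing algorithms often have no prior knowledge available. In this paper, we introduce a new algorithm called "Option Monte Carlo Tree Search" (O-MCTS). It offers general video game knowledge and high-level planning in the form of "options", which are action sequences aimed at achieving a specific subgoal. Additionally, we introduce "Option Learning MCTS" (OL-MCTS), which applies a progressive widening technique to the expected returns of options in order to focus exploration on fruitful parts of the search tree. Our new algorithms are compared to MCTS on a diverse set of twenty-eight games from the general video game AI competition. Our results indicate that by applying MCTS's efficient tree search to options, O-MCTS outperforms MCTS on most of the games, especially those in which a certain subgoal has to be reached before the game can be won. Lastly, we show that OL-MCTS improves its performance on specific games by learning expected values for options and shifting its bias toward higher-valued options.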
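To make the notion of an "option" concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a toy 1-D corridor in which a state is an integer position, and every name in it (`Option`, `go_to`, `run_option`, `ucb1_pick`) is hypothetical. The selection rule is plain UCB1, swapped in for illustration, since the abstract does not say which rule O-MCTS uses.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List

State = int    # hypothetical toy state: position on a 1-D corridor
Action = int   # -1 = step left, +1 = step right


@dataclass(frozen=True)
class Option:
    """An action sequence aimed at one subgoal (toy form, not the paper's code)."""
    name: str
    policy: Callable[[State], Action]      # which primitive action to take next
    is_finished: Callable[[State], bool]   # True once the subgoal is reached


def go_to(target: State) -> Option:
    """Build an option whose subgoal is standing on `target`."""
    return Option(
        name=f"go_to({target})",
        policy=lambda s: 1 if s < target else -1,
        is_finished=lambda s: s == target,
    )


def run_option(option: Option, state: State, max_steps: int = 50) -> List[Action]:
    """Follow the option's policy until its subgoal is reached or steps run out."""
    actions: List[Action] = []
    while not option.is_finished(state) and len(actions) < max_steps:
        action = option.policy(state)
        state += action
        actions.append(action)
    return actions


def ucb1_pick(visits: Dict[str, int], total_return: Dict[str, float],
              c: float = math.sqrt(2)) -> str:
    """Pick the option with the highest UCB1 score; untried options go first."""
    for name, n in visits.items():
        if n == 0:
            return name
    parent_visits = sum(visits.values())
    return max(
        visits,
        key=lambda name: total_return[name] / visits[name]
        + c * math.sqrt(math.log(parent_visits) / visits[name]),
    )
```

In this sketch a tree node would choose among options with `ucb1_pick` and then expand the chosen option into primitive actions with `run_option`, so the search branches over subgoals rather than over single moves.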


