{"title":"蒙特卡洛树搜索与选项一般视频游戏玩","authors":"M. D. Waard, Diederik M. Roijers, S. Bakkes","doi":"10.1109/CIG.2016.7860383","DOIUrl":null,"url":null,"abstract":"General video game playing is a challenging research area in which the goal is to find one algorithm that can play many games successfully. “Monte Carlo Tree Search” (MCTS) is a popular algorithm that has often been used for this purpose. It incrementally builds a search tree based on observed states after applying actions. However, the MCTS algorithm always plans over actions and does not incorporate any higher level planning, as one would expect from a human player. Furthermore, although many games have similar game dynamics, often no prior knowledge is available to general video game playing algorithms. In this paper, we introduce a new algorithm called “Option Monte Carlo Tree Search” (O-MCTS). It offers general video game knowledge and high level planning in the form of “options”, which are action sequences aimed at achieving a specific subgoal. Additionally, we introduce “Option Learning MCTS” (OL-MCTS), which applies a progressive widening technique to the expected returns of options in order to focus exploration on fruitful parts of the search tree. Our new algorithms are compared to MCTS on a diverse set of twenty-eight games from the general video game AI competition. Our results indicate that by using MCTS's efficient tree searching technique on options, O-MCTS outperforms MCTS on most of the games, especially those in which a certain subgoal has to be reached before the game can be won. Lastly, we show that OL-MCTS improves its performance on specific games by learning expected values for options and moving a bias to higher valued options.","PeriodicalId":6594,"journal":{"name":"2016 IEEE Conference on Computational Intelligence and Games (CIG)","volume":"122 1","pages":"1-8"},"PeriodicalIF":0.0000,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Monte Carlo Tree Search with options for general video game playing\",\"authors\":\"M. D. Waard, Diederik M. Roijers, S. Bakkes\",\"doi\":\"10.1109/CIG.2016.7860383\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"General video game playing is a challenging research area in which the goal is to find one algorithm that can play many games successfully. “Monte Carlo Tree Search” (MCTS) is a popular algorithm that has often been used for this purpose. It incrementally builds a search tree based on observed states after applying actions. However, the MCTS algorithm always plans over actions and does not incorporate any higher level planning, as one would expect from a human player. Furthermore, although many games have similar game dynamics, often no prior knowledge is available to general video game playing algorithms. In this paper, we introduce a new algorithm called “Option Monte Carlo Tree Search” (O-MCTS). It offers general video game knowledge and high level planning in the form of “options”, which are action sequences aimed at achieving a specific subgoal. Additionally, we introduce “Option Learning MCTS” (OL-MCTS), which applies a progressive widening technique to the expected returns of options in order to focus exploration on fruitful parts of the search tree. Our new algorithms are compared to MCTS on a diverse set of twenty-eight games from the general video game AI competition. Our results indicate that by using MCTS's efficient tree searching technique on options, O-MCTS outperforms MCTS on most of the games, especially those in which a certain subgoal has to be reached before the game can be won. Lastly, we show that OL-MCTS improves its performance on specific games by learning expected values for options and moving a bias to higher valued options.\",\"PeriodicalId\":6594,\"journal\":{\"name\":\"2016 IEEE Conference on Computational Intelligence and Games (CIG)\",\"volume\":\"122 1\",\"pages\":\"1-8\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE Conference on Computational Intelligence and Games (CIG)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CIG.2016.7860383\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Conference on Computational Intelligence and Games (CIG)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CIG.2016.7860383","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Monte Carlo Tree Search with options for general video game playing
General video game playing is a challenging research area in which the goal is to find one algorithm that can play many games successfully. “Monte Carlo Tree Search” (MCTS) is a popular algorithm that has often been used for this purpose. It incrementally builds a search tree based on observed states after applying actions. However, the MCTS algorithm always plans over actions and does not incorporate any higher level planning, as one would expect from a human player. Furthermore, although many games have similar game dynamics, often no prior knowledge is available to general video game playing algorithms. In this paper, we introduce a new algorithm called “Option Monte Carlo Tree Search” (O-MCTS). It offers general video game knowledge and high level planning in the form of “options”, which are action sequences aimed at achieving a specific subgoal. Additionally, we introduce “Option Learning MCTS” (OL-MCTS), which applies a progressive widening technique to the expected returns of options in order to focus exploration on fruitful parts of the search tree. Our new algorithms are compared to MCTS on a diverse set of twenty-eight games from the general video game AI competition. Our results indicate that by using MCTS's efficient tree searching technique on options, O-MCTS outperforms MCTS on most of the games, especially those in which a certain subgoal has to be reached before the game can be won. Lastly, we show that OL-MCTS improves its performance on specific games by learning expected values for options and moving a bias to higher valued options.