Pub Date: 2016-09-01; DOI: 10.1109/CIG.2016.7860400
D. Carvalho, E. Clua, A. Paes
Stories have become an important element of games, since they can increase immersion by giving players the context and motivation to play. However, despite the interactive nature of games, their stories usually do not account for every decision or action the players can take, because, depending on the size of the game, authoring alternative routes for all of them would take too much effort. One way to make these alternatives viable is to generate them procedurally, using story generation approaches already developed in the storytelling field. Some of these approaches are based on the simulation of virtual worlds, in which stories are generated by having the characters that inhabit the world act in pursuit of their goals. The resulting actions and the world's reactions compose the final story. Since actions are the building blocks of the stories, the characters' acting capabilities determine the generative potential of a simulation. For instance, stories with deception can only be generated if the characters are capable of deceiving each other. To allow the generation of stories in which characters are capable of manipulation, cooperation, and other social behaviors, by actively exploiting what the others will do based on what they know and see, we propose a recursive planning approach that deals with uncertainty about the others' knowledge and with a purposely error-prone perception simulation. To test our proposal, we developed a story generation system and designed an adaptation of the Little Red Riding Hood world as a test scenario. With our approach, the system was capable of generating coherent story variations with deceptive actions.
Title: Planning social actions through the others' eyes for emergent storytelling. Published in: 2016 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1-8. https://doi.org/10.1109/CIG.2016.7860400
Pub Date: 2016-09-01; DOI: 10.1109/CIG.2016.7860415
Noor Shaker, Mohamed Abou-Zleikha
Several studies on cross-domain user behaviour have revealed generic personality traits and behavioural patterns. This paper proposes quantitative approaches that use knowledge of player behaviour in one game to seed the process of building player experience models in another. We investigate two settings. In the supervised feature mapping method, we use labelled datasets about players' behaviour in two games; the goal is to establish a mapping between the features so that models built on one dataset can be used on the other by simple feature replacement. In the unsupervised transfer learning scenario, our goal is to find a shared space of correlated features based on unlabelled data. The features in the shared space are then used to construct models for one game that work directly on the transferred features of the other game. We implemented and analysed the two approaches, and we show that transferring knowledge of player experience between domains is indeed possible and ultimately useful when studying players' behaviour and when designing user studies.
Title: Transfer learning for cross-game prediction of player experience. Published in: 2016 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1-8. https://doi.org/10.1109/CIG.2016.7860415
Pub Date: 2016-09-01; DOI: 10.1109/CIG.2016.7860426
P. García-Sánchez, A. Tonda, Giovanni Squillero, A. García, J. J. M. Guervós
One of the most notable features of collectible card games is deckbuilding, that is, defining a personalized deck before the real game. Deckbuilding is a challenge that involves a big and rugged search space, with different and unpredictable behaviour after simple card changes and even hidden information. In this paper, we explore the possibility of automated deckbuilding: a genetic algorithm is applied to the task, with the evaluation delegated to a game simulator that tests every potential deck against a varied and representative range of human-made decks. In these preliminary experiments, the approach has proven able to create quite effective decks, a promising result that proves that, even in this challenging environment, evolutionary algorithms can find good solutions.
Title: Evolutionary deckbuilding in Hearthstone. Published in: 2016 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1-8. https://doi.org/10.1109/CIG.2016.7860426
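The deckbuilding abstract describes a genetic algorithm whose fitness comes from a game simulator. The loop can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `evolve_deck`, its parameters, the truncation selection, and the rule that a deck holds distinct cards are all assumptions made here, and the `fitness` callable stands in for the simulator-based evaluation against human-made decks.

```python
import random


def evolve_deck(card_pool, deck_size, fitness, generations=50,
                pop_size=30, mutation_rate=0.1, seed=0):
    """Toy genetic algorithm over decks (hypothetical sketch).

    card_pool: list of distinct card identifiers.
    fitness:   callable deck -> number; a stand-in for running simulated
               matches against a set of reference decks.
    """
    rng = random.Random(seed)

    def random_deck():
        return rng.sample(card_pool, deck_size)

    def crossover(a, b):
        # Pool both parents' cards (duplicates dropped), then draw a child.
        merged = list(dict.fromkeys(a + b))
        return rng.sample(merged, deck_size)

    def mutate(deck):
        deck = list(deck)
        for i in range(deck_size):
            if rng.random() < mutation_rate:
                # Swap in a card that is not already in the deck.
                candidates = [c for c in card_pool if c not in deck]
                if candidates:
                    deck[i] = rng.choice(candidates)
        return deck

    population = [random_deck() for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents)))
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)
```

A call such as `evolve_deck(list(range(40)), 5, fitness=sum)` exercises the loop with a trivial fitness function. In the setting the abstract describes, each fitness call would instead be a batch of simulated games, which is why evaluation is typically the dominant cost.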
Pub Date: 2016-09-01; DOI: 10.1109/CIG.2016.7860421
Thibault Allart, G. Levieux, M. Pierfitte, Agathe Guilloux, S. Natkin
This paper proposes a method to help understand the influence of game design on player retention. Using Far Cry® 4 data, we illustrate how playtime measures can be used to identify time periods in which players are more likely to stop playing. First, we show that a benchmark can easily be performed for every game available on Steam using publicly available data. Then, we show how survival analysis can help model the influence of game variables on player retention. The game environment and player characteristics change over time, and tracking systems already store those changes. However, existing models that handle time-varying covariates do not scale to the huge datasets produced by video game monitoring. We therefore propose a model that both handles time-varying covariates and is well suited to big datasets. Because a given game variable can have a changing effect over time, we also include time-varying coefficients in our model. We used this survival analysis model to quantify the effect of Far Cry 4 weapon usage on player retention.
Title: Design influence on player retention: A method based on time varying survival analysis. Published in: 2016 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1-8. https://doi.org/10.1109/CIG.2016.7860421
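To make the survival-analysis framing of the retention abstract concrete, here is a sketch of the basic Kaplan-Meier estimator applied to playtime. It is not the time-varying model the paper proposes. It is the standard non-parametric starting point, and the function name and input layout are assumptions made for illustration. A player whose last session falls at the end of the observation window is treated as censored (`observed=False`) rather than as having stopped playing.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve.

    durations: playtime (any consistent unit) at which each player was last seen.
    observed:  True if the player is known to have stopped playing at that time,
               False if the record is censored (still active when data ends).
    Returns a list of (time, survival_probability) at each time with an event.
    """
    records = sorted(zip(durations, observed))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        events = 0    # players who stopped playing exactly at time t
        leaving = 0   # everyone leaving the risk set at t (events + censored)
        while i < len(records) and records[i][0] == t:
            events += 1 if records[i][1] else 0
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Steep drops in the returned curve mark playtime periods in which many players quit, which is the kind of signal the abstract refers to when it speaks of identifying periods where players are more likely to stop playing.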
Pub Date: 2016-09-01; DOI: 10.1109/CIG.2016.7860449
C. Chu, Suguru Ito, Tomohiro Harada, R. Thawonmas
This paper proposes applying reinforcement learning and position-based features to rollout bias training in Monte-Carlo Tree Search (MCTS) for General Video Game Playing (GVGP). The proposed method builds on the Knowledge-based Fast-Evo MCTS of Perez et al.; it is intended both for the GVG-AI Competition and as an improvement to the learning mechanism of the original method. Its performance is evaluated empirically on all games from the six training sets available in the GVG-AI Framework, where it achieves better overall scores than five other existing MCTS-based methods.
Title: Position-based reinforcement learning biased MCTS for General Video Game Playing. Published in: 2016 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1-8. https://doi.org/10.1109/CIG.2016.7860449
Pub Date: 2016-09-01; DOI: 10.1109/CIG.2016.7860417
D. Aversa, Sebastian Sardiña, S. Vassos
Inventory-Aware Pathfinding is concerned with finding paths while taking into account that picking up items, e.g., keys, allows the character to unlock blocked pathways, e.g., locked doors. In this work we present a pruning method and a preprocessing method that can significantly improve the scalability of such approaches. We apply our methods to the recent approach of Inventory-Driven Jump-Point Search (InvJPS). First, we introduce InvJPS+, which prunes large parts of the search space by favoring short detours to pick up items, offering a trade-off between efficiency and optimality. Second, we propose a preprocessing step that decides at runtime which items, e.g., keys, are worth using, thus pruning potentially unnecessary items before the search starts. We show results for combinations of the pruning and preprocessing methods, illustrating the best choices over various scenarios.
Title: Pruning and preprocessing methods for inventory-aware pathfinding. Published in: 2016 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1-8. https://doi.org/10.1109/CIG.2016.7860417
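The core idea of inventory-aware pathfinding, that the search state is the position together with the set of items held, can be illustrated with a plain breadth-first search. This is deliberately not InvJPS or InvJPS+, since it has no jump points, pruning, or preprocessing. The grid encoding is an assumption made for this sketch: `#` is a wall, `.` is free, `S` and `G` are start and goal, a lowercase letter is a key, and the matching uppercase letter (other than `S` and `G`) is the door it opens.

```python
from collections import deque


def shortest_path_with_keys(grid):
    """Length of the shortest S-to-G path on a 4-connected grid, or None.

    The search state is (row, col, keys_held), so the same cell can be
    revisited once the inventory has changed.
    """
    rows, cols = len(grid), len(grid[0])
    start = next((r, c) for r in range(rows) for c in range(cols)
                 if grid[r][c] == "S")
    start_state = (start[0], start[1], frozenset())
    queue = deque([(start_state, 0)])
    seen = {start_state}
    while queue:
        (r, c, keys), dist = queue.popleft()
        if grid[r][c] == "G":
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            cell = grid[nr][nc]
            if cell == "#":
                continue
            # A door blocks movement unless the matching key is held.
            if cell.isupper() and cell not in "SG" and cell.lower() not in keys:
                continue
            new_keys = keys | {cell} if cell.islower() else keys
            state = (nr, nc, new_keys)
            if state not in seen:
                seen.add(state)
                queue.append((state, dist + 1))
    return None
```

Because the state space grows with the number of key subsets, exhaustive search of this kind scales poorly as items are added, which is the scalability problem the pruning and preprocessing methods in the abstract are aimed at.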
Pub Date: 2016-09-01; DOI: 10.1109/CIG.2016.7860411
Tobias Graf, M. Platzner
Simulation Balancing is an optimization algorithm that automatically tunes the parameters of a playout policy used inside Monte Carlo Tree Search. The algorithm fits a policy so that its expected result matches given target values of the training set. Up to now it has been successfully applied to Computer Go on small 9 × 9 boards but has failed for larger board sizes such as 19 × 19. On these large boards, apprenticeship learning, which fits a policy so that it closely follows an expert, continues to be the algorithm of choice. In this paper we introduce several improvements to the original simulation balancing algorithm and test their effectiveness in Computer Go. The proposed additions remove the need to generate target values by deep searches, optimize faster, and make the algorithm less prone to overfitting. The experiments show that simulation balancing improves the playing strength of a Go program using apprenticeship learning by more than 200 Elo on the large 19 × 19 board.
Title: Monte-Carlo simulation balancing revisited. Published in: 2016 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1-7. https://doi.org/10.1109/CIG.2016.7860411
Pub Date: 2016-09-01; DOI: 10.1109/CIG.2016.7860446
P. R. Williams, Diego Perez Liebana, S. Lucas
This paper introduces the revival of the popular Ms. Pac-Man Versus Ghost Team competition. We present an updated game engine with Partial Observability constraints, a new Multi-Agent Systems approach to developing Ghost agents, and several sample controllers to ease the development of entries. The Ghosts are given a restricted communication protocol, which makes for a more challenging environment than before. The competition will debut at the IEEE Computational Intelligence and Games Conference 2016. Some preliminary results showing the effects of Partial Observability and the benefits of simple communication are also presented.
Title: Ms. Pac-Man Versus Ghost Team CIG 2016 competition. Published in: 2016 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1-8. https://doi.org/10.1109/CIG.2016.7860446
Pub Date: 2016-09-01; DOI: 10.1109/CIG.2016.7860419
Ahmed Abdelkader
Pursuit-evasion games encompass a wide range of planning problems with a variety of constraints on the motion of agents. We study the visibility-based variant where a pursuer is required to keep an evader in sight, while the evader is assumed to attempt to hide as soon as possible. This is particularly relevant in the context of video games where non-player characters of varying skill levels frequently chase after and attack the player. In this paper, we show that a simple dual formulation of the problem can be integrated into the traditional model to derive optimal strategies that tolerate interruptions in visibility resulting from motion among obstacles. Furthermore, using the enhanced model we propose a competitive procedure to maintain the optimal strategies in a dynamic environment where obstacles can change both shape and location. We prove the correctness of our algorithms and present results for different maps.
Title: Recovering visibility and dodging obstacles in pursuit-evasion games. Published in: 2016 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1-6. https://doi.org/10.1109/CIG.2016.7860419
Pub Date: 2016-09-01; DOI: 10.1109/CIG.2016.7860383
M. D. Waard, Diederik M. Roijers, S. Bakkes
General video game playing is a challenging research area in which the goal is to find one algorithm that can play many games successfully. “Monte Carlo Tree Search” (MCTS) is a popular algorithm that has often been used for this purpose. It incrementally builds a search tree based on the states observed after applying actions. However, the MCTS algorithm always plans over actions and does not incorporate any higher-level planning, as one would expect from a human player. Furthermore, although many games have similar game dynamics, often no prior knowledge is available to general video game playing algorithms. In this paper, we introduce a new algorithm called “Option Monte Carlo Tree Search” (O-MCTS). It offers general video game knowledge and high-level planning in the form of “options”, which are action sequences aimed at achieving a specific subgoal. Additionally, we introduce “Option Learning MCTS” (OL-MCTS), which applies a progressive widening technique to the expected returns of options in order to focus exploration on fruitful parts of the search tree. Our new algorithms are compared to MCTS on a diverse set of twenty-eight games from the general video game AI competition. Our results indicate that by using MCTS's efficient tree searching technique on options, O-MCTS outperforms MCTS on most of the games, especially those in which a certain subgoal has to be reached before the game can be won. Lastly, we show that OL-MCTS improves its performance on specific games by learning expected values for options and shifting its bias toward higher-valued options.
Title: Monte Carlo Tree Search with options for general video game playing. Published in: 2016 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1-8. https://doi.org/10.1109/CIG.2016.7860383
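As a rough illustration of how a tree search can choose among options rather than primitive actions, the sketch below applies the standard UCB1 rule to per-option statistics. The abstract does not state which selection rule O-MCTS uses, so treating it as UCB1 is an assumption. The function name, the statistics layout, and the option names in the usage note are likewise hypothetical.

```python
import math


def ucb1_select(stats, exploration=math.sqrt(2)):
    """Pick the next option to try using the UCB1 rule.

    stats: dict mapping option name -> (visits, total_return).
    Unvisited options are tried first; otherwise the option maximising
    mean return plus an exploration bonus is chosen.
    """
    total_visits = sum(visits for visits, _ in stats.values())
    best_name, best_score = None, -math.inf
    for name, (visits, total_return) in stats.items():
        if visits == 0:
            return name
        mean = total_return / visits
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        if mean + bonus > best_score:
            best_name, best_score = name, mean + bonus
    return best_name
```

For example, given `{"go_to_key": (10, 6.0), "go_to_door": (10, 2.0)}` the rule prefers `go_to_key`, because both options have been tried equally often and it has the higher mean return. An option with very few visits can still win through the exploration bonus.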