Athénan wins sixteen gold medals at the Computer Olympiad
Quentin Cohen-Solal, Tristan Cazenave
ICGA Journal, published 2024-01-02
DOI: 10.3233/icg-230239 (https://doi.org/10.3233/icg-230239)
Citations: 0
Abstract
Unlike AlphaZero-like algorithms (Silver et al., 2018), Athénan is based on the Descent framework (Cohen-Solal, 2020). During training, it builds the partial game tree with Descent, a variant of Unbounded Minimax (Korf and Chickering, 1996), rather than with Monte Carlo Tree Search; this tree is used both to choose the action to play and to collect data for learning. At each move, Descent iteratively extends the best sequences of moves until they reach terminal states. During evaluation, another variant of Unbounded Minimax is used; in particular, it includes a generic solver and, when deciding between actions, selects the safest one. Moreover, unlike AlphaZero, Athénan uses only a value network and no policy network, so actions do not need to be encoded. In addition, Athénan uses all the data generated during the searches for learning, which the AlphaZero paradigm does not. As a result, much more data is generated per match (Cohen-Solal and Cazenave, 2023), so training is faster and, unlike AlphaZero, does not require (massive) parallelization to give good results. Athénan can also use end-of-game heuristic evaluations, such as game score or game length (to win quickly and lose slowly), to improve its level of play. Further improvements are described in Cohen-Solal (2020).
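The search idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy game (a pile of stones, each player removes one or two, whoever takes the last stone wins), the constant `placeholder_value` standing in for a trained value network, and the fixed iteration budget are all assumptions made for illustration. Each iteration follows the currently best action down to a terminal state and backs the result up, and every state value stored in the tree is kept as a candidate learning target.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, int]  # (stones left, player to move); player 0 maximizes

def actions(s: State) -> List[int]:
    return [k for k in (1, 2) if k <= s[0]]

def play(s: State, a: int) -> State:
    return (s[0] - a, 1 - s[1])

def is_terminal(s: State) -> bool:
    return s[0] == 0

def terminal_value(s: State) -> float:
    # Whoever took the last stone wins; s[1] is the player who would move next.
    return 1.0 if s[1] == 1 else -1.0

def placeholder_value(s: State) -> float:
    # Hypothetical stand-in for a learned value network: knows nothing.
    return 0.0

def best_action(s: State, q: Dict[Tuple[State, int], float]) -> int:
    choose = max if s[1] == 0 else min
    return choose(actions(s), key=lambda a: q[(s, a)])

def descent_iteration(s: State,
                      v: Dict[State, float],
                      q: Dict[Tuple[State, int], float],
                      evaluate: Callable[[State], float]) -> float:
    if is_terminal(s):
        v[s] = terminal_value(s)
        return v[s]
    if s not in v:
        # First visit: evaluate every child once.
        for a in actions(s):
            child = play(s, a)
            q[(s, a)] = terminal_value(child) if is_terminal(child) else evaluate(child)
    # Follow the best action all the way down, then back up the new value.
    a = best_action(s, q)
    q[(s, a)] = descent_iteration(play(s, a), v, q, evaluate)
    v[s] = q[(s, best_action(s, q))]
    return v[s]

def descent(root: State, iterations: int,
            evaluate: Callable[[State], float] = placeholder_value):
    v: Dict[State, float] = {}
    q: Dict[Tuple[State, int], float] = {}
    for _ in range(iterations):
        descent_iteration(root, v, q, evaluate)
    return v, q

root = (4, 0)
v, q = descent(root, iterations=10)
# Every (state, value) pair in the tree is a candidate training example.
training_pairs = sorted(v.items())
print(best_action(root, q), v[root])
```

In this toy run the search settles on removing one stone from a pile of four, which leaves the opponent in a losing position. The sketch omits the solver, the safest-action selection used during evaluation, and the end-of-game heuristics mentioned in the abstract.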
About the journal:
The ICGA Journal provides an international forum for computer games researchers presenting new results on ongoing work. The editors invite contributors to submit papers on all aspects of research related to computers and games. Relevant topics include, but are not limited to:
(1) the current state of game-playing programs for classic and modern board and card games
(2) the current state of virtual, casual and video games
(3) new theoretical developments in game-related research, and
(4) general scientific contributions produced by the study of games.
Research on the following topics is also welcome:
(5) social aspects of computer games
(6) cognitive research on how humans play games
(7) capture and analysis of game data, and
(8) issues related to networked games.