Board Representations for Neural Go Players Learning by Temporal Difference

2007 IEEE Symposium on Computational Intelligence and Games. Pub Date: 2007-04-01. DOI: 10.1109/CIG.2007.368096

H. A. Mayer


Citations: 12

Abstract

The majority of work on artificial neural networks (ANNs) playing the game of Go focuses on network architectures and training regimes to improve the quality of the neural player. A less investigated problem is the board representation, which conveys information about the current state of the game to the network. Common approaches suggest a straightforward encoding that assigns each point on the board to one or more input neurons. However, these basic representations do not capture the elementary structural relationships between stones (and points) that are essential to the game. We compare three different board representations for self-learning ANNs on a 5×5 board, employing temporal difference learning (TDL) with two types of move selection during training. The strength of the trained networks is evaluated in games against three computer players of different quality. A tournament of the best neural players, the addition of alpha-beta search, and a commented game of a neural player against the best computer player further explore the potential of the neural players and their respective board representations.
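
To make the "straightforward encoding" mentioned in the abstract concrete, the following is a minimal sketch of two basic per-point encodings for a 5×5 board, together with a single TD(0) update for a linear value estimate. It is an illustration only, not the paper's representations or network: the cell codes, the function names `encode_single`, `encode_multi`, and `td0_update`, the linear evaluator, and the learning-rate and discount values are all assumptions made for this example.

```python
import numpy as np

SIZE = 5  # board side length, matching the 5x5 setting in the abstract
EMPTY, BLACK, WHITE = 0, 1, 2  # hypothetical cell codes for this sketch


def encode_single(board):
    """One input per point: +1 for black, -1 for white, 0 for empty."""
    board = np.asarray(board)
    values = np.zeros(board.shape, dtype=np.float64)
    values[board == BLACK] = 1.0
    values[board == WHITE] = -1.0
    return values.ravel()


def encode_multi(board):
    """Three inputs per point: one-hot over (empty, black, white)."""
    board = np.asarray(board)
    return np.eye(3, dtype=np.float64)[board].ravel()


def td0_update(weights, x_t, x_next, reward,
               alpha=0.1, gamma=1.0, terminal=False):
    """One TD(0) step for a linear value estimate V(x) = weights . x.

    A linear evaluator stands in for the neural network here (assumption).
    """
    v_t = weights @ x_t
    v_next = 0.0 if terminal else weights @ x_next
    td_error = reward + gamma * v_next - v_t
    # For a linear model the gradient of V with respect to weights is x_t.
    return weights + alpha * td_error * x_t


# Example position: a black stone in one corner, a white stone in the other.
board = np.zeros((SIZE, SIZE), dtype=int)
board[0, 0] = BLACK
board[4, 4] = WHITE

x_single = encode_single(board)  # 25 inputs
x_multi = encode_multi(board)    # 75 inputs
new_weights = td0_update(np.zeros(25), x_single, x_single,
                         reward=1.0, terminal=True)
```

Neither encoding tells the evaluator which stones are connected or how many liberties a group has, which is the kind of structural information the abstract says such basic representations miss.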
