Hybrid of Evolution and Reinforcement Learning for Othello Players
Kyung-Joong Kim, He-Seong Choi, Sung-Bae Cho. DOI: 10.1109/CIG.2007.368099

Although reinforcement learning and evolutionary algorithms both show good results in board evaluation optimization, hybrids of the two approaches are rarely addressed in the literature. In this paper, the evolutionary algorithm is boosted with resources from reinforcement learning: 1) the initial population is seeded with a solution optimized by temporal difference learning, and 2) domain knowledge extracted from reinforcement learning is exploited. Experiments on Othello game strategies show that the proposed methods can effectively search the solution space and improve performance.
Board Representations for Neural Go Players Learning by Temporal Difference
H. A. Mayer. DOI: 10.1109/CIG.2007.368096

The majority of work on artificial neural networks (ANNs) playing the game of Go focuses on network architectures and training regimes to improve the quality of the neural player. A less investigated problem is the board representation that conveys the information on the current state of the game to the network. Common approaches suggest a straightforward encoding that assigns each point on the board to one (or more) input neurons. However, these basic representations do not capture elementary structural relationships between stones (and points) that are essential to the game. We compare three different board representations for self-learning ANNs on a 5×5 board employing temporal difference learning (TDL) with two types of move selection (during training). The strength of the trained networks is evaluated in games against three computer players of different quality. A tournament of the best neural players, the addition of alpha-beta search, and a commented game of a neural player against the best computer player further explore the potential of the neural players and their respective board representations.
Temporal Difference Learning of an Othello Evaluation Function for a Small Neural Network with Shared Weights
E. Manning. DOI: 10.1109/CIG.2007.368101

This paper presents an artificial neural network with shared weights, trained to play the game of Othello by self-play with temporal difference learning (TDL). The network performs as well as the champion of the CEC 2006 Othello Evaluation Function Competition. The TDL-trained network contains only 67 unique weights, compared to 2113 for the champion.
Using Stochastic AI Techniques to Achieve Unbounded Resolution in Finite Player Goore Games and its Applications
B. Oommen, Ole-Christoffer Granmo, A. Pedersen. DOI: 10.1109/CIG.2007.368093

The Goore Game (GG), introduced by M. L. Tsetlin in 1973, has the fascinating property that it can be resolved in a completely distributed manner with no intercommunication between the players. The game has recently found applications in many domains, including the field of sensor networks and quality-of-service (QoS) routing. In actual implementations of the solution, the players are typically replaced by learning automata (LA). The problem with the existing reported approaches is that the accuracy of the solution achieved is intricately related to the number of players participating in the game, which, in turn, determines the resolution. In other words, an arbitrary accuracy can be obtained only if the game has an infinite number of players. In this paper, we show how we can attain an unbounded accuracy for the GG by utilizing no more than three stochastic learning machines, and by recursively pruning the solution space to guarantee that the retained domain contains the solution to the game with a probability as close to unity as desired. The paper also conjectures on how the solution can be applied to some of the application domains.
Concept Accessibility as Basis for Evolutionary Reinforcement Learning of Dots and Boxes
Anthony Knittel, T. Bossomaier, A. Snyder. DOI: 10.1109/CIG.2007.368090

The challenge of creating teams of agents, which evolve or learn, to solve complex problems is addressed in the combinatorially complex game of dots and boxes (strings and coins). Previous evolutionary reinforcement learning (ERL) systems approaching this task with dynamic agent populations have shown some degree of success in game play; however, they are sensitive to conditions, suffer from unstable agent populations under difficult play, and develop poorly against an easier opponent. A novel technique for preserving stability and balancing specialised and generalised rules in an ERL system is presented, motivated by the accessibility of concepts in human cognition, as opposed to the natural selection through population survivability common to ERL systems. Reinforcement learning in dynamic teams of mutable agents enables play comparable to hand-crafted artificial players. Performance and stability of development are enhanced when a measure of the frequency of reinforcement is separated from the quality measure of rules.
Reward Allotment Considered Roles for Learning Classifier System For Soccer Video Games
Yosuke Akatsuka, Yuji Sato. DOI: 10.1109/CIG.2007.368111

In recent years, the video-game environment has begun to change due to the explosive growth of the Internet. As a result, maintenance takes longer, development costs increase, and the life cycle of a game program shortens. To address this problem, we previously proposed an event-driven hybrid learning classifier system and showed that it is effective for improving the winning rate and shortening the learning time. This paper investigates the effect of applying a reward allotment that considers each player's role to the learning classifier system. Concretely, we examine how each player's actions are influenced by changing the opponent's algorithm, and how the team strategy is influenced by changing the reward setting, and analyze the results. We show that the learning effect on each player's actions does not depend on the opponent's algorithm, and that reward allotment considering each role can evolve the game strategy so as to improve the winning rate.
Evolutionary Computations for Designing Game Rules of the COMMONS GAME
H. Handa, N. Baba. DOI: 10.1109/CIG.2007.368117

In this paper, we focus on game rule design using two evolutionary computations. The first EC is a multi-objective evolutionary algorithm (MOEA) used to generate various skilled players. Using the acquired skilled players, i.e., the Pareto individuals of the MOEA, the second EC (evolutionary programming) adjusts the game rule parameters, i.e., an appropriate point value for each card in the COMMONS GAME.
Game AI in Delta3D
C. Darken, Bradley G. Anderegg, Perry McDowell. DOI: 10.1109/CIG.2007.368114

Delta3D is a GNU-licensed open source game engine with an orientation towards supporting "serious games" such as those with defense and homeland security applications. AI is an important issue for serious games, since there is more pressure to "get the AI right", as opposed to providing an entertaining user experience. We describe several of our near- and longer-term AI projects oriented towards making it easier to build AI-enhanced applications in Delta3D.
Evolving Players for an Ancient Game: Hnefatafl
P. Hingston. DOI: 10.1109/CIG.2007.368094

Hnefatafl is an ancient Norse game, an ancestor of chess. In this paper, we report on the development of computer players for this game. In the spirit of Blondie24, we evolve neural networks as board evaluation functions for different versions of the game. An unusual aspect of this game is that there is no general agreement on the rules: it is no longer much played, and game historians attempt to infer the rules from scraps of historical texts, with ambiguities often resolved on gut feeling as to what the rules must have been in order to achieve a balanced game. We offer the evolutionary method as a means by which to judge the merits of alternative rule sets.
Evolving Pac-Man Players: Can We Learn from Raw Input?
M. Gallagher, M. Ledwich. DOI: 10.1109/CIG.2007.368110

Pac-Man (and variant) computer games have received some recent attention in artificial intelligence research. One reason is that the game provides a platform that is both simple enough to conduct experimental research and complex enough to require non-trivial strategies for successful game-play. This paper describes an approach to developing Pac-Man playing agents that learn game-play based on minimal onscreen information. The agents are based on evolving neural network controllers using a simple evolutionary algorithm. The results show that neuroevolution is able to produce agents that display novice playing ability, with a minimal amount of onscreen information, no knowledge of the rules of the game and a minimally informative fitness function. The limitations of the approach are also discussed, together with possible directions for extending the work towards producing better Pac-Man playing agents.