CHANCEPROBCUT: Forward pruning in chance nodes
Pub Date: 2009-09-07 | DOI: 10.1109/CIG.2009.5286476
Maarten P. D. Schadd, M. Winands, J. Uiterwijk
This article describes CHANCEPROBCUT, a new, game-independent forward-pruning technique for EXPECTIMAX, and the first technique to forward prune in chance nodes. Based on the strong correlation between evaluations obtained from searches at different depths, the technique prunes chance events when the result of the chance node is likely to fall outside the search window. CHANCEPROBCUT is tested in two games, Stratego and Dice. Experiments reveal that the technique reduces the search tree significantly without loss of move quality; moreover, playing performance increases in both games.
{"title":"CHANCEPROBCUT: Forward pruning in chance nodes","authors":"Maarten P. D. Schadd, M. Winands, J. Uiterwijk","doi":"10.1109/CIG.2009.5286476","DOIUrl":"https://doi.org/10.1109/CIG.2009.5286476","url":null,"abstract":"This article describes a new, game-independent forward-pruning technique for EXPECTIMAX, called CHANCEPROBCUT. It is the first technique to forward prune in chance nodes. Based on the strong correlation between evaluations obtained from searches at different depths, the technique prunes chance events if the result of the chance node is likely to fall outside the search window. In this article, CHANCEPROBCUT is tested in two games, i.e., Stratego and Dice. Experiments reveal that the technique is able to reduce the search tree significantly without a loss of move quality. Moreover, in both games there is also an increase of playing performance.","PeriodicalId":358795,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence and Games","volume":"166 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114684496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Coevolutionary Temporal Difference Learning for Othello
Pub Date: 2009-09-07 | DOI: 10.1109/CIG.2009.5286486
M. Szubert, Wojciech Jaśkowski, K. Krawiec
This paper presents Coevolutionary Temporal Difference Learning (CTDL), a novel way of hybridizing coevolutionary search with reinforcement learning that works by interlacing one-population competitive coevolution with temporal difference learning. The coevolutionary part of the algorithm provides exploration of the solution space, while temporal difference learning performs exploitation through local search. We apply CTDL to the board game of Othello, using a weighted piece counter to represent players' strategies. The results of an extensive computational experiment demonstrate CTDL's superiority over coevolution and reinforcement learning alone, particularly when coevolution maintains an archive to preserve historical progress. The paper investigates the relative intensity of coevolutionary search and temporal difference learning, which turns out to be an essential parameter. The formulation of CTDL also leads to the introduction of a Lamarckian form of coevolution, which we discuss in detail.
{"title":"Coevolutionary Temporal Difference Learning for Othello","authors":"M. Szubert, Wojciech Jaśkowski, K. Krawiec","doi":"10.1109/CIG.2009.5286486","DOIUrl":"https://doi.org/10.1109/CIG.2009.5286486","url":null,"abstract":"This paper presents Coevolutionary Temporal Difference Learning (CTDL), a novel way of hybridizing co-evolutionary search with reinforcement learning that works by interlacing one-population competitive coevolution with temporal difference learning. The coevolutionary part of the algorithm provides for exploration of the solution space, while the temporal difference learning performs its exploitation by local search. We apply CTDL to the board game of Othello, using weighted piece counter for representing players' strategies. The results of an extensive computational experiment demonstrate CTDL's superiority when compared to coevolution and reinforcement learning alone, particularly when coevolution maintains an archive to provide historical progress. The paper investigates the role of the relative intensity of coevolutionary search and temporal difference search, which turns out to be an essential parameter. The formulation of CTDL leads also to the introduction of Lamarckian form of coevolution, which we discuss in detail.","PeriodicalId":358795,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence and Games","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114789965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Backpropagation without human supervision for visual control in Quake II
Pub Date: 2009-09-07 | DOI: 10.1109/CIG.2009.5286462
M. Parker, B. D. Bryant
Backpropagation and neuroevolution are used in a Lamarckian evolution process to train a neural-network visual controller for agents in the Quake II environment. In previous work we hand-coded a non-visual controller to supervise the backpropagation, but hand-coding is only possible for problems with known solutions. In this research the agent's task is to attack a moving enemy in a visually complex room with a large central pillar. Because we did not know a solution to the problem, we could not hand-code a supervising controller; instead, we evolve a non-visual neural network as supervisor to the visual controller. This setup creates controllers that learn much faster and reach greater fitness than those trained by neuroevolution alone on the same problem in the same amount of time.
{"title":"Backpropagation without human supervision for visual control in Quake II","authors":"M. Parker, B. D. Bryant","doi":"10.1109/CIG.2009.5286462","DOIUrl":"https://doi.org/10.1109/CIG.2009.5286462","url":null,"abstract":"Backpropagation and neuroevolution are used in a Lamarckian evolution process to train a neural network visual controller for agents in the Quake II environment. In previous work, we hand-coded a non-visual controller for supervising in backpropagation, but hand-coding can only be done for problems with known solutions. In this research the problem for the agent is to attack a moving enemy in a visually complex room with a large central pillar. Because we did not know a solution to the problem, we could not hand-code a supervising controller; instead, we evolve a non-visual neural network as supervisor to the visual controller. This setup creates controllers that learn much faster and have a greater fitness than those learning by neuroevolution-only on the same problem in the same amount of time.","PeriodicalId":358795,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence and Games","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126157771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deceptive strategies for the evolutionary minority game
Pub Date: 2009-09-07 | DOI: 10.1109/CIG.2009.5286499
G. Greenwood
The evolutionary minority game is extensively used to study adaptive behavior in a population of interacting agents. Over time the agents self-organize, despite the fact that each agent chooses how to play independently and does not know the play of any other agent. In this paper we study agents who collude with each other to play the same strategy. However, nothing prevents an agent from being deceptive and playing a different strategy instead. It is shown that deceptive strategies can be profitable if the number of deceptive agents is small enough.
{"title":"Deceptive strategies for the evolutionary minority game","authors":"G. Greenwood","doi":"10.1109/CIG.2009.5286499","DOIUrl":"https://doi.org/10.1109/CIG.2009.5286499","url":null,"abstract":"The evolutionary minority game is extensively used to study adaptive behavior in a population of interacting agents. In time the agents self-organize despite the fact agents act independently in choosing how to play the game and do not know the play of any other agent. In this paper we study agents who collude with each other to play the same strategy. However, nothing prevents agents from being deceptive and playing a different strategy instead. It is shown that deceptive strategies can be profitable if the number of deceptive agents is small enough.","PeriodicalId":358795,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence and Games","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116327310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving testing of multi-unit computer players for unwanted behavior using coordination macros
Pub Date: 2009-09-07 | DOI: 10.1109/CIG.2009.5286455
Attala Malik, J. Denzinger
We present an improvement to behavior testing of computer players based on evolutionary learning of cooperative behavior, extending the known approach with so-called coordination macros. These macros encode knowledge about the application and are interpreted by the testing agents, based on the current situation, to achieve coordination among them. Our experimental evaluation, which used this approach to test computer players for one competition scenario of the ORTS real-time strategy game, showed that the macros enabled the testing system to find weaknesses much faster than the previous approach, and to find weaknesses that the previous approach could not find within the given resource limit.
{"title":"Improving testing of multi-unit computer players for unwanted behavior using coordination macros","authors":"Attala Malik, J. Denzinger","doi":"10.1109/CIG.2009.5286455","DOIUrl":"https://doi.org/10.1109/CIG.2009.5286455","url":null,"abstract":"We present an improvement to behavior testing of computer players based on evolutionary learning of cooperative behavior that extends the known approach to allow for so-called coordination macros. These macros represent knowledge about the application and are interpreted by the agents that are testing the computer player based on the current situation to achieve coordination between the agents. Our experimental evaluation using this approach to test computer players for one competition scenario of the ORTS real-time strategy game showed that the macros enabled the testing system to find weaknesses much faster than the previous approach, respectively to find weaknesses that the previous approach was not able to find within the given resource limit.","PeriodicalId":358795,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence and Games","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127217608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Temporal difference learning with interpolated table value functions
Pub Date: 2009-09-07 | DOI: 10.1109/CIG.2009.5286496
S. Lucas
This paper introduces a novel function approximation architecture especially well suited to temporal difference learning. The architecture is based on sets of interpolated table look-up functions. These offer rapid and stable learning, and are efficient when the number of inputs is small. An empirical investigation tests their performance on a supervised learning task and on the mountain car problem, a standard reinforcement learning benchmark. In each case, the interpolated table functions offer competitive performance.
{"title":"Temporal difference learning with interpolated table value functions","authors":"S. Lucas","doi":"10.1109/CIG.2009.5286496","DOIUrl":"https://doi.org/10.1109/CIG.2009.5286496","url":null,"abstract":"This paper introduces a novel function approximation architecture especially well suited to temporal difference learning. The architecture is based on using sets of interpolated table look-up functions. These offer rapid and stable learning, and are efficient when the number of inputs is small. An empirical investigation is conducted to test their performance on a supervised learning task, and on themountain car problem, a standard reinforcement learning benchmark. In each case, the interpolated table functions offer competitive performance.","PeriodicalId":358795,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence and Games","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127225939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning drivers for TORCS through imitation using supervised methods
Pub Date: 2009-09-07 | DOI: 10.1109/CIG.2009.5286480
L. Cardamone, D. Loiacono, P. Lanzi
In this paper we apply imitation learning to develop drivers for The Open Racing Car Simulator (TORCS). Our approach can be classified as a direct method in that it applies supervised learning to learn car-racing behaviors from data collected from other drivers. In the literature, this approach is known to have led to extremely poor performance, with drivers capable of completing only very small parts of a track. In this paper we show that, by using high-level information about the track ahead of the car and by predicting high-level actions, it is possible to develop drivers whose performance is in some cases only 15% below that of the fastest driver available in TORCS. Our experimental results suggest that our approach can effectively develop drivers that perform well on non-trivial tracks using a very limited amount of data and computational resources. We analyze the driving behavior of the resulting controllers and identify perceptual aliasing as one factor limiting the performance of our approach.
{"title":"Learning drivers for TORCS through imitation using supervised methods","authors":"L. Cardamone, D. Loiacono, P. Lanzi","doi":"10.1109/CIG.2009.5286480","DOIUrl":"https://doi.org/10.1109/CIG.2009.5286480","url":null,"abstract":"In this paper, we apply imitation learning to develop drivers for The Open Racing Car Simulator (TORCS). Our approach can be classified as a direct method in that it applies supervised learning to learn car racing behaviors from the data collected from other drivers. In the literature, this approach is known to have led to extremely poor performance with drivers capable of completing only very small parts of a track. In this paper we show that, by using high-level information about the track ahead of the car and by predicting high-level actions, it is possible to develop drivers with performances that in some cases are only 15% lower than the performance of the fastest driver available in TORCS. Our experimental results suggest that our approach can be effective in developing drivers with good performance in non-trivial tracks using a very limited amount of data and computational resources. We analyze the driving behavior of the controllers developed using our approach and identify perceptual aliasing as one of the factors which can limit performance of our approach.","PeriodicalId":358795,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence and Games","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131549477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chuck Norris rocks!
Pub Date: 2009-09-07 | DOI: 10.1109/CIG.2009.5286493
D. Cuadrado, Y. Sáez
In this introductory work we present our first approach to a computer-controlled player for first-person shooter video games, in which we applied genetic algorithms to evolve the best dodge rules. This paper reports the results obtained by our bot during the first competition, held in Trondheim, Norway, at the IEEE Congress on Evolutionary Computation (CEC 2009).
{"title":"Chuck Norris rocks!","authors":"D. Cuadrado, Y. Sáez","doi":"10.1109/CIG.2009.5286493","DOIUrl":"https://doi.org/10.1109/CIG.2009.5286493","url":null,"abstract":"In this introductory work we present our first approach to a computer controller player for first-person shooter videogames, where we have applied genetic algorithms in order to evolve the best dodge rules. This paper is a report of the results obtained by our bot during the first competition held in Trondheim, Norway, during the IEEE Congress on Evolutionary Computation (CEC 2009).","PeriodicalId":358795,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence and Games","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128175966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Capturing augmented sensing capabilities and intrusion delay in patrolling-intrusion games
Pub Date: 2009-09-07 | DOI: 10.1109/CIG.2009.5286477
Nicola Basilico, N. Gatti, Thomas Rossi
Patrolling-intrusion games have recently been receiving more and more attention in the literature. They are two-player non-zero-sum games in which an intruder tries to attack one place of interest and one or more patrollers try to capture the intruder. The patroller cannot completely cover the environment by following a cycle; otherwise the intruder would successfully strike at least one target. Thus, the patroller employs a randomized strategy. These games are usually studied as leader-follower games, where the patroller is the leader and the intruder is the follower. The models proposed in the state of the art so far have several limitations that prevent their employment in realistic settings. In this paper we refine the state-of-the-art models to capture the patroller's augmented sensing capabilities and a possible delay in the intrusion, propose algorithms to solve our extensions efficiently, and experimentally evaluate the computation time in several case studies.
{"title":"Capturing augmented sensing capabilities and intrusion delay in patrolling-intrusion games","authors":"Nicola Basilico, N. Gatti, Thomas Rossi","doi":"10.1109/CIG.2009.5286477","DOIUrl":"https://doi.org/10.1109/CIG.2009.5286477","url":null,"abstract":"Patrolling-intrusion games are recently receiving more and more attention in the literature. They are twoplayer non zero-sum games where an intruder tries to attack one place of interest and one patroller (or more) tries to capture the intruder. The patroller cannot completely cover the environment following a cycle, otherwise the intruder will successfully strike at least a target. Thus, the patroller employs a randomized strategy. These games are usually studied as leader-follower games, where the patroller is the leader and the intruder is the follower. The models proposed in the state of the art so far present several limitations that prevent their employment in realistic settings. In this paper, we refine the models from the state-of-the-art capturing patroller's augmented sensing capabilities and a possible delay in the intrusion, we propose algorithms to solve efficiently our extensions, and we experimentally evaluate the computational time in some case studies.","PeriodicalId":358795,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence and Games","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128695642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimal strategy selection of non-player character on real time strategy game using a speciated evolutionary algorithm
Pub Date: 2009-09-07 | DOI: 10.1109/CIG.2009.5286490
Su-Hyung Jang, Jongwon Yoon, Sung-Bae Cho
In real-time strategy games, the success of the AI depends on continuous and effective decision making about the actions of NPCs in the game. Many researchers have accordingly tried to find the optimal choice. This paper demonstrates improved NPC performance in a real-time strategy game by using a speciated evolutionary algorithm, which has largely been applied to classification problems, for such action decisions. The members used in this ensemble method are created and selected through speciation, and the performance is verified on 'Conqueror', a real-time strategy game platform developed in our previous work.
{"title":"Optimal strategy selection of non-player character on real time strategy game using a speciated evolutionary algorithm","authors":"Su-Hyung Jang, Jongwon Yoon, Sung-Bae Cho","doi":"10.1109/CIG.2009.5286490","DOIUrl":"https://doi.org/10.1109/CIG.2009.5286490","url":null,"abstract":"In the real-time strategy game, success of AI depends on consecutive and effective decision making on actions by NPCs in the game. In this regard, there have been many researchers to find the optimized choice. This paper confirms the improvement of NPC performance in a real-time strategy game by using the speciated evolutionary algorithm for such decision making on actions, which has been largely applied to the classification problems. Creation and selection of members to use for this ensemble method is manifested through speciation and the performance is verified through ‘conqueror’, a real-time strategy game platform developed by our previous work.","PeriodicalId":358795,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence and Games","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114575109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}