Heterogeneous team deep Q-learning in low-dimensional multi-agent environments

2016 IEEE Conference on Computational Intelligence and Games (CIG). Pub Date: 2016-09-01. DOI: 10.1109/CIG.2016.7860413

Mateusz Kurek, Wojciech Jaśkowski

Citations: 20

Deep Q-Learning is an effective reinforcement learning method, which has recently obtained human-level performance for a set of Atari 2600 games. Remarkably, the system was trained on the high-dimensional raw visual data. Is Deep Q-Learning equally valid for problems involving a low-dimensional state space? To answer this question, we evaluate the components of Deep Q-Learning (deep architecture, experience replay, target network freezing, and meta-state) on a Keepaway soccer problem, where the state is described only by 13 variables. The results indicate that although experience replay indeed improves the agent performance, target network freezing and meta-state slow down the learning process. Moreover, the deep architecture does not help for this task since a rather shallow network with just two hidden layers worked the best. By selecting the best settings, and employing heterogeneous team learning, we were able to outperform all previous methods applied to Keepaway soccer using a fraction of the runner-up's computational expense. These results extend our understanding of the Deep Q-Learning effectiveness for low-dimensional reinforcement learning tasks.
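Two of the components the abstract evaluates, experience replay and target network freezing, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a linear Q-function as a stand-in for the paper's network, three actions (`N_ACTIONS`), a constant reward of 1.0 on randomly generated placeholder transitions, and a hypothetical freezing interval `SYNC_EVERY`; only the 13-variable state size comes from the abstract.

```python
import random
from collections import deque

STATE_DIM = 13   # the abstract says the Keepaway state has 13 variables
N_ACTIONS = 3    # assumption for illustration; not stated in the abstract


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)  # oldest transitions are dropped first

    def add(self, transition):
        self.data.append(transition)

    def sample(self, batch_size, rng=random):
        # Uniform sampling breaks the correlation between consecutive steps.
        return rng.sample(list(self.data), min(batch_size, len(self.data)))


class LinearQ:
    """Stand-in Q-function: one weight vector per action (not the paper's network)."""

    def __init__(self, rng):
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(STATE_DIM)]
                  for _ in range(N_ACTIONS)]

    def values(self, state):
        return [sum(wi * si for wi, si in zip(row, state)) for row in self.w]

    def copy_from(self, other):
        self.w = [row[:] for row in other.w]  # independent copy of the weights


def td_target(reward, next_state, done, q_net, gamma=0.99):
    """Q-learning target: r if terminal, else r + gamma * max_a Q(s', a)."""
    if done:
        return reward
    return reward + gamma * max(q_net.values(next_state))


def train_step(online, target, buffer, batch_size=32, lr=0.01, rng=random):
    """One minibatch update of `online`, bootstrapping from the frozen `target`."""
    for s, a, r, s2, done in buffer.sample(batch_size, rng):
        error = td_target(r, s2, done, target) - online.values(s)[a]
        # Gradient step on the squared TD error, for the chosen action only.
        online.w[a] = [w + lr * error * x for w, x in zip(online.w[a], s)]


def demo(steps=200, sync_every=50, seed=0):
    """Train on random placeholder transitions (not real Keepaway data)."""
    rng = random.Random(seed)
    online, target = LinearQ(rng), LinearQ(rng)
    target.copy_from(online)
    buffer = ReplayBuffer(capacity=1000)
    for step in range(steps):
        s = [rng.random() for _ in range(STATE_DIM)]
        s2 = [rng.random() for _ in range(STATE_DIM)]
        buffer.add((s, rng.randrange(N_ACTIONS), 1.0, s2, rng.random() < 0.1))
        train_step(online, target, buffer, rng=rng)
        if (step + 1) % sync_every == 0:   # hypothetical freezing interval
            target.copy_from(online)
    return online, target
```

Calling `demo(sync_every=1)` copies the weights after every step, which approximates training without freezing; comparing it against a larger interval is one simple way to probe the kind of slowdown the abstract reports for target network freezing.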
