Combining Imagination and Heuristics to Learn Strategies that Generalize

Neurons, behavior, data analysis and theory. Pub Date: 2018-09-10. DOI: 10.51628/001c.13477

Erik J Peterson, Necati Alp Müyesser, T. Verstynen, Kyle Dunovan


Citations: 1

Abstract


Combining Imagination and Heuristics to Learn Strategies that Generalize

Deep reinforcement learning can match or exceed human performance in stable contexts, but after minor changes to the environment, artificial networks, unlike humans, often cannot adapt. Humans rely on a combination of heuristics, which simplify computational load, and imagination, which extends experiential learning to new and more challenging environments. Motivated by theories of the hierarchical organization of human prefrontal networks, we have developed a model of hierarchical reinforcement learning that combines heuristics and imagination in a “stumbler-strategist” network. We test the performance of this network using Wythoff’s game, a gridworld environment with a known optimal strategy. We show that a heuristic labeling of each position as hot or cold, combined with imagined play, both accelerates learning and promotes transfer to novel games, while also improving model interpretability.
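
To make the hot/cold labeling concrete, here is a minimal Python sketch of the known optimal structure of Wythoff's game. It is not the authors' stumbler-strategist network. It assumes the standard rules (a move removes any positive amount from one coordinate, or the same amount from both, and the player who reaches (0, 0) wins) and the usual convention that a position is cold when the player to move loses under optimal play and hot otherwise. The function names `cold_positions` and `cold_pair` are hypothetical, chosen for illustration.

```python
from math import isqrt


def cold_positions(n):
    """Label an n x n board by brute force; return the set of cold positions.

    A position is cold if no legal move leads to a cold position,
    which makes (0, 0) cold because it has no moves at all.
    """
    cold = set()
    for a in range(n):
        for b in range(n):
            moves = (
                [(k, b) for k in range(a)]        # shrink the first coordinate
                + [(a, k) for k in range(b)]      # shrink the second coordinate
                + [(a - k, b - k) for k in range(1, min(a, b) + 1)]  # diagonal
            )
            # Every move target was visited earlier in this loop order.
            if not any(m in cold for m in moves):
                cold.add((a, b))
    return cold


def cold_pair(k):
    """k-th cold pair (floor(k*phi), floor(k*phi) + k), in integer arithmetic."""
    low = (k + isqrt(5 * k * k)) // 2  # equals floor(k * golden ratio)
    return (low, low + k)


if __name__ == "__main__":
    board = cold_positions(15)
    print(sorted(p for p in board if p[0] <= p[1]))
    print([cold_pair(k) for k in range(6)])
```

Every position not returned by `cold_positions` is hot: the player to move can win by stepping to a cold position. The closed form in `cold_pair` is the classical golden-ratio characterization of the cold positions, so the two functions should agree on any board size.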

