Curious Replay for Model-based Adaptation
Isaac Kauvar, Christopher Doyle, Linqi Zhou, N. Haber
DOI: 10.48550/arXiv.2306.15934 (https://doi.org/10.48550/arXiv.2306.15934)
Published: 2023-06-28, Proceedings of the ... International Conference on Machine Learning, pages 16018-16048
Citations: 2
Abstract
Agents must be able to adapt quickly as an environment changes. We find that existing model-based reinforcement learning agents are unable to do this well, in part because of how they use past experiences to train their world model. Here, we present Curious Replay -- a form of prioritized experience replay tailored to model-based agents through use of a curiosity-based priority signal. Agents using Curious Replay exhibit improved performance in an exploration paradigm inspired by animal behavior and on the Crafter benchmark. DreamerV3 with Curious Replay surpasses state-of-the-art performance on Crafter, achieving a mean score of 19.4 that substantially improves on the previous high score of 14.5 by DreamerV3 with uniform replay, while also maintaining similar performance on the DeepMind Control Suite. Code for Curious Replay is available at https://github.com/AutonomousAgentsLab/curiousreplay
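The core idea the abstract describes -- prioritized experience replay in which the sampling priority is a curiosity signal rather than, say, TD error -- can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's implementation: the class name `CuriousReplayBuffer`, the FIFO eviction, and the use of world-model prediction loss as the curiosity score are all assumptions for exposition; the paper's actual priority signal and buffer mechanics are not specified in the abstract.

```python
import random


class CuriousReplayBuffer:
    """Minimal sketch of curiosity-prioritized replay (hypothetical,
    for illustration only; see the paper for the actual method)."""

    def __init__(self, capacity, eps=1e-3):
        self.capacity = capacity
        self.eps = eps  # priority floor so no transition's probability hits zero
        self.data = []
        self.priorities = []

    def add(self, transition):
        # New transitions get the current max priority so they are sampled
        # at least once before their curiosity has been measured.
        p = max(self.priorities, default=1.0)
        if len(self.data) >= self.capacity:
            self.data.pop(0)  # simple FIFO eviction (an assumption)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(p)

    def sample(self, k):
        # Draw indices with probability proportional to priority.
        idxs = random.choices(range(len(self.data)),
                              weights=self.priorities, k=k)
        return idxs, [self.data[i] for i in idxs]

    def update(self, idxs, curiosity):
        # After a world-model training step, reset each sampled transition's
        # priority to its measured curiosity (e.g., model prediction loss).
        for i, c in zip(idxs, curiosity):
            self.priorities[i] = c + self.eps
```

In use, the agent would interleave `add` (on new environment steps), `sample` (to draw a world-model training batch), and `update` (to write the freshly computed curiosity scores back), so that surprising transitions keep being replayed while well-modeled ones fade.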