Image Augmentation-Based Momentum Memory Intrinsic Reward for Sparse Reward Visual Scenes

IF 2.8 · Zone 4, Computer Science · JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · IEEE Transactions on Games · Pub Date: 2023-06-20 · DOI: 10.1109/TG.2023.3288042

Zheng Fang, Biao Zhao, Guizhong Liu

IEEE Transactions on Games, vol. 16, no. 3, pp. 509-517. https://ieeexplore.ieee.org/document/10158428/


Abstract

Many real-life tasks can be abstracted as sparse-reward visual scenes, in which an agent that receives only images and a sparse reward struggles to accomplish its task. To address this problem, we split it into two parts, visual representation and sparse reward, and propose a novel framework, image augmentation-based momentum memory intrinsic reward, which combines self-supervised representation learning with intrinsic motivation. For visual representation, we learn a representation driven by a combination of image-augmented forward dynamics and reward. To handle sparse reward, we design a new type of intrinsic reward, the momentum memory intrinsic reward, which uses the difference between the outputs of the current model (online network) and the historical model (target network) to indicate how familiar the agent is with a state. We evaluate our method on a visual navigation task with sparse reward in VizDoom and demonstrate that it achieves state-of-the-art sample efficiency. Our method is at least two times faster than existing methods and reaches a 100% success rate.
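The abstract describes the intrinsic reward only at a high level: the gap between the online network's output and the target network's output for the same state. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a toy linear encoder, a squared Euclidean distance as the "difference", and an exponential-moving-average (momentum) update for the target network; the class name `MomentumMemory`, its methods, and the momentum value 0.99 are hypothetical choices made for illustration.

```python
import random


class MomentumMemory:
    """Toy online/target encoder pair (illustrative assumption, not the paper's model)."""

    def __init__(self, obs_dim, feat_dim, momentum=0.99, seed=0):
        rng = random.Random(seed)
        # Online weights: in a real agent these would be trained by gradient descent.
        self.online = [[rng.gauss(0.0, 0.1) for _ in range(feat_dim)]
                       for _ in range(obs_dim)]
        # Target weights start as an exact copy of the online weights.
        self.target = [row[:] for row in self.online]
        self.momentum = momentum

    @staticmethod
    def _encode(weights, obs):
        # Linear encoder: feature j is the dot product of obs with column j.
        feat_dim = len(weights[0])
        return [sum(obs[i] * weights[i][j] for i in range(len(obs)))
                for j in range(feat_dim)]

    def intrinsic_reward(self, obs):
        # Assumed form of the "difference": squared L2 distance between outputs.
        z_online = self._encode(self.online, obs)
        z_target = self._encode(self.target, obs)
        return sum((a - b) ** 2 for a, b in zip(z_online, z_target))

    def update_target(self):
        # Momentum (EMA) update: the target slowly tracks the online weights.
        m = self.momentum
        self.target = [[m * t + (1 - m) * o for t, o in zip(t_row, o_row)]
                       for t_row, o_row in zip(self.target, self.online)]


mem = MomentumMemory(obs_dim=4, feat_dim=3)
obs = [1.0, 1.0, 1.0, 1.0]
reward_before = mem.intrinsic_reward(obs)   # networks agree at initialization

# Stand-in for a training step that changes the online network.
for row in mem.online:
    row[:] = [w + 0.5 for w in row]
reward_after_step = mem.intrinsic_reward(obs)

mem.update_target()                          # target moves toward online
reward_after_update = mem.intrinsic_reward(obs)
```

In this sketch the reward is zero while the two networks agree, rises when the online network changes, and shrinks as the target catches up, which is one way to read the gap as a familiarity signal. How the paper trains the online network and scales the reward is not stated in the abstract.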
