研究强化学习中神经网络表征的特性

IF 5.1 2区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Artificial Intelligence Pub Date : 2024-03-01 DOI:10.1016/j.artint.2024.104100
Han Wang , Erfan Miahi , Martha White , Marlos C. Machado , Zaheer Abbas , Raksha Kumaraswamy , Vincent Liu , Adam White
{"title":"研究强化学习中神经网络表征的特性","authors":"Han Wang ,&nbsp;Erfan Miahi ,&nbsp;Martha White ,&nbsp;Marlos C. Machado ,&nbsp;Zaheer Abbas ,&nbsp;Raksha Kumaraswamy ,&nbsp;Vincent Liu ,&nbsp;Adam White","doi":"10.1016/j.artint.2024.104100","DOIUrl":null,"url":null,"abstract":"<div><p>In this paper we investigate the properties of representations learned by deep reinforcement learning systems. Much of the early work on representations for reinforcement learning focused on designing fixed-basis architectures to achieve properties thought to be desirable, such as orthogonality and sparsity. In contrast, the idea behind deep reinforcement learning methods is that the agent designer should not encode representational properties, but rather that the data stream should determine the properties of the representation—good representations emerge under appropriate training schemes. In this paper we bring these two perspectives together, empirically investigating the properties of representations that support transfer in reinforcement learning. We introduce and measure six representational properties over more than 25,000 agent-task settings. We consider Deep Q-learning agents with different auxiliary losses in a pixel-based navigation environment, with source and transfer tasks corresponding to different goal locations. We develop a method to better understand <em>why</em> some representations work better for transfer, through a systematic approach varying task similarity and measuring and correlating representation properties with transfer performance. We demonstrate the generality of the methodology by investigating representations learned by a Rainbow agent that successfully transfers across Atari 2600 game modes.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"330 ","pages":"Article 104100"},"PeriodicalIF":5.1000,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0004370224000365/pdfft?md5=6e885307a80c3c36ff25f169599f1f61&pid=1-s2.0-S0004370224000365-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Investigating the properties of neural network representations in reinforcement learning\",\"authors\":\"Han Wang ,&nbsp;Erfan Miahi ,&nbsp;Martha White ,&nbsp;Marlos C. Machado ,&nbsp;Zaheer Abbas ,&nbsp;Raksha Kumaraswamy ,&nbsp;Vincent Liu ,&nbsp;Adam White\",\"doi\":\"10.1016/j.artint.2024.104100\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>In this paper we investigate the properties of representations learned by deep reinforcement learning systems. Much of the early work on representations for reinforcement learning focused on designing fixed-basis architectures to achieve properties thought to be desirable, such as orthogonality and sparsity. In contrast, the idea behind deep reinforcement learning methods is that the agent designer should not encode representational properties, but rather that the data stream should determine the properties of the representation—good representations emerge under appropriate training schemes. In this paper we bring these two perspectives together, empirically investigating the properties of representations that support transfer in reinforcement learning. We introduce and measure six representational properties over more than 25,000 agent-task settings. We consider Deep Q-learning agents with different auxiliary losses in a pixel-based navigation environment, with source and transfer tasks corresponding to different goal locations. We develop a method to better understand <em>why</em> some representations work better for transfer, through a systematic approach varying task similarity and measuring and correlating representation properties with transfer performance. We demonstrate the generality of the methodology by investigating representations learned by a Rainbow agent that successfully transfers across Atari 2600 game modes.</p></div>\",\"PeriodicalId\":8434,\"journal\":{\"name\":\"Artificial Intelligence\",\"volume\":\"330 \",\"pages\":\"Article 104100\"},\"PeriodicalIF\":5.1000,\"publicationDate\":\"2024-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S0004370224000365/pdfft?md5=6e885307a80c3c36ff25f169599f1f61&pid=1-s2.0-S0004370224000365-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Artificial Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0004370224000365\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0004370224000365","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

在本文中,我们研究了深度强化学习系统学习到的表征的特性。早期有关强化学习表征的工作大多集中在设计固定基础架构,以实现人们认为理想的属性,如正交性和稀疏性。相比之下,深度强化学习方法背后的理念是,代理设计者不应编码表征属性,而应由数据流决定表征属性--好的表征会在适当的训练方案下出现。在本文中,我们将这两种观点结合在一起,通过实证研究支持强化学习中转移的表征属性。我们在超过 25,000 个代理任务设置中引入并测量了六种表征属性。我们考虑了在基于像素的导航环境中具有不同辅助损失的深度 Q 学习代理,源任务和转移任务对应于不同的目标位置。我们开发了一种方法,通过系统化的方法来改变任务相似性,测量表征属性并将其与转移性能相关联,从而更好地理解为什么某些表征对转移效果更好。我们通过研究成功在 Atari 2600 游戏模式中转移的 Rainbow 代理所学习的表征,证明了该方法的通用性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Investigating the properties of neural network representations in reinforcement learning

In this paper we investigate the properties of representations learned by deep reinforcement learning systems. Much of the early work on representations for reinforcement learning focused on designing fixed-basis architectures to achieve properties thought to be desirable, such as orthogonality and sparsity. In contrast, the idea behind deep reinforcement learning methods is that the agent designer should not encode representational properties, but rather that the data stream should determine the properties of the representation—good representations emerge under appropriate training schemes. In this paper we bring these two perspectives together, empirically investigating the properties of representations that support transfer in reinforcement learning. We introduce and measure six representational properties over more than 25,000 agent-task settings. We consider Deep Q-learning agents with different auxiliary losses in a pixel-based navigation environment, with source and transfer tasks corresponding to different goal locations. We develop a method to better understand why some representations work better for transfer, through a systematic approach varying task similarity and measuring and correlating representation properties with transfer performance. We demonstrate the generality of the methodology by investigating representations learned by a Rainbow agent that successfully transfers across Atari 2600 game modes.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Artificial Intelligence
Artificial Intelligence 工程技术-计算机:人工智能
CiteScore
11.20
自引率
1.40%
发文量
118
审稿时长
8 months
期刊介绍: The Journal of Artificial Intelligence (AIJ) welcomes papers covering a broad spectrum of AI topics, including cognition, automated reasoning, computer vision, machine learning, and more. Papers should demonstrate advancements in AI and propose innovative approaches to AI problems. Additionally, the journal accepts papers describing AI applications, focusing on how new methods enhance performance rather than reiterating conventional approaches. In addition to regular papers, AIJ also accepts Research Notes, Research Field Reviews, Position Papers, Book Reviews, and summary papers on AI challenges and competitions.
期刊最新文献
Lifted action models learning from partial traces Human-AI coevolution Editorial Board Separate but equal: Equality in belief propagation for single-cycle graphs Generative models for grid-based and image-based pathfinding
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1