Action Priors for Learning Domain Invariances

Benjamin Rosman, S. Ramamoorthy
{"title":"Action Priors for Learning Domain Invariances","authors":"Benjamin Rosman, S. Ramamoorthy","doi":"10.1109/TAMD.2015.2419715","DOIUrl":null,"url":null,"abstract":"An agent tasked with solving a number of different decision making problems in similar environments has an opportunity to learn over a longer timescale than each individual task. Through examining solutions to different tasks, it can uncover behavioral invariances in the domain, by identifying actions to be prioritized in local contexts, invariant to task details. This information has the effect of greatly increasing the speed of solving new problems. We formalise this notion as action priors, defined as distributions over the action space, conditioned on environment state, and show how these can be learnt from a set of value functions. We apply action priors in the setting of reinforcement learning, to bias action selection during exploration. Aggressive use of action priors performs context based pruning of the available actions, thus reducing the complexity of lookahead during search. We additionally define action priors over observation features, rather than states, which provides further flexibility and generalizability, with the additional benefit of enabling feature selection. Action priors are demonstrated in experiments in a simulated factory environment and a large random graph domain, and show significant speed ups in learning new tasks. 
Furthermore, we argue that this mechanism is cognitively plausible, and is compatible with findings from cognitive psychology.","PeriodicalId":49193,"journal":{"name":"IEEE Transactions on Autonomous Mental Development","volume":"7 1","pages":"107-118"},"PeriodicalIF":0.0000,"publicationDate":"2015-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TAMD.2015.2419715","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Autonomous Mental Development","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TAMD.2015.2419715","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

An agent tasked with solving a number of different decision-making problems in similar environments has an opportunity to learn over a longer timescale than each individual task. Through examining solutions to different tasks, it can uncover behavioral invariances in the domain, by identifying actions to be prioritized in local contexts, invariant to task details. This information has the effect of greatly increasing the speed of solving new problems. We formalise this notion as action priors, defined as distributions over the action space, conditioned on environment state, and show how these can be learnt from a set of value functions. We apply action priors in the setting of reinforcement learning, to bias action selection during exploration. Aggressive use of action priors performs context-based pruning of the available actions, thus reducing the complexity of lookahead during search. We additionally define action priors over observation features, rather than states, which provides further flexibility and generalizability, with the additional benefit of enabling feature selection. Action priors are demonstrated in experiments in a simulated factory environment and a large random graph domain, and show significant speed-ups in learning new tasks. Furthermore, we argue that this mechanism is cognitively plausible, and is compatible with findings from cognitive psychology.
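The core idea from the abstract can be illustrated with a minimal sketch (not the authors' code; all names and the Dirichlet-style smoothing constant `alpha` are illustrative assumptions): per state, count how often each action is greedy under the Q-functions of previously solved tasks, normalize the counts into a distribution P(a | s), and then sample exploration actions from that prior rather than uniformly.

```python
# Minimal sketch of state-conditioned action priors, assuming tabular
# Q-functions indexed by (state, action) pairs. Hypothetical names throughout.
import random
from collections import defaultdict

def learn_action_prior(value_functions, states, actions, alpha=1.0):
    """Count, per state, how often each action is greedy across tasks,
    and return smoothed distributions P(a | s)."""
    # Start every count at alpha so unseen actions keep nonzero probability.
    counts = defaultdict(lambda: {a: alpha for a in actions})
    for Q in value_functions:  # one learned Q-function per solved task
        for s in states:
            best = max(actions, key=lambda a: Q[(s, a)])
            counts[s][best] += 1.0
    prior = {}
    for s in states:
        total = sum(counts[s].values())
        prior[s] = {a: counts[s][a] / total for a in actions}
    return prior

def explore_action(prior, s, actions, epsilon, Q):
    """Epsilon-greedy action selection where the exploration step samples
    from the action prior instead of the uniform distribution."""
    if random.random() < epsilon:
        r, acc = random.random(), 0.0
        for a in actions:  # inverse-CDF sample from P(a | s)
            acc += prior[s][a]
            if r <= acc:
                return a
        return actions[-1]
    return max(actions, key=lambda a: Q[(s, a)])
```

Biasing only the exploration draw keeps the learner's convergence machinery untouched while concentrating exploratory effort on actions that were useful across earlier tasks; driving epsilon-sampling fully by the prior recovers the aggressive pruning behaviour the abstract mentions, since zero-probability actions are never explored.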