Transferable dynamics models for efficient object-oriented reinforcement learning

IF 5.1 | Zone 2 (Computer Science) | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Artificial Intelligence | Pub Date: 2024-01-26 | DOI: 10.1016/j.artint.2024.104079
Ofir Marom, Benjamin Rosman
Artificial Intelligence, Volume 329, Article 104079. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0004370224000158
Citations: 0

Abstract

The Reinforcement Learning (RL) framework offers a general paradigm for constructing autonomous agents that can make effective decisions when solving tasks. An important area of study within the field of RL is transfer learning, where an agent utilizes knowledge gained from solving previous tasks to solve a new task more efficiently. While the notion of transfer learning is conceptually appealing, in practice, not all RL representations are amenable to transfer learning. Moreover, much of the research on transfer learning in RL is purely empirical. Previous research has shown that object-oriented representations are suitable for the purposes of transfer learning with theoretical efficiency guarantees. Such representations leverage the notion of object classes to learn lifted rules that apply to grounded object instantiations. In this paper, we extend previous research on object-oriented representations and introduce two formalisms: the first is based on deictic predicates, and is used to learn a transferable transition dynamics model; the second is based on propositions, and is used to learn a transferable reward dynamics model. In addition, we extend previously introduced efficient learning algorithms for object-oriented representations to our proposed formalisms. Our frameworks are then combined into a single efficient algorithm that learns transferable transition and reward dynamics models across a domain of related tasks. We illustrate our proposed algorithm empirically on an extended version of the Taxi domain, as well as the more difficult Sokoban domain, showing the benefits of our approach with regard to efficient learning and transfer.
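
To make the idea of lifted rules over object classes more concrete, the following is a minimal illustrative sketch and not the authors' formalism or algorithm. It assumes a toy grid world in the spirit of the Taxi domain; the names `Obj`, `touch_north`, `step_north`, and `transition` are hypothetical, and the rule is hand-written here rather than learned. The point it illustrates is that a rule stated once at the class level, using a predicate evaluated relative to one grounded object, can be reused unchanged across tasks with different layouts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    cls: str  # object class, e.g. "taxi" or "wall" (hypothetical class names)
    x: int
    y: int


def touch_north(me, other_cls, state):
    """Hypothetical deictic-style predicate, evaluated relative to `me`:
    is some object of class `other_cls` directly north of `me`?"""
    return any(
        o.cls == other_cls and (o.x, o.y) == (me.x, me.y + 1) for o in state
    )


def step_north(me, state):
    """Lifted rule for class "taxi" under action "north".
    It is written once per class, not per object instance or per task."""
    if touch_north(me, "wall", state):
        return me  # blocked by a wall: position unchanged
    return Obj(me.cls, me.x, me.y + 1)


def transition(state, action):
    """Ground the lifted rule on every taxi instance in the state."""
    if action != "north":
        raise ValueError("this sketch only models the 'north' action")
    return [step_north(o, state) if o.cls == "taxi" else o for o in state]


# Two tasks with different layouts share the same lifted rule.
task_a = [Obj("taxi", 0, 0), Obj("wall", 0, 1)]
task_b = [Obj("taxi", 3, 2), Obj("wall", 5, 5)]
print(transition(task_a, "north"))  # taxi is blocked by the wall
print(transition(task_b, "north"))  # taxi moves one cell north
```

In this sketch the transfer comes for free because nothing in `step_north` refers to a specific object or map; a learned model in the paper's setting would have to discover such class-level rules from experience.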

Source journal
Artificial Intelligence (Engineering & Technology: Computer Science, Artificial Intelligence)
CiteScore: 11.20
Self-citation rate: 1.40%
Articles published: 118
Review time: 8 months
About the journal: The Journal of Artificial Intelligence (AIJ) welcomes papers covering a broad spectrum of AI topics, including cognition, automated reasoning, computer vision, machine learning, and more. Papers should demonstrate advancements in AI and propose innovative approaches to AI problems. Additionally, the journal accepts papers describing AI applications, focusing on how new methods enhance performance rather than reiterating conventional approaches. In addition to regular papers, AIJ also accepts Research Notes, Research Field Reviews, Position Papers, Book Reviews, and summary papers on AI challenges and competitions.