基于模型和无模型强化学习相结合的混合控制

IF 7.5 1区计算机科学 Q1 ROBOTICS International Journal of Robotics Research Pub Date : 2022-06-02 DOI:10.1177/02783649221083331

Allison Pinosky, Ian Abraham, Alexander Broad, B. Argall, T. Murphey

{"title":"基于模型和无模型强化学习相结合的混合控制","authors":"Allison Pinosky, Ian Abraham, Alexander Broad, B. Argall, T. Murphey","doi":"10.1177/02783649221083331","DOIUrl":null,"url":null,"abstract":"We develop an approach to improve the learning capabilities of robotic systems by combining learned predictive models with experience-based state-action policy mappings. Predictive models provide an understanding of the task and the dynamics, while experience-based (model-free) policy mappings encode favorable actions that override planned actions. We refer to our approach of systematically combining model-based and model-free learning methods as hybrid learning. Our approach efficiently learns motor skills and improves the performance of predictive models and experience-based policies. Moreover, our approach enables policies (both model-based and model-free) to be updated using any off-policy reinforcement learning method. We derive a deterministic method of hybrid learning by optimally switching between learning modalities. We adapt our method to a stochastic variation that relaxes some of the key assumptions in the original derivation. Our deterministic and stochastic variations are tested on a variety of robot control benchmark tasks in simulation as well as a hardware manipulation task. We extend our approach for use with imitation learning methods, where experience is provided through demonstrations, and we test the expanded capability with a real-world pick-and-place task. The results show that our method is capable of improving the performance and sample efficiency of learning motor skills in a variety of experimental domains.","PeriodicalId":54942,"journal":{"name":"International Journal of Robotics Research","volume":"42 1","pages":"337 - 355"},"PeriodicalIF":7.5000,"publicationDate":"2022-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Hybrid control for combining model-based and model-free reinforcement learning\",\"authors\":\"Allison Pinosky, Ian Abraham, Alexander Broad, B. Argall, T. Murphey\",\"doi\":\"10.1177/02783649221083331\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We develop an approach to improve the learning capabilities of robotic systems by combining learned predictive models with experience-based state-action policy mappings. Predictive models provide an understanding of the task and the dynamics, while experience-based (model-free) policy mappings encode favorable actions that override planned actions. We refer to our approach of systematically combining model-based and model-free learning methods as hybrid learning. Our approach efficiently learns motor skills and improves the performance of predictive models and experience-based policies. Moreover, our approach enables policies (both model-based and model-free) to be updated using any off-policy reinforcement learning method. We derive a deterministic method of hybrid learning by optimally switching between learning modalities. We adapt our method to a stochastic variation that relaxes some of the key assumptions in the original derivation. Our deterministic and stochastic variations are tested on a variety of robot control benchmark tasks in simulation as well as a hardware manipulation task. We extend our approach for use with imitation learning methods, where experience is provided through demonstrations, and we test the expanded capability with a real-world pick-and-place task. The results show that our method is capable of improving the performance and sample efficiency of learning motor skills in a variety of experimental domains.\",\"PeriodicalId\":54942,\"journal\":{\"name\":\"International Journal of Robotics Research\",\"volume\":\"42 1\",\"pages\":\"337 - 355\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2022-06-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Robotics Research\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1177/02783649221083331\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ROBOTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Robotics Research","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1177/02783649221083331","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ROBOTICS","Score":null,"Total":0}

引用次数: 7

摘要

我们开发了一种方法，通过将学习的预测模型与基于经验的状态-行动-策略映射相结合，来提高机器人系统的学习能力。预测模型提供了对任务和动态的理解，而基于经验（无模型）的策略映射编码了覆盖计划行动的有利行动。我们将基于模型和无模型的学习方法系统地结合起来的方法称为混合学习。我们的方法有效地学习运动技能，并提高预测模型和基于经验的策略的性能。此外，我们的方法允许使用任何非策略强化学习方法更新策略（基于模型和无模型）。我们通过在学习模式之间进行最佳切换，得出了一种混合学习的确定性方法。我们将我们的方法适应于一种随机变化，这种变化放松了原始推导中的一些关键假设。我们的确定性和随机性变化在各种机器人控制基准任务和硬件操作任务上进行了仿真测试。我们扩展了我们的方法，将其用于模仿学习方法，通过演示提供经验，并通过真实世界的选择和放置任务测试扩展的能力。结果表明，我们的方法能够在各种实验领域提高学习运动技能的性能和样本效率。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Hybrid control for combining model-based and model-free reinforcement learning

We develop an approach to improve the learning capabilities of robotic systems by combining learned predictive models with experience-based state-action policy mappings. Predictive models provide an understanding of the task and the dynamics, while experience-based (model-free) policy mappings encode favorable actions that override planned actions. We refer to our approach of systematically combining model-based and model-free learning methods as hybrid learning. Our approach efficiently learns motor skills and improves the performance of predictive models and experience-based policies. Moreover, our approach enables policies (both model-based and model-free) to be updated using any off-policy reinforcement learning method. We derive a deterministic method of hybrid learning by optimally switching between learning modalities. We adapt our method to a stochastic variation that relaxes some of the key assumptions in the original derivation. Our deterministic and stochastic variations are tested on a variety of robot control benchmark tasks in simulation as well as a hardware manipulation task. We extend our approach for use with imitation learning methods, where experience is provided through demonstrations, and we test the expanded capability with a real-world pick-and-place task. The results show that our method is capable of improving the performance and sample efficiency of learning motor skills in a variety of experimental domains.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

International Journal of Robotics Research 工程技术-机器人学

CiteScore

22.20

自引率

0.00%

发文量

审稿时长

6-12 weeks

期刊介绍： The International Journal of Robotics Research (IJRR) has been a leading peer-reviewed publication in the field for over two decades. It holds the distinction of being the first scholarly journal dedicated to robotics research. IJRR presents cutting-edge and thought-provoking original research papers, articles, and reviews that delve into groundbreaking trends, technical advancements, and theoretical developments in robotics. Renowned scholars and practitioners contribute to its content, offering their expertise and insights. This journal covers a wide range of topics, going beyond narrow technical advancements to encompass various aspects of robotics. The primary aim of IJRR is to publish work that has lasting value for the scientific and technological advancement of the field. Only original, robust, and practical research that can serve as a foundation for further progress is considered for publication. The focus is on producing content that will remain valuable and relevant over time. In summary, IJRR stands as a prestigious publication that drives innovation and knowledge in robotics research.