Constrained stochastic optimal control with learned importance sampling: A path integral approach

International Journal of Robotics Research · Impact factor: 5 · Zone 1 (Computer Science) · JCR Q1 (Robotics) · Pub Date: 2021-10-12 · DOI: 10.1177/02783649211047890

Jan Carius, René Ranftl, Farbod Farshidian, M. Hutter

Volume 41 1, pages 189-209 · Publication type: Journal Article · Open access: no · Link: https://doi.org/10.1177/02783649211047890

Citations: 6

Abstract


Modern robotic systems are expected to operate robustly in partially unknown environments. This article proposes an algorithm capable of controlling a wide range of high-dimensional robotic systems in such challenging scenarios. Our method is based on the path integral formulation of stochastic optimal control, which we extend with constraint-handling capabilities. Under our control law, the optimal input is inferred from a set of stochastic rollouts of the system dynamics. These rollouts are simulated by a physics engine, placing minimal restrictions on the types of systems and environments that can be modeled. Although sampling-based algorithms are typically not suitable for online control, we demonstrate in this work how importance sampling and constraints can be used to effectively curb the sampling complexity and enable real-time control applications. Furthermore, the path integral framework provides a natural way of incorporating existing control architectures as ancillary controllers for shaping the sampling distribution. Our results reveal that even in cases where the ancillary controller would fail, our stochastic control algorithm provides an additional safety and robustness layer. Moreover, in the absence of an existing ancillary controller, our method can be used to train a parametrized importance sampling policy using data from the stochastic rollouts. The algorithm may thereby bootstrap itself by learning an importance sampling policy offline and then refining it to unseen environments during online control. We validate our results on three robotic systems, including hardware experiments on a quadrupedal robot.
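To make the idea of inferring an input from stochastic rollouts concrete, the following is a minimal sketch of a generic path-integral-style update, not the authors' algorithm. Everything in it is an assumption for illustration: the 1-D point-mass model stands in for the physics engine, the names `rollout_cost` and `path_integral_update` and the parameters `sigma` (input noise standard deviation) and `lam` (temperature) are hypothetical, and constraints and learned sampling policies are left out entirely. Rollouts are sampled around a nominal input sequence, which plays the role of the importance sampler, and each rollout is weighted by its exponentiated cost plus the likelihood-ratio term that accounts for sampling around a nonzero nominal input.

```python
import math
import random


def rollout_cost(x0, controls, dt, target):
    """Simulate a 1-D point mass and return the accumulated state cost."""
    pos, vel = x0
    cost = 0.0
    for u in controls:
        vel += u * dt  # the input acts as an acceleration
        pos += vel * dt
        cost += (pos - target) ** 2 * dt
    return cost


def path_integral_update(x0, nominal, n_samples, sigma, lam, dt, target, rng):
    """Return an updated input sequence and the normalized rollout weights."""
    horizon = len(nominal)
    noises, scores = [], []
    for _ in range(n_samples):
        # Sample a rollout by perturbing the nominal inputs with Gaussian noise.
        eps = [rng.gauss(0.0, sigma) for _ in range(horizon)]
        controls = [u + e for u, e in zip(nominal, eps)]
        state_cost = rollout_cost(x0, controls, dt, target)
        # Likelihood ratio between zero-mean noise and noise centered on the
        # nominal inputs (terms that are constant across samples are dropped).
        correction = lam * sum(u * e for u, e in zip(nominal, eps)) / sigma ** 2
        noises.append(eps)
        scores.append(state_cost + correction)

    # Exponentiated-cost weights; subtracting the minimum avoids underflow
    # of every weight at once and does not change the normalized result.
    best = min(scores)
    weights = [math.exp(-(s - best) / lam) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]

    # The update is the weighted average of the sampled perturbations.
    updated = [
        nominal[t] + sum(w * eps[t] for w, eps in zip(weights, noises))
        for t in range(horizon)
    ]
    return updated, weights


if __name__ == "__main__":
    rng = random.Random(0)
    x0, target, dt = (0.0, 0.0), 1.0, 0.1
    nominal = [0.0] * 20
    print("cost before:", rollout_cost(x0, nominal, dt, target))
    nominal, _ = path_integral_update(x0, nominal, 256, 2.0, 0.1, dt, target, rng)
    print("cost after: ", rollout_cost(x0, nominal, dt, target))
```

In a receding-horizon setting one would typically apply only the first element of the updated sequence, shift the rest forward, and resample; a better nominal sequence (for example from an existing controller) concentrates the samples in low-cost regions, which is the role the abstract assigns to importance sampling.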


Source journal

International Journal of Robotics Research (Engineering and Technology: Robotics)

CiteScore: 22.20

Self-citation rate: 0.00%

Articles published: (not listed)

Review time: 6-12 weeks

About the journal: The International Journal of Robotics Research (IJRR) has been a leading peer-reviewed publication in the field for over two decades. It holds the distinction of being the first scholarly journal dedicated to robotics research. IJRR presents original research papers, articles, and reviews on emerging trends, technical advancements, and theoretical developments in robotics, contributed by renowned scholars and practitioners. The journal covers a wide range of topics, going beyond narrow technical advancements to encompass various aspects of robotics. Its primary aim is to publish work that has lasting value for the scientific and technological advancement of the field. Only original, robust, and practical research that can serve as a foundation for further progress is considered for publication.