Feedback Linearization for Uncertain Systems via Reinforcement Learning

2020 IEEE International Conference on Robotics and Automation (ICRA). Pub Date: 2020-05-01. DOI: 10.1109/ICRA40945.2020.9197158

T. Westenbroek, David Fridovich-Keil, Eric V. Mazumdar, Shreyas Arora, Valmik Prabhu, S. Sastry, C. Tomlin

Pages: 1364-1371

Citations: 17

Abstract

We present a novel approach to control design for nonlinear systems which leverages model-free policy optimization techniques to learn a linearizing controller for a physical plant with unknown dynamics. Feedback linearization is a technique from nonlinear control which renders the input-output dynamics of a nonlinear plant linear under application of an appropriate feedback controller. Once a linearizing controller has been constructed, desired output trajectories for the nonlinear plant can be tracked using a variety of linear control techniques. However, the calculation of a linearizing controller requires a precise dynamics model for the system. As a result, model-based approaches for learning exact linearizing controllers generally require a simple, highly structured model of the system with easily identifiable parameters. In contrast, the model-free approach presented in this paper is able to approximate the linearizing controller for the plant using general function approximation architectures. Specifically, we formulate a continuous-time optimization problem over the parameters of a learned linearizing controller whose optima are the set of parameters which best linearize the plant. We derive conditions under which the learning problem is (strongly) convex and provide guarantees which ensure the true linearizing controller for the plant is recovered. We then discuss how model-free policy optimization algorithms can be used to solve a discrete-time approximation to the problem using data collected from the real-world plant. The utility of the framework is demonstrated in simulation and on a real-world robotic platform.
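To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a hypothetical scalar plant with relative degree one, y_dot = -A*sin(x) + B*u with y = x, whose constants A and B are hidden from the learner. The controller u = theta[0]*sin(x) + theta[1]*v is linear in its parameters, so the loss (the mean squared gap between the measured y_dot and the desired virtual input v) is convex in theta. A finite-difference gradient estimate stands in for the model-free policy optimization methods the abstract mentions; all names and constants here are illustrative assumptions.

```python
import math

# Hypothetical plant constants (assumed for illustration; the learner never reads them).
A, B = 2.0, 0.5

def plant_output_rate(x, u):
    """Black-box plant: returns y_dot for state x and input u (with y = x)."""
    return -A * math.sin(x) + B * u

def controller(theta, x, v):
    """Learned controller, linear in theta; v is the desired value of y_dot."""
    return theta[0] * math.sin(x) + theta[1] * v

# Fixed grid of (state, virtual input) pairs standing in for collected data.
SAMPLES = [(x / 4.0, v / 2.0) for x in range(-6, 7) for v in range(-4, 5)]

def linearization_loss(theta):
    """Mean squared gap between the measured y_dot and the commanded v."""
    total = 0.0
    for x, v in SAMPLES:
        u = controller(theta, x, v)
        total += (plant_output_rate(x, u) - v) ** 2
    return total / len(SAMPLES)

def finite_difference_gradient(theta, eps=1e-4):
    """Central-difference gradient estimate that only queries the loss."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        up[i] += eps
        down = list(theta)
        down[i] -= eps
        grad.append((linearization_loss(up) - linearization_loss(down)) / (2 * eps))
    return grad

def learn(steps=500, lr=0.5):
    """Plain gradient descent on the loss, starting from theta = [0, 0]."""
    theta = [0.0, 0.0]
    for _ in range(steps):
        grad = finite_difference_gradient(theta)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

theta = learn()
print("learned parameters:", theta)
print("final loss:", linearization_loss(theta))
```

For this assumed plant the exact linearizing controller is u = (A/B)*sin(x) + (1/B)*v, so the learned parameters should approach [A/B, 1/B], at which point the closed loop satisfies y_dot = v and any linear tracking controller can be used to choose v.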


