Approximate dynamic programming for continuous state and control problems

2009 17th Mediterranean Conference on Control and Automation. Pub Date: 2009-06-24. DOI: 10.1109/MED.2009.5164745

J. Si, Lei Yang, Chao Lu, Jian Sun, S. Mei


Citations: 14

Abstract

Dynamic programming (DP) is an approach to computing the optimal control policy over time under nonlinearity and uncertainty by employing the principle of optimality introduced by Richard Bellman. Instead of enumerating all possible control sequences, dynamic programming searches only the admissible state and/or action values that satisfy the principle of optimality. The computational complexity is therefore much lower than that of direct enumeration. However, the computational effort and the data storage requirement increase exponentially with the dimensionality of the system, a problem reflected in the three curses of dimensionality: the state space, the observation space, and the action space. As a result, the traditional DP approach has been limited to small problems. This paper provides an overview of the latest developments in a class of approximate/adaptive dynamic programming algorithms, including those applicable to continuous state and continuous control problems. In particular, the paper reviews direct heuristic dynamic programming (direct HDP), its design, and its applications, which include large and complex continuous state and control problems. In addition to the basic principle of direct HDP, the paper includes two application studies: one uses direct HDP in a nonlinear tracking problem, and the other applies it to a power grid coordination control problem based on the China Southern power grid.
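The contrast the abstract draws between dynamic programming and direct enumeration can be illustrated with a minimal sketch. The toy problem below is an assumption made purely for illustration and does not come from the paper: the integer state grid, the three controls, the quadratic cost, the horizon, and the function names `step`, `stage_cost`, `dp_cost_to_go`, and `brute_force_cost` are all hypothetical. It shows exact backward DP, not direct HDP or any other approximate method.

```python
from itertools import product

# Hypothetical toy problem (not from the paper): integer state clipped to
# [-3, 3], controls in {-1, 0, 1}, quadratic stage cost, 5-step horizon,
# zero terminal cost.
STATES = range(-3, 4)
CONTROLS = (-1, 0, 1)
HORIZON = 5


def step(x, u):
    """Deterministic dynamics: move by u, saturating at the grid boundary."""
    return max(-3, min(3, x + u))


def stage_cost(x, u):
    """Penalize distance from the origin and control effort."""
    return x * x + u * u


def dp_cost_to_go():
    """Backward recursion J_k(x) = min_u [cost(x, u) + J_{k+1}(step(x, u))]."""
    cost_to_go = {x: 0 for x in STATES}  # terminal cost is zero
    evaluations = 0
    for _ in range(HORIZON):
        updated = {}
        for x in STATES:
            candidates = [
                stage_cost(x, u) + cost_to_go[step(x, u)] for u in CONTROLS
            ]
            evaluations += len(candidates)
            updated[x] = min(candidates)
        cost_to_go = updated
    return cost_to_go, evaluations


def brute_force_cost(x0):
    """Enumerate every control sequence from x0 and keep the cheapest."""
    best = float("inf")
    sequences = 0
    for seq in product(CONTROLS, repeat=HORIZON):
        sequences += 1
        x, total = x0, 0
        for u in seq:
            total += stage_cost(x, u)
            x = step(x, u)
        best = min(best, total)
    return best, sequences


if __name__ == "__main__":
    J, evaluations = dp_cost_to_go()
    best, sequences = brute_force_cost(3)
    print(f"DP: cost from x=3 is {J[3]}, using {evaluations} evaluations")
    print(f"Enumeration: cost from x=3 is {best}, using {sequences} sequences")
```

In this sketch the DP work grows as horizon × states × controls, while enumeration grows as controls raised to the horizon for each initial state. The DP table itself still has one entry per grid point, so its size grows exponentially with the number of state dimensions, which is the limitation that motivates the approximate methods the paper reviews.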

