Approximate dynamic programming for continuous state and control problems

J. Si, Lei Yang, Chao Lu, Jian Sun, S. Mei
Published in: 2009 17th Mediterranean Conference on Control and Automation, 2009-06-24
DOI: 10.1109/MED.2009.5164745 (https://doi.org/10.1109/MED.2009.5164745)
Citation count: 14

Abstract

Dynamic programming (DP) is an approach to computing the optimal control policy over time under nonlinearity and uncertainty by employing the principle of optimality introduced by Richard Bellman. Instead of enumerating all possible control sequences, dynamic programming searches only the admissible state and/or action values that satisfy the principle of optimality, so its computational complexity is much lower than that of direct enumeration. However, the computational effort and the data storage requirement increase exponentially with the dimensionality of the system, which is reflected in the three curses of dimensionality: the state space, the observation space, and the action space. As a result, the traditional DP approach has been limited to solving small problems. This paper provides an overview of the latest developments in a class of approximate/adaptive dynamic programming algorithms, including those applicable to continuous state and continuous control problems. In particular, the paper reviews direct heuristic dynamic programming (direct HDP), its design, and its applications, which include large and complex continuous state and control problems. In addition to the basic principle of direct HDP, the paper includes two application studies: one on a nonlinear tracking problem, and the other on a power grid coordination control problem based on the China southern network.
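The contrast the abstract draws between dynamic programming and direct enumeration can be illustrated with a small sketch. Everything in it is an assumption made for illustration and does not come from the paper: the toy integer state range, the three controls, the quadratic costs, and the function names `step`, `stage_cost`, `solve_by_dp`, and `solve_by_enumeration`. It shows classical tabular DP on a discrete problem, not direct HDP.

```python
from itertools import product

# Hypothetical toy problem (not from the paper): integer state, three controls.
STATES = range(-3, 4)    # admissible states -3..3
CONTROLS = (-1, 0, 1)    # admissible controls
HORIZON = 4              # number of decision stages


def step(x, u):
    """Deterministic dynamics, clipped to the admissible state range."""
    return max(-3, min(3, x + u))


def stage_cost(x, u):
    """Quadratic penalty on the state and on control effort."""
    return x * x + u * u


def solve_by_dp():
    """Backward Bellman recursion: J_t(x) = min_u [c(x, u) + J_{t+1}(f(x, u))]."""
    cost_to_go = {x: x * x for x in STATES}  # terminal cost J_N(x)
    for _ in range(HORIZON):
        # The comprehension reads the previous stage's table before rebinding.
        cost_to_go = {
            x: min(stage_cost(x, u) + cost_to_go[step(x, u)] for u in CONTROLS)
            for x in STATES
        }
    return cost_to_go


def solve_by_enumeration(x0):
    """Brute force: evaluate every one of the 3**HORIZON control sequences."""
    best = float("inf")
    for seq in product(CONTROLS, repeat=HORIZON):
        x, total = x0, 0
        for u in seq:
            total += stage_cost(x, u)
            x = step(x, u)
        best = min(best, total + x * x)  # add the terminal cost
    return best


dp = solve_by_dp()
# Both methods should agree on the optimal cost from every start state.
assert all(dp[x] == solve_by_enumeration(x) for x in STATES)
```

In this sketch the number of control sequences grows as 3**HORIZON, while the DP table is updated once per stage, so the DP work grows linearly with the horizon. The table itself has one entry per state, and that is the part that grows exponentially once the state has several dimensions or must be finely discretized, which is the limitation that motivates the approximate methods the paper reviews.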