Probabilistic control and majorisation of optimal control

Systems & Control Letters (IF 2.1; Zone 3, Computer Science; JCR Q3, Automation & Control Systems)
Pub Date: 2024-06-05
DOI: 10.1016/j.sysconle.2024.105837
Tom Lefebvre
Citations: 0

Abstract

Probabilistic control design is founded on the principle that a rational agent attempts to match the modelled closed-loop system trajectory density with an arbitrary desired one. The framework was originally proposed as a tractable alternative to traditional optimal control design, parametrizing desired behaviour through fictitious transition and policy densities and using the information projection as a proximity measure. In this work we introduce an alternative parametrization of desired closed-loop behaviour and explore alternative proximity measures between densities. It is then illustrated how the associated probabilistic control problems yield uncertain or probabilistic policies. Our main result is to show that the probabilistic control objectives majorize conventional, stochastic and risk-sensitive, optimal control objectives. This observation allows us to identify two probabilistic fixed-point iterations that converge to the deterministic optimal control policies, establishing an explicit connection between the two formulations. Furthermore, we demonstrate that the risk-sensitive optimal control formulation is also technically equivalent to a Maximum Likelihood estimation problem on a probabilistic graph model where the notion of costs is directly encoded into the model. The associated treatment of the estimation problem is then shown to coincide with the moment-projected probabilistic control formulation. That way, optimal decision making can be reformulated as an iterative inference problem. Based on these insights we discuss directions for algorithmic development.
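To give a feel for what "majorize" can mean in this setting, here is a minimal numerical sketch of a standard single-step identity (the Gibbs variational principle). It is not taken from the paper, and the cost vector, reference policy, and function names are hypothetical. For a finite action set with costs `cost` and a reference policy `rho`, the objective E_pi[cost] + KL(pi || rho) is an upper bound on the log-sum-exp cost -log E_rho[exp(-cost)] for every policy pi, and the bound is tight at pi* proportional to rho * exp(-cost).

```python
import math
import random


def kl(p, q):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def probabilistic_objective(pi, cost, rho):
    """Expected cost under pi plus the penalty KL(pi || rho)."""
    expected_cost = sum(p * c for p, c in zip(pi, cost))
    return expected_cost + kl(pi, rho)


def log_sum_exp_cost(cost, rho):
    """Log-sum-exp cost -log E_rho[exp(-cost)] (assumed sign convention)."""
    return -math.log(sum(r * math.exp(-c) for r, c in zip(rho, cost)))


def gibbs_policy(cost, rho):
    """Minimiser of the probabilistic objective: pi* proportional to rho * exp(-cost)."""
    weights = [r * math.exp(-c) for r, c in zip(rho, cost)]
    total = sum(weights)
    return [w / total for w in weights]


def random_policy(n, rng):
    """Random strictly positive distribution over n actions."""
    weights = [rng.uniform(0.05, 1.0) for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical problem data, chosen only for illustration.
cost = [1.0, 0.2, 2.5, 0.7]
rho = [0.25, 0.25, 0.25, 0.25]

target = log_sum_exp_cost(cost, rho)
pi_star = gibbs_policy(cost, rho)

# Gap between the upper bound and the target for many random policies.
rng = random.Random(0)
policies = [random_policy(len(cost), rng) for _ in range(1000)]
gaps = [probabilistic_objective(pi, cost, rho) - target for pi in policies]

min_gap = min(gaps)
gap_at_optimum = probabilistic_objective(pi_star, cost, rho) - target
```

Under these assumptions the gap for any policy pi equals KL(pi || pi*), so `min_gap` is non-negative and `gap_at_optimum` vanishes up to floating-point error. The paper's trajectory-level objectives and fixed-point iterations are more general than this one-step check.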

Source journal
Systems & Control Letters (Engineering & Technology: Operations Research and Management Science)
CiteScore: 4.60
Self-citation rate: 3.80%
Articles published: 144
Review time: 6 months
Journal description: Founded in 1981 by two of the pre-eminent control theorists, Roger Brockett and Jan Willems, Systems & Control Letters is one of the leading journals in the field of control theory. The aim of the journal is to allow dissemination of relatively concise but highly original contributions whose high initial quality enables a relatively rapid review process. All aspects of the fields of systems and control are covered, especially mathematically-oriented and theoretical papers that have a clear relevance to engineering, physical and biological sciences, and even economics. Application-oriented papers with sophisticated and rigorous mathematical elements are also welcome.