On the lack of gradient domination for linear quadratic Gaussian problems with incomplete state information

Hesameddin Mohammadi, M. Soltanolkotabi, M. Jovanović
DOI: 10.1109/CDC45484.2021.9683369
Published in: 2021 60th IEEE Conference on Decision and Control (CDC), 2021-12-14
Citations: 7

Abstract

Policy gradient algorithms in model-free reinforcement learning have been shown to achieve global exponential convergence for the Linear Quadratic Regulator problem despite the lack of convexity. However, extending such guarantees beyond the scope of standard LQR and full-state feedback has remained open. A key enabler for existing results on LQR is the so-called gradient dominance property of the underlying optimization problem that can be used as a surrogate for strong convexity. In this paper, we take a step further by studying the convergence of gradient descent for the Linear Quadratic Gaussian problem and demonstrate through examples that LQG does not satisfy the gradient dominance property. Our study shows the non-uniqueness of equilibrium points and thus disproves the global convergence of policy gradient methods for LQG.
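To make the gradient dominance property concrete, the following sketch runs gradient descent on a toy scalar LQR cost, the full-state-feedback setting where (per the abstract) gradient dominance is known to hold and gradient descent reaches the global optimum. This is an illustration assumed for exposition, not an example from the paper: the system values `a, b, q, r`, the closed-form cost, and the scalar Riccati reference solution are all choices made here.

```python
import math

# Illustrative sketch (not from the paper): gradient descent on a toy
# scalar LQR cost. System: x_{t+1} = a*x_t + b*u_t, u_t = -K*x_t,
# x_0 ~ N(0, 1). For a stabilizing gain (|a - b*K| < 1) the
# infinite-horizon cost has the closed form
#   J(K) = (q + r*K^2) / (1 - (a - b*K)^2).
a, b, q, r = 1.2, 1.0, 1.0, 1.0  # assumed toy values (a > 1: open-loop unstable)

def cost(K):
    c = a - b * K
    return (q + r * K**2) / (1.0 - c**2)

def grad(K):
    # dJ/dK via the quotient rule; d(1 - c^2)/dK = 2*b*c.
    c = a - b * K
    D = 1.0 - c**2
    return (2.0 * r * K * D - (q + r * K**2) * 2.0 * b * c) / D**2

K = 1.5  # stabilizing initial gain: |a - b*K| = 0.3 < 1
for _ in range(20000):
    K -= 0.01 * grad(K)

# Reference optimum from the scalar discrete-time Riccati equation,
#   b^2*P^2 + (r - q*b^2 - a^2*r)*P - q*r = 0,
# with K* = a*b*P / (r + b^2*P); since Var(x_0) = 1, J(K*) = P.
qa = b**2
qb = r - q * b**2 - a**2 * r
qc = -q * r
P = (-qb + math.sqrt(qb**2 - 4.0 * qa * qc)) / (2.0 * qa)
K_star = a * b * P / (r + b**2 * P)
```

The paper's point is that this picture breaks under partial observability: for LQG with a dynamic output-feedback controller, stationary points are not unique (for instance, any similarity transformation of the controller's state-space realization leaves the cost unchanged), so no gradient dominance inequality can hold and the LQR-style global convergence argument fails.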