Uniform Ergodicity and Ergodic-Risk Constrained Policy Optimization
Shahriar Talebi, Na Li
arXiv - EE - Systems and Control, 2024-09-16
DOI: arxiv-2409.10767 (https://doi.org/arxiv-2409.10767)
Citation count: 0
Abstract
In stochastic systems, risk-sensitive control balances performance with
resilience to less likely events. Although existing methods rely on
finite-horizon risk criteria, this paper introduces limiting-risk
criteria that capture long-term cumulative risks through probabilistic
limit theorems. Extending the Linear Quadratic Regulation (LQR) framework,
we incorporate constraints on these limiting-risk criteria derived from the
asymptotic behavior of cumulative costs, accounting for extreme deviations.
Using tailored Functional Central Limit Theorems (FCLT), we demonstrate that
the time-correlated terms in the limiting-risk criteria converge under strong
ergodicity. We also establish conditions for convergence in non-stationary
settings, characterize the distribution, and provide explicit formulas for
the limiting variance of the risk functional. The FCLT is developed by applying
ergodic theory for Markov chains and obtaining uniform ergodicity of
the controlled process. For quadratic risk functionals on linear dynamics, in
addition to internal stability, the uniform ergodicity requires the (possibly
heavy-tailed) dynamic noise to have a finite fourth moment. This offers a clear
path to quantifying long-term uncertainty. We also propose a primal-dual
constrained policy optimization method that optimizes the average performance
while ensuring limiting-risk constraints are satisfied. Our framework offers a
practical approach to long-term risk-sensitive control, backed by convergence
guarantees and validated through simulations.
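
To make the idea of a limiting variance of cumulative cost concrete, the following is a minimal simulation sketch and not the paper's algorithm. It assumes a scalar system x_{t+1} = a x_t + b u_t + w_t under a fixed linear policy u_t = -k x_t, with a quadratic stage cost q x_t^2 + r u_t^2. All names and parameter values (simulate_costs, batch_means_variance, gaussian_reference, a, b, k, q, r, dof) are hypothetical and chosen for illustration. The asymptotic variance of the normalized cumulative cost is estimated with the standard batch-means method, used here as a stand-in for the paper's risk functional. For unit-variance Gaussian noise it is compared with the closed form for an AR(1) process. The optional Student-t noise rejects dof <= 4, reflecting the finite-fourth-moment condition stated in the abstract.

```python
import math
import random
import statistics


def simulate_costs(a, b, k, q, r, n_steps, seed=0, dof=None):
    """Stage costs c_t = q*x_t^2 + r*u_t^2 for x_{t+1} = a*x_t + b*u_t + w_t
    under the linear policy u_t = -k*x_t (illustrative assumption).

    Noise is standard Gaussian when dof is None, otherwise Student-t with
    `dof` degrees of freedom (heavy-tailed; fourth moment finite iff dof > 4).
    """
    if abs(a - b * k) >= 1.0:
        raise ValueError("closed loop must be stable: |a - b*k| < 1")
    if dof is not None and dof <= 4:
        raise ValueError("Student-t noise needs dof > 4 for a finite fourth moment")
    rng = random.Random(seed)
    x, costs = 0.0, []
    for _ in range(n_steps):
        u = -k * x
        costs.append(q * x * x + r * u * u)
        w = rng.gauss(0.0, 1.0)
        if dof is not None:
            # t = z / sqrt(chi2/dof), with chi2(dof) = Gamma(shape=dof/2, scale=2)
            w /= math.sqrt(rng.gammavariate(dof / 2.0, 2.0) / dof)
        x = a * x + b * u + w
    return costs


def batch_means_variance(costs, n_batches):
    """Batch-means estimate of the asymptotic variance of
    (1/sqrt(T)) * sum_t (c_t - mean cost)."""
    m = len(costs) // n_batches  # batch length
    means = [sum(costs[i * m:(i + 1) * m]) / m for i in range(n_batches)]
    return m * statistics.variance(means)


def gaussian_reference(a, b, k, q, r):
    """Closed-form mean cost and asymptotic variance for unit Gaussian noise."""
    rho = a - b * k                    # closed-loop coefficient
    s2 = 1.0 / (1.0 - rho ** 2)        # stationary variance of x_t
    c = q + r * k ** 2                 # c_t = c * x_t^2 under u_t = -k*x_t
    mean_cost = c * s2
    # Cov(x_0^2, x_h^2) = 2 * s2^2 * rho^(2h) for a Gaussian AR(1) process
    asym_var = 2.0 * c ** 2 * s2 ** 2 * (1.0 + rho ** 2) / (1.0 - rho ** 2)
    return mean_cost, asym_var


costs = simulate_costs(a=0.9, b=1.0, k=0.4, q=1.0, r=1.0, n_steps=200_000)
avg_cost = sum(costs) / len(costs)
var_hat = batch_means_variance(costs, n_batches=200)
ref_mean, ref_var = gaussian_reference(0.9, 1.0, 0.4, 1.0, 1.0)
print("average cost:", avg_cost, "reference:", ref_mean)
print("asymptotic variance:", var_hat, "reference:", ref_var)
```

Passing dof=5, for example, switches to heavy-tailed noise that still has a finite fourth moment. The Gaussian reference no longer applies in that case, so only the simulated estimates are meaningful.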