On non-stationary policies and maximal invariant safe sets of controlled Markov chains

2004 43rd IEEE Conference on Decision and Control (CDC) (IEEE Cat. No.04CH37601). Pub Date: 2004-12-01. DOI: 10.1109/CDC.2004.1429313

Wei Wu, A. Arapostathis, Ratnesh Kumar


Citation count: 6

Abstract

This paper continues the study of safety control for Markov chains, a notion we introduced in our recent work. In that work we restricted our attention to Markov stationary controls and derived necessary and sufficient conditions for safety enforcement within this class of policies. In contrast to optimal control of Markov chains under complete observations, where optimality is normally achieved in the class of stationary policies, enforcement of safety can benefit from the consideration of non-stationary policies. In this work we show that, to meet the safety control objective, it suffices to consider a class of non-stationary policies induced by the stationary policies of an augmented chain. In addition, given a controlled Markov chain and a safety specification (describing bounds within which the probability distribution must always lie), we present an algorithm for computing the maximal set of safe initial distributions, that is, the initial distributions from which it is possible to control the chain so that the safety specification is always satisfied.
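To make the safety specification concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes a finite chain whose state distribution evolves as pi_{t+1} = pi_t P, where P is the transition matrix induced by a stationary randomized policy, and it assumes the specification is a pair of componentwise bounds `lower <= pi_t <= upper`. All names (`policy_matrix`, `step`, `is_safe_over_horizon`) and the two-state example are hypothetical. The check covers only a finite horizon, so passing it is a necessary condition for safety, not a proof of it.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def policy_matrix(kernels: Sequence[Matrix], policy: Matrix) -> Matrix:
    """Transition matrix induced by a stationary randomized policy.

    kernels[a][i][j] is the probability of moving i -> j under action a;
    policy[i][a] is the probability of choosing action a in state i.
    (Assumed representation, for illustration only.)
    """
    n = len(policy)
    return [
        [sum(policy[i][a] * kernels[a][i][j] for a in range(len(kernels)))
         for j in range(n)]
        for i in range(n)
    ]


def step(dist: Sequence[float], P: Matrix) -> List[float]:
    """One step of the distribution dynamics: returns dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def is_safe_over_horizon(dist: Sequence[float], P: Matrix,
                         lower: Sequence[float], upper: Sequence[float],
                         horizon: int, tol: float = 1e-12) -> bool:
    """Check lower <= pi_t <= upper for t = 0, ..., horizon."""
    current = list(dist)
    for _ in range(horizon + 1):
        for d, lo, hi in zip(current, lower, upper):
            if d < lo - tol or d > hi + tol:
                return False
        current = step(current, P)
    return True


# Hypothetical two-state, two-action example.
stay = [[1.0, 0.0], [0.0, 1.0]]   # action 0: remain in the current state
swap = [[0.0, 1.0], [1.0, 0.0]]   # action 1: jump to the other state
kernels = [stay, swap]

always_stay = [[1.0, 0.0], [1.0, 0.0]]
always_swap = [[0.0, 1.0], [0.0, 1.0]]

# Assumed specification: the probability of state 0 must never exceed 0.7.
lower, upper = [0.0, 0.0], [0.7, 1.0]
initial = [0.2, 0.8]

safe_if_staying = is_safe_over_horizon(
    initial, policy_matrix(kernels, always_stay), lower, upper, horizon=10)
safe_if_swapping = is_safe_over_horizon(
    initial, policy_matrix(kernels, always_swap), lower, upper, horizon=10)
```

In this toy example the same initial distribution satisfies the bounds at time 0 under both policies, but the swapping policy moves it to [0.8, 0.2] after one step and violates the bound, while the staying policy keeps it inside. Whether an initial distribution is safe therefore depends on whether some admissible policy keeps the whole trajectory within the bounds, which is the question the maximal safe set answers.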

