Extrapolation of an Optimal Policy using Statistical Probabilistic Model Checking
A. Rataj, B. Wozna
International Workshop on Concurrency, Specification and Programming
DOI: 10.3233/FI-2018-1637
Citations: 1
Abstract
We show how to extrapolate an optimal policy controlling a model that is itself too large for the policy to be found directly using probabilistic model checking (PMC). In particular, we use PMC to find a globally optimal resolution of non-determinism in each of several small Markov Decision Processes (MDPs). From these resolutions we derive a corresponding set of decision boundaries representing the optimal policies found. A hypothesis is then formed by extrapolating these boundaries to an equivalent boundary in a large MDP. The resulting hypothetical decision boundary is approximately verified by statistical means, to check whether it indeed represents an optimal policy for the large MDP; the verification either weakens or strengthens the hypothesis. The criterion of optimality of the policy can be expressed in any modal logic that includes the probabilistic operator P∼p[·] and for which a PMC method exists.
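The pipeline the abstract outlines can be sketched end to end. The following is a minimal illustration, not the paper's method: the toy family of gambler-style MDPs, the stakes, and the win probabilities are all assumptions chosen for the sketch; exact value iteration stands in for PMC on the small instances, a least-squares line stands in for the extrapolation hypothesis, and Monte Carlo simulation stands in for the statistical verification step on the large MDP.

```python
import random

# Hypothetical toy family of MDPs, indexed by size n: states 0..n, where
# state 0 is failure and state n is the goal.  In each non-absorbing state
# the controller picks a stake: action 0 bets 1 unit (win prob 0.55),
# action 1 bets 2 units (win prob 0.52).  Objective: maximise P(reach n).
ACTIONS = [(1, 0.55), (2, 0.52)]  # (stake, win probability) -- illustrative

def solve_small(n, sweeps=20000, tol=1e-12):
    """Exact value iteration -- stands in for PMC of a small MDP.
    Returns (values, optimal_policy)."""
    V = [0.0] * (n + 1)
    V[n] = 1.0
    pi = [0] * (n + 1)
    for _ in range(sweeps):
        delta = 0.0
        for s in range(1, n):
            qs = [p * V[min(n, s + st)] + (1 - p) * V[max(0, s - st)]
                  for st, p in ACTIONS]
            best = max(range(len(ACTIONS)), key=lambda a: qs[a])
            delta = max(delta, abs(qs[best] - V[s]))
            V[s], pi[s] = qs[best], best
        if delta < tol:
            break
    return V, pi

def boundary(pi, n):
    """Decision boundary of a policy: first state whose optimal action is
    the risky stake (n if the risky stake is never optimal)."""
    for s in range(1, n):
        if pi[s] == 1:
            return s
    return n

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b: the extrapolation hypothesis
    that the boundary moves linearly with the model size."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx

def threshold_policy(thresh):
    """Risky stake at and above the (extrapolated) boundary, safe below."""
    return lambda s: 1 if s >= thresh else 0

def simulate(n, policy, start, episodes, rng):
    """Monte Carlo estimate of P(reach n) -- stands in for statistical
    verification of the extrapolated policy on the large MDP."""
    wins = 0
    for _ in range(episodes):
        s = start
        while 0 < s < n:
            st, p = ACTIONS[policy(s)]
            s = min(n, s + st) if rng.random() < p else max(0, s - st)
        wins += (s == n)
    return wins / episodes

def evaluate(n, policy, sweeps=20000, tol=1e-12):
    """Exact policy evaluation, used only to sanity-check the simulation."""
    V = [0.0] * (n + 1)
    V[n] = 1.0
    for _ in range(sweeps):
        delta = 0.0
        for s in range(1, n):
            st, p = ACTIONS[policy(s)]
            v = p * V[min(n, s + st)] + (1 - p) * V[max(0, s - st)]
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            break
    return V

# 1. Optimal policies, hence decision boundaries, of several small MDPs.
sizes = [10, 14, 18, 22]
bounds = [boundary(solve_small(n)[1], n) for n in sizes]
# 2. Hypothesis: the boundary extrapolates linearly in the model size.
a, b = fit_line(sizes, bounds)
# 3. Apply the hypothesis to a large MDP and verify it statistically.
N = 60
thresh = max(1, min(N, round(a * N + b)))
rng = random.Random(0)
start = N // 2
est = simulate(N, threshold_policy(thresh), start, 5000, rng)
exact = evaluate(N, threshold_policy(thresh))[start]
print(f"extrapolated boundary={thresh}, MC estimate={est:.3f}, exact={exact:.3f}")
```

In the paper's setting the P∼p[·] property would be checked by a probabilistic model checker rather than by value iteration, and the statistical step would carry a confidence level; here the Monte Carlo estimate is simply compared against exact policy evaluation of the extrapolated threshold policy.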