On Strategic Measures and Optimality Properties in Discrete-Time Stochastic Control with Universally Measurable Policies
Author: Huizhen Yu
Journal: Mathematics of Operations Research, Volume 48, Issue 3 (Impact Factor 1.4; JCR Q2, Mathematics, Applied)
Publication date: 2023-10-24
DOI: 10.1287/moor.2022.0188 (https://doi.org/10.1287/moor.2022.0188)
Abstract
This paper concerns discrete-time infinite-horizon stochastic control systems with Borel state and action spaces and universally measurable policies. We study optimization problems on strategic measures induced by the policies in these systems. The results are then applied to risk-neutral and risk-sensitive Markov decision processes to establish the measurability of the optimal value functions and the existence of universally measurable, randomized or nonrandomized, ϵ-optimal policies, for a variety of average cost criteria and risk criteria. We also extend our analysis to a class of minimax control problems and establish similar optimality results under the axiom of analytic determinacy.
Funding: This work was supported by grants from DeepMind, the Alberta Machine Intelligence Institute (AMII), and Alberta Innovates-Technology Futures (AITF).
About the journal:
Mathematics of Operations Research is an international journal of the Institute for Operations Research and the Management Sciences (INFORMS). The journal invites articles concerned with the mathematical and computational foundations in the areas of continuous, discrete, and stochastic optimization; mathematical programming; dynamic programming; stochastic processes; stochastic models; simulation methodology; control and adaptation; networks; game theory; and decision theory. Also sought are contributions to learning theory and machine learning that have special relevance to decision making, operations research, and management science. The emphasis is on originality, quality, and importance; correctness alone is not sufficient. Significant developments in operations research and management science not having substantial mathematical interest should be directed to other journals such as Management Science or Operations Research.