The exponential cost optimality for finite horizon semi-Markov decision processes
Authors: Haifeng Huo, Xian Wen
Journal: Kybernetika (JCR category: Computer Science, Cybernetics)
Publication date: 2022-08-17
DOI: 10.14736/kyb-2022-3-0301 (https://doi.org/10.14736/kyb-2022-3-0301)
Citation count: 1
Abstract
This paper considers an exponential cost optimality problem for finite horizon semi-Markov decision processes (SMDPs). The objective is to calculate an optimal policy with minimal exponential costs over the full set of policies in a finite horizon. First, under the standard regular and compact-continuity conditions, we establish the optimality equation, prove that the value function is the unique solution of the optimality equation, and prove the existence of an optimal policy by using the minimum nonnegative solution approach. Second, we establish a new value iteration algorithm to calculate both the value function and the ε-optimal policy. Finally, we give a computable machine maintenance system to illustrate the convergence of the algorithm.
About the journal:
Kybernetika is a bi-monthly international journal dedicated to the rapid publication of high-quality, peer-reviewed research articles in the fields covered by its title. The journal is published by Nakladatelství Academia, Centre of Administration and Operations of the Czech Academy of Sciences, for the Institute of Information Theory and Automation of the Czech Academy of Sciences.
Kybernetika traditionally publishes research results in the fields of Control Sciences, Information Sciences, Statistical Decision Making, Applied Probability Theory, Random Processes, Operations Research, Fuzziness and Uncertainty Theories, as well as in the topics closely related to the above fields.