Author: A. Arapostathis
Journal: Stochastic Systems (JCR Q1, Mathematics)
Publication date: 2019-09-17
DOI: 10.1287/stsy.2019.0040 (https://doi.org/10.1287/stsy.2019.0040)
Open Problem—Convergence and Asymptotic Optimality of the Relative Value Iteration in Ergodic Control
The relative value iteration (RVI) scheme for Markov decision processes (MDPs) dates back to the seminal work of White (1963), which introduced an algorithm for solving the ergodic dynamic programming e...
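For orientation, here is a minimal sketch of relative value iteration for a finite-state, finite-action MDP with one-step costs. It is an illustration only, not the formulation of the paper: the function name, the data layout (P[a][x][y], c[x][a]), the choice of reference state, and the stopping rule are all assumptions of this sketch, and convergence is assumed (for example, when every stationary policy induces an aperiodic unichain).

```python
def relative_value_iteration(P, c, ref_state=0, tol=1e-10, max_iter=10_000):
    """Illustrative RVI for a finite MDP with average-cost criterion.

    P[a][x][y] : probability of moving from state x to y under action a
    c[x][a]    : one-step cost of taking action a in state x
    Returns (gain estimate, relative value function h, greedy policy).
    """
    n_states = len(c)
    n_actions = len(c[0])
    h = [0.0] * n_states
    for _ in range(max_iter):
        # State-action values under the current relative value function.
        q = [[c[x][a] + sum(P[a][x][y] * h[y] for y in range(n_states))
              for a in range(n_actions)]
             for x in range(n_states)]
        # Bellman (minimization) operator applied to h.
        Th = [min(row) for row in q]
        # Subtract the value at the reference state so the iterates stay bounded.
        gain = Th[ref_state]
        h_new = [v - gain for v in Th]
        if max(abs(u - v) for u, v in zip(h_new, h)) < tol:
            policy = [row.index(min(row)) for row in q]
            return gain, h_new, policy
        h = h_new
    raise RuntimeError("RVI did not converge within max_iter iterations")


# Hypothetical two-state, two-action example (numbers chosen for illustration).
P = [
    [[0.5, 0.5], [0.5, 0.5]],  # action 0
    [[0.9, 0.1], [0.9, 0.1]],  # action 1
]
c = [
    [0.0, 1.0],  # costs in state 0 for actions 0 and 1
    [2.0, 3.0],  # costs in state 1 for actions 0 and 1
]
gain, h, policy = relative_value_iteration(P, c)
print(gain, h, policy)
```

In this toy example the subtraction of the reference-state value is what distinguishes the scheme from plain value iteration: the iterates h remain bounded, while the subtracted quantity serves as the estimate of the optimal average cost.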