Sequential decisionmaking with imprecise reward structure
C. White
The 23rd IEEE Conference on Decision and Control, December 1984
https://doi.org/10.1109/CDC.1984.272392
We examine a finite-stage, finite-state, finite-action dynamic program whose one-transition value function and terminal value function are affine in an imprecisely known parameter. This affine structure is assumed in order to model parameter imprecision in the problem's reward or preference structure. We assume that the parameter has no dynamics, that no new information about its value is received once the decision process begins, and that its imprecision is described by set inclusion. We seek the set of all parameter-independent strategies that are optimal for some value of the imprecisely known parameter, and we present a successive approximations procedure for solving this problem.
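
To make the problem statement concrete, the following is a minimal brute-force sketch and not the successive approximations procedure of the paper, whose details the abstract does not give. It assumes a scalar parameter `theta` restricted to the interval [0, 1], rewards that depend only on the current state and action, and a toy two-state problem. All names (`solve_for_theta`, `potentially_optimal_strategies`, `r0`, `r1`, `term0`, `term1`) are hypothetical. For each sampled `theta` it runs ordinary backward induction and collects the distinct optimal strategies.

```python
def solve_for_theta(theta, horizon, states, actions, trans, r0, r1, term0, term1):
    """Backward induction for one fixed value of the parameter theta."""
    # Assumed terminal value, affine in theta: term0[s] + theta * term1[s].
    value = {s: term0[s] + theta * term1[s] for s in states}
    policy = []
    for _ in range(horizon):
        new_value, decision = {}, {}
        for s in states:
            best_a, best_q = None, float("-inf")
            for a in actions:
                # Assumed one-transition reward, affine in theta.
                q = r0[s][a] + theta * r1[s][a]
                q += sum(p * value[s2] for s2, p in trans[s][a].items())
                if q > best_q:  # ties keep the earlier action
                    best_a, best_q = a, q
            new_value[s], decision[s] = best_q, best_a
        value = new_value
        # Stages are solved last to first, so prepend each decision rule.
        policy.insert(0, tuple(decision[s] for s in states))
    return tuple(policy)


def potentially_optimal_strategies(theta_grid, **problem):
    """Strategies that are optimal for at least one sampled theta."""
    return {solve_for_theta(theta, **problem) for theta in theta_grid}


# Toy problem (illustrative data only): each state is absorbing.
states, actions = [0, 1], ["safe", "risky"]
problem = dict(
    horizon=2,
    states=states,
    actions=actions,
    trans={s: {a: {s: 1.0} for a in actions} for s in states},
    r0={0: {"safe": 1.0, "risky": 0.0}, 1: {"safe": 1.0, "risky": 0.0}},
    r1={0: {"safe": 0.0, "risky": 2.0}, 1: {"safe": 0.0, "risky": 4.0}},
    term0={0: 0.0, 1: 0.0},
    term1={0: 0.0, 1: 0.0},
)
theta_grid = [i / 10 for i in range(11)]  # samples of the assumed set [0, 1]
strategies = potentially_optimal_strategies(theta_grid, **problem)
for strategy in sorted(strategies):
    print(strategy)
```

Each strategy is a tuple of decision rules, one per stage, and each decision rule lists the chosen action for every state. A finite grid can miss strategies that are optimal only on a narrow range of parameter values, and it records only one optimal strategy when several tie, so this sketch illustrates the target set rather than computing it exactly.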