Epsilon-optimal discretized pursuit learning automata
B. Oommen, J. Lanctôt
Conference proceedings. IEEE International Conference on Systems, Man, and Cybernetics, pp. 6-12, vol. 1. Published 1989-11-14.
DOI: https://doi.org/10.1109/ICSMC.1989.71244

The authors consider the problem of a stochastic learning automaton interacting with an unknown random environment. The fundamental problem is that of learning, through interaction, the best action (that is, the action which is rewarded optimally) allowed by the environment. By using running estimates of reward probabilities to learn the optimal action, M.A.L. Thathachar et al. (1986, 1989) obtained an extremely efficient pursuit algorithm, which is presently among the fastest algorithms known. In the present work, the authors investigate the improvements gained by rendering the pursuit algorithm discrete. This is done by restricting the probability of selecting an action to a finite and, hence, discrete subset of
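
The idea described in the abstract, running reward estimates combined with action probabilities confined to a finite grid, can be illustrated with a minimal Python sketch. This is not taken from the paper: the class name `DiscretizedPursuitAutomaton`, the `resolution` parameter, the step size of 1/(number of actions × resolution), the reward-inaction update, and the choice to treat untried actions as having estimate 0 are all assumptions made for illustration.

```python
import random


class DiscretizedPursuitAutomaton:
    """Illustrative sketch only; names and update details are assumptions."""

    def __init__(self, num_actions, resolution, rng=None):
        self.r = num_actions
        # Probabilities are stored as integer "units" so they always lie on
        # the grid {0, 1/total, 2/total, ..., 1} with no floating-point drift.
        self.total = num_actions * resolution
        self.units = [resolution] * num_actions  # uniform start
        self.rewards = [0] * num_actions         # rewards seen per action
        self.pulls = [0] * num_actions           # times each action was tried
        self.rng = rng or random.Random()

    def probabilities(self):
        return [u / self.total for u in self.units]

    def choose(self):
        # Sample an action according to the current action probabilities.
        return self.rng.choices(range(self.r), weights=self.units)[0]

    def update(self, action, rewarded):
        # The running estimates are updated on every interaction.
        self.pulls[action] += 1
        if rewarded:
            self.rewards[action] += 1
        else:
            return  # assumed reward-inaction: no probability change on penalty

        # Assumption: an untried action has estimate 0.
        estimates = [
            self.rewards[i] / self.pulls[i] if self.pulls[i] else 0.0
            for i in range(self.r)
        ]
        best = max(range(self.r), key=estimates.__getitem__)

        # Pursue the currently best-estimated action: every other action
        # loses one grid step (never below 0), and the best takes the rest.
        for j in range(self.r):
            if j != best and self.units[j] > 0:
                self.units[j] -= 1
        others = sum(self.units[j] for j in range(self.r) if j != best)
        self.units[best] = self.total - others


def simulate(reward_probs, resolution, steps, seed=0):
    """Run the automaton against a hypothetical stationary environment."""
    rng = random.Random(seed)
    automaton = DiscretizedPursuitAutomaton(len(reward_probs), resolution, rng)
    for _ in range(steps):
        action = automaton.choose()
        automaton.update(action, rng.random() < reward_probs[action])
    return automaton
```

With two actions and `resolution=2`, the grid is {0, 0.25, 0.5, 0.75, 1}; a single rewarded pull of action 0 moves the probabilities from [0.5, 0.5] to [0.75, 0.25], and a penalty leaves them unchanged. A larger `resolution` gives a finer grid and slower, more cautious movement toward the estimated best action.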