Multi-agent learning via implicit opponent modeling
Ronald V. Bjarnason, T. Peterson
Proceedings of the 2002 Congress on Evolutionary Computation. CEC'02 (Cat. No.02TH8600)
Published: 2002-05-12
DOI: 10.1109/CEC.2002.1004470 (https://doi.org/10.1109/CEC.2002.1004470)
Citations: 7
Abstract
We present a learning algorithm for two-player stochastic games. The algorithm generates optimal deterministic finite automaton (DFA) strategies against opponents that can be modeled by probabilistic action automata. To eliminate state aliasing, it builds dynamic history trees based on statistical tests. Experiments are conducted in an iterated prisoner's dilemma environment.
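The abstract does not say which statistical test is used or how the history tree is grown, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes an iterated prisoner's dilemma with actions 'C' and 'D', and uses a Pearson chi-square test (an assumption) to decide whether conditioning on the learner's previous action changes the opponent's next-action distribution. If it does, a one-step history context is worth splitting, since lumping the two contexts together would alias distinct opponent states. All function names here are hypothetical.

```python
from collections import Counter

# 5% critical value of the chi-square distribution with 1 degree of freedom.
CHI2_CRIT_DF1 = 3.841


def chi_square_2x2(counts_a, counts_b):
    """Pearson chi-square statistic for two Counters over the actions 'C' and 'D'."""
    table = [[counts_a["C"], counts_a["D"]],
             [counts_b["C"], counts_b["D"]]]
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row)
    # With an empty row or column the test is undefined; treat it as "no evidence".
    if total == 0 or 0 in row or 0 in col:
        return 0.0
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def opponent_counts_by_context(rounds):
    """Group the opponent's action at round t+1 by the learner's action at round t.

    rounds is a list of (my_action, opponent_action) pairs.
    """
    contexts = {"C": Counter(), "D": Counter()}
    for (mine, _), (_, opp_next) in zip(rounds, rounds[1:]):
        contexts[mine][opp_next] += 1
    return contexts


def should_split(rounds):
    """Return True if the learner's last action significantly changes the opponent's reply."""
    contexts = opponent_counts_by_context(rounds)
    return chi_square_2x2(contexts["C"], contexts["D"]) > CHI2_CRIT_DF1
```

Against a tit-for-tat opponent, whose next move copies the learner's last move, the two contexts produce very different reply distributions and `should_split` returns True. Against an opponent that always cooperates, the contexts are indistinguishable and no split is made.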