Cooperative Behavior of Agents That Model the Other and the Self in Noisy Iterated Prisoners' Dilemma Simulation
Takaki Makino, Kazuyuki Aihara
Proceedings. The 4nd International Conference on Development and Learning, 2005. Published 2005-07-19.
DOI: 10.1109/DEVLRN.2005.1490943 (https://doi.org/10.1109/DEVLRN.2005.1490943)
Abstract
We developed self-learning agents for a simulation study of mutual understanding between peer agents. The agents use various types of coplayer models and a reinforcement learning algorithm to learn to play a noisy iterated prisoners' dilemma game so that each agent's own pay-off is maximized. We measured the mutual-modeling ability of each type of agent in terms of its cooperative behavior when playing against another equivalent agent. Agents with a complex coplayer model, one that includes a model of the agent itself, showed higher cooperation than agents with only a simpler coplayer model. Moreover, in low-noise environments, the Level-M agent, which develops equivalent models of the self and the other, showed higher cooperation than the other types of agents. These results suggest the importance of "self-observation" in the design of communicative agents.
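To make the setting concrete, the following is a minimal sketch of a noisy iterated prisoners' dilemma and of cooperation rate as a measurement. It is not the authors' implementation. The payoff values, the noise model (each intended action is flipped with probability `noise`), and the tit-for-tat stand-in policy are all assumptions chosen for illustration; the abstract specifies none of them, and the paper's agents use learned coplayer models rather than a fixed rule.

```python
import random

# Assumed conventional payoffs (T=5, R=3, P=1, S=0); not taken from the paper.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def flip(action):
    """Return the opposite action."""
    return "D" if action == "C" else "C"


def noisy_round(intended_a, intended_b, noise, rng):
    """Play one round; each intended action is flipped with probability `noise`.

    This action-flip noise model is an assumption for illustration.
    """
    actual_a = flip(intended_a) if rng.random() < noise else intended_a
    actual_b = flip(intended_b) if rng.random() < noise else intended_b
    return actual_a, actual_b, PAYOFF[(actual_a, actual_b)]


def tit_for_tat(last_coplayer_action):
    """Hypothetical stand-in policy: cooperate first, then copy the coplayer."""
    return "C" if last_coplayer_action is None else last_coplayer_action


def cooperation_rate(rounds, noise, seed=0):
    """Fraction of cooperative actions when two equivalent agents play each other."""
    rng = random.Random(seed)
    last_a = last_b = None
    cooperative_actions = 0
    for _ in range(rounds):
        a, b, _payoffs = noisy_round(
            tit_for_tat(last_b), tit_for_tat(last_a), noise, rng
        )
        cooperative_actions += (a == "C") + (b == "C")
        last_a, last_b = a, b
    return cooperative_actions / (2 * rounds)
```

Calling `cooperation_rate(1000, 0.0)` gives full cooperation for this stand-in policy, while raising `noise` lets a single flipped action trigger retaliation. That sensitivity is the kind of effect a coplayer model would have to cope with.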