Guozhong Zheng, Jiqiang Zhang, Shengfeng Deng, Weiran Cai, Li Chen
{"title":"公共产品博弈中的合作演变与 Q 学习","authors":"Guozhong Zheng, Jiqiang Zhang, Shengfeng Deng, Weiran Cai, Li Chen","doi":"arxiv-2407.19851","DOIUrl":null,"url":null,"abstract":"Recent paradigm shifts from imitation learning to reinforcement learning (RL)\nis shown to be productive in understanding human behaviors. In the RL paradigm,\nindividuals search for optimal strategies through interaction with the\nenvironment to make decisions. This implies that gathering, processing, and\nutilizing information from their surroundings are crucial. However, existing\nstudies typically study pairwise games such as the prisoners' dilemma and\nemploy a self-regarding setup, where individuals play against one opponent\nbased solely on their own strategies, neglecting the environmental information.\nIn this work, we investigate the evolution of cooperation with the multiplayer\ngame -- the public goods game using the Q-learning algorithm by leveraging the\nenvironmental information. Specifically, the decision-making of players is\nbased upon the cooperation information in their neighborhood. Our results show\nthat cooperation is more likely to emerge compared to the case of imitation\nlearning by using Fermi rule. Of particular interest is the observation of an\nanomalous non-monotonic dependence which is revealed when voluntary\nparticipation is further introduced. The analysis of the Q-table explains the\nmechanisms behind the cooperation evolution. 
Our findings indicate the\nfundamental role of environment information in the RL paradigm to understand\nthe evolution of cooperation, and human behaviors in general.","PeriodicalId":501044,"journal":{"name":"arXiv - QuanBio - Populations and Evolution","volume":"20 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Evolution of cooperation in the public goods game with Q-learning\",\"authors\":\"Guozhong Zheng, Jiqiang Zhang, Shengfeng Deng, Weiran Cai, Li Chen\",\"doi\":\"arxiv-2407.19851\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent paradigm shifts from imitation learning to reinforcement learning (RL)\\nis shown to be productive in understanding human behaviors. In the RL paradigm,\\nindividuals search for optimal strategies through interaction with the\\nenvironment to make decisions. This implies that gathering, processing, and\\nutilizing information from their surroundings are crucial. However, existing\\nstudies typically study pairwise games such as the prisoners' dilemma and\\nemploy a self-regarding setup, where individuals play against one opponent\\nbased solely on their own strategies, neglecting the environmental information.\\nIn this work, we investigate the evolution of cooperation with the multiplayer\\ngame -- the public goods game using the Q-learning algorithm by leveraging the\\nenvironmental information. Specifically, the decision-making of players is\\nbased upon the cooperation information in their neighborhood. Our results show\\nthat cooperation is more likely to emerge compared to the case of imitation\\nlearning by using Fermi rule. Of particular interest is the observation of an\\nanomalous non-monotonic dependence which is revealed when voluntary\\nparticipation is further introduced. 
The analysis of the Q-table explains the\\nmechanisms behind the cooperation evolution. Our findings indicate the\\nfundamental role of environment information in the RL paradigm to understand\\nthe evolution of cooperation, and human behaviors in general.\",\"PeriodicalId\":501044,\"journal\":{\"name\":\"arXiv - QuanBio - Populations and Evolution\",\"volume\":\"20 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-07-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - QuanBio - Populations and Evolution\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2407.19851\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - QuanBio - Populations and Evolution","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2407.19851","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Evolution of cooperation in the public goods game with Q-learning
The recent paradigm shift from imitation learning to reinforcement learning (RL)
has proven productive for understanding human behavior. In the RL paradigm,
individuals search for optimal strategies by interacting with their environment,
which makes gathering, processing, and utilizing information from their
surroundings crucial. Existing studies, however, typically consider pairwise
games such as the prisoner's dilemma and employ a self-regarding setup, in which
individuals play against a single opponent based solely on their own strategies,
neglecting environmental information. In this work, we investigate the evolution
of cooperation in a multiplayer game -- the public goods game -- using the
Q-learning algorithm and leveraging environmental information: each player's
decision-making is based on the cooperation information in its neighborhood. Our
results show that cooperation is more likely to emerge than under imitation
learning with the Fermi rule. Of particular interest is an anomalous
non-monotonic dependence that appears when voluntary participation is further
introduced. Analysis of the Q-tables explains the mechanisms behind the
evolution of cooperation. Our findings indicate the fundamental role of
environmental information in the RL paradigm for understanding the evolution of
cooperation, and human behavior in general.
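The setup the abstract describes -- Q-learning agents playing a public goods game, with the number of cooperating neighbors serving as each agent's state -- can be sketched in a minimal toy simulation. This is not the paper's implementation: the topology (a ring rather than a lattice), the synergy factor `r`, and the learning parameters `alpha`, `gamma`, `eps` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

L = 50                               # agents on a ring (assumed topology)
r = 3.0                              # synergy factor of the public goods game
alpha, gamma, eps = 0.1, 0.9, 0.05   # learning rate, discount, exploration

# State = number of cooperating neighbors (0, 1, or 2 on a ring).
# Actions: 0 = defect, 1 = cooperate. One Q-table per agent.
Q = np.zeros((L, 3, 2))
actions = rng.integers(0, 2, size=L)

def payoff(acts, i):
    """Public goods payoff for agent i in the group centred on i:
    contributions are multiplied by r, shared equally, and cooperators
    pay a unit cost."""
    group = [(i - 1) % L, i, (i + 1) % L]
    pool = r * sum(acts[j] for j in group) / len(group)
    return pool - acts[i]

def neighbor_states(acts):
    """Each agent's state: how many of its two neighbors cooperate."""
    return np.array([acts[(i - 1) % L] + acts[(i + 1) % L] for i in range(L)])

for t in range(2000):
    states = neighbor_states(actions)
    # Epsilon-greedy action choice from each agent's own Q-table.
    greedy = Q[np.arange(L), states].argmax(axis=1)
    explore = rng.random(L) < eps
    new_actions = np.where(explore, rng.integers(0, 2, size=L), greedy)
    rewards = np.array([payoff(new_actions, i) for i in range(L)])
    next_states = neighbor_states(new_actions)
    # Standard Q-learning update for the (state, action) pair just played.
    best_next = Q[np.arange(L), next_states].max(axis=1)
    idx = (np.arange(L), states, new_actions)
    Q[idx] += alpha * (rewards + gamma * best_next - Q[idx])
    actions = new_actions

print("final fraction of cooperators:", actions.mean())
```

The contrast with imitation learning is that here each agent updates an internal Q-table from its own rewards, conditioned on neighborhood cooperation, rather than copying a neighbor's strategy with a Fermi-rule probability.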