Periodic update rule with Q-learning promotes evolution of cooperation in game transition with punishment mechanism

IF 6.5 | Zone 2, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Neurocomputing | Pub Date: 2024-12-07 | Epub Date: 2024-08-31 | DOI: 10.1016/j.neucom.2024.128510

Zeyuan Yan, Li Li, Jun Shang, Hui Zhao

Neurocomputing, Volume 609, Article 128510. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0925231224012815

Citations: 0

Abstract




Cooperative behavior plays a critical role in resolving conflicts between collective and individual interests, while punishment serves as a robust deterrent against opportunistic free-riding. Evolutionary game theory (EGT) is a key paradigm for studying this problem, and reinforcement learning (RL) methods are well suited to capturing agents' introspective, cognitive processes. However, previous research has often focused on a static, time-invariant update rule, neglecting the dynamic nature of real-world scenarios in which individuals can flexibly switch between strategies in periodic, time-dependent patterns. Here, we propose periodic update rules with the Q-learning algorithm and a game transition model with a punishment mechanism that lets cooperative agents decide for themselves whether to initiate punishment. The agents exhibit dynamic rules periodically through game model transitions, which preserves EGT's inherent adaptability. Using Monte Carlo (MC) simulations, we analyze the emergence of cooperation and find that the proposed periodic update rules with Q-learning, combined with game transitions in the presence of punishment, substantially enhance cooperative behavior. Our study highlights appropriate periodic intervals for updating rules and optimal punishment costs in the game transition model as critical factors in fostering the evolution of cooperation in real-world scenarios.
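The abstract does not give the paper's payoffs, state definition, or switching schedule, so the following is only a minimal sketch of the two ingredients it names: a standard tabular Q-learning step and a periodic choice of update rule. The alternate rule ("imitation"), the use of an agent's previous action as its state, and every function and parameter name here are assumptions made for illustration, not the authors' method.

```python
import random

ACTIONS = ("C", "D")  # cooperate, defect (illustrative action set)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning step, applied in place to the dict `q`.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the highest-Q action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def active_rule(step, period):
    """Hypothetical schedule: alternate between two rules every `period` steps."""
    return "q_learning" if (step // period) % 2 == 0 else "imitation"


def new_q_table():
    """Q-table keyed by (state, action); state is assumed to be the last action."""
    return {(s, a): 0.0 for s in ACTIONS for a in ACTIONS}
```

In a Monte Carlo simulation along these lines, each agent would call `active_rule(step, period)` at every step and apply `q_update` only while the Q-learning rule is active. Varying `period` would then be one way to probe how the update interval affects cooperation.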


Source journal

Neurocomputing (Engineering & Technology; Computer Science: Artificial Intelligence)

CiteScore: 13.10
Self-citation rate: 10.00%
Articles published: 1382
Review time: 70 days

About the journal: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. The essential topics covered are neurocomputing theory, practice, and applications.