On evolution of agent behavior under limited gaming time with reinforcement learning

IF 5.3 · Zone 1 (Mathematics) · Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS · Chaos Solitons & Fractals · Pub Date: 2025-02-25 · DOI: 10.1016/j.chaos.2025.116166

Dandan Li, Qiongzi Wu, Dun Han

Chaos Solitons & Fractals, Volume 194, Article 116166. Journal Article. https://www.sciencedirect.com/science/article/pii/S0960077925001791

Citations: 0

Abstract

Based on the prisoner’s dilemma (PD) evolutionary game model and a reinforcement learning framework, this paper studies how factors such as the temptation payoff and time allocation affect the evolution of agent behavior and strategy selection when gaming time is a limited resource, across three different agent relationship structures. The results show that cooperation between agents is promoted when agents have more gaming time resources and face lower temptation payoffs, or when they place greater emphasis on long-term rewards and avoid excessive behavioral adjustments. In addition, the total remaining gaming time between agents gradually increases as the game progresses, while the total gaming time between agents gradually decreases; both reach a steady state after a sufficiently large number of game rounds. A higher temptation payoff increases the total remaining gaming time and reduces the total gaming time between agents. Finally, the measure of heterogeneity in the distribution of gaming time between agents gradually increases over the course of the game. This is particularly evident when the temptation payoff is high: differences in gaming time allocation between agents grow, significantly increasing the heterogeneity of gaming time among agents in the system. This study provides theoretical support for understanding the evolution of agent behavior under limited gaming time resources, especially in dynamic cooperative and competitive game scenarios.
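
To make the ingredients named in the abstract concrete, the following is a minimal sketch of two tabular Q-learning agents playing a repeated prisoner's dilemma. It is not the paper's model: the weak-PD payoff matrix (R = 1, T = `temptation`, S = P = 0), the state definition (the opponent's previous action), the epsilon-greedy rule, and all function and parameter names are assumptions chosen for illustration. Gaming-time allocation and the three relationship structures are not modelled here. Reading the discount factor `gamma` as the emphasis on long-term rewards and the learning rate `alpha` as the size of behavioral adjustments is likewise an interpretation, not something the abstract states.

```python
import random

COOPERATE, DEFECT = 0, 1


def pd_payoff(own, other, temptation):
    """Payoff to the focal agent in an assumed weak PD: R=1, T=temptation, S=P=0."""
    if own == COOPERATE and other == COOPERATE:
        return 1.0
    if own == DEFECT and other == COOPERATE:
        return temptation
    return 0.0


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Standard tabular Q-learning update, applied in place to q."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def simulate(rounds=2000, temptation=1.2, alpha=0.1, gamma=0.9,
             epsilon=0.05, seed=0):
    """Return the fraction of cooperative actions over all rounds."""
    rng = random.Random(seed)
    # One Q-table per agent; rows are states (opponent's previous action),
    # columns are actions.
    q_tables = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    states = [COOPERATE, COOPERATE]
    cooperations = 0
    for _ in range(rounds):
        actions = []
        for i in range(2):
            if rng.random() < epsilon:  # explore
                actions.append(rng.randrange(2))
            else:  # exploit; ties are broken in favour of cooperation
                row = q_tables[i][states[i]]
                actions.append(COOPERATE if row[0] >= row[1] else DEFECT)
        for i in range(2):
            other = actions[1 - i]
            reward = pd_payoff(actions[i], other, temptation)
            # The next state of agent i is the action its opponent just took.
            q_update(q_tables[i], states[i], actions[i], reward, other,
                     alpha, gamma)
        states = [actions[1], actions[0]]
        cooperations += actions.count(COOPERATE)
    return cooperations / (2 * rounds)


if __name__ == "__main__":
    for b in (1.1, 1.5, 1.9):
        print(b, simulate(temptation=b))
```

Running the script prints the cooperation fraction for a few temptation values under these assumptions. Whether the trend matches the one reported in the abstract depends on the modelling choices above, so the output should not be read as a reproduction of the paper's results.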

Source journal

Chaos Solitons & Fractals (Physics; Mathematics, Interdisciplinary Applications)

CiteScore: 13.20

Self-citation rate: 10.30%

Articles published: 1087

Review time: 9 months

Journal description: Chaos, Solitons & Fractals strives to establish itself as a premier journal in the interdisciplinary realm of Nonlinear Science, Non-equilibrium, and Complex Phenomena. It welcomes submissions covering a broad spectrum of topics within this field, including dynamics, non-equilibrium processes in physics, chemistry, and geophysics, complex matter and networks, mathematical models, computational biology, applications to quantum and mesoscopic phenomena, fluctuations and random processes, self-organization, and social phenomena.