A reinforcement learning-based strategy updating model for the cooperative evolution

Author

Wang, Xianjia
Yang, Zhipeng
Liu, Yanli
Chen, Guici

Abstract

The emergence of cooperation between competing agents is commonly studied through evolutionary games, but such cooperation often requires a mechanism or a third party to initiate and sustain it. To investigate how such a mechanism affects the evolution of cooperation, this paper proposes a reinforcement learning-based strategy updating model. The model consists of two symmetrical sets of convolutional neural networks. The agents’ strategy updating rule is defined in three steps: first, agents learn and predict the environment and the behaviors of neighboring agents; then they estimate their future payoffs from this information; finally, they determine their strategies from these estimated payoffs. The paper investigates the behavioral characteristics and stable states of a network of highly intelligent agents with memory learning and prediction abilities in the evolution of the prisoner’s dilemma game. The results demonstrate that game initiators who adopt the mixed optimal payoff approach can increase the number of cooperators and facilitate “global cooperation” and “repaying kindness with kindness”. Although the temptation factor has little effect on the population, increasing the discount factor can expand the scale of the cooperative cluster and even achieve dynamic stability. Additionally, a smaller minibatch size benefits the evolution of cooperation when the experience replay pool is small, whereas a larger minibatch size is more conducive to cooperation as the capacity of the experience replay pool increases. This research provides a novel perspective, grounded in reinforcement learning, on the evolution of cooperation.
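
The quantities varied in the abstract (temptation factor, discount factor, minibatch size, and experience replay pool capacity) can be made concrete with a minimal Python sketch. This is not the authors' model: the convolutional networks are omitted, the payoff values follow the weak prisoner's dilemma form often used in spatial games (an assumption, since the abstract does not give the payoff matrix), and the names `payoff`, `discounted_payoff`, and `ReplayPool` are hypothetical.

```python
import random
from collections import deque

C, D = 0, 1  # cooperate, defect


def payoff(my_move, other_move, b=1.2):
    """Payoff to the focal agent in an ASSUMED weak prisoner's dilemma.

    Mutual cooperation pays 1, defecting against a cooperator pays the
    temptation factor b, and every other outcome pays 0.
    """
    if other_move == D:
        return 0.0
    return 1.0 if my_move == C else b


def discounted_payoff(payoffs, gamma=0.9):
    """Sum of gamma**t * payoff_t over a sequence of (predicted) payoffs."""
    return sum((gamma ** t) * p for t, p in enumerate(payoffs))


class ReplayPool:
    """Fixed-capacity experience replay pool; the oldest entries are evicted first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, experience):
        self.buffer.append(experience)

    def sample(self, minibatch_size):
        # Draw without replacement; never ask for more items than are stored.
        k = min(minibatch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)


if __name__ == "__main__":
    # A larger discount factor gives more weight to later payoffs.
    stream = [payoff(C, C)] * 5
    print(discounted_payoff(stream, gamma=0.5), discounted_payoff(stream, gamma=0.9))

    pool = ReplayPool(capacity=3)
    for step in range(5):
        pool.store(step)
    print(sorted(pool.sample(minibatch_size=2)))
```

In this sketch the minibatch size and the pool capacity are independent arguments, which is what allows their interaction, as reported in the abstract, to be explored by varying one while holding the other fixed.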

Suggested Citation

Wang, Xianjia & Yang, Zhipeng & Liu, Yanli & Chen, Guici, 2023. "A reinforcement learning-based strategy updating model for the cooperative evolution," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 618(C).

Handle: RePEc:eee:phsmap:v:618:y:2023:i:c:s0378437123002546
DOI: 10.1016/j.physa.2023.128699

More about this item

Keywords

Reinforcement learning; Evolutionary game; Cooperation; Prisoner’s dilemma game;
