{"title":"Developing multi-agent adversarial environment using reinforcement learning and imitation learning","authors":"Ziyao Han, Yupeng Liang, Kazuhiro Ohkura","doi":"10.1007/s10015-023-00912-9","DOIUrl":null,"url":null,"abstract":"<div><p>A multi-agent system is a collection of autonomous, interacting agents that share a common environment. These entities observe their environment using sensors and interact with the environment. A multi-agent system that develops cooperative strategies by reinforcement learning does not perform well, mostly because of the sparse reward problem. This study conducts a 3D environment in which robots play the beach volleyball game. This study combines imitation learning (IL) with reinforcement learning (RL) to solve the sparse reward problem. The results show that the proposed approach gets a higher score in the Elo rating system and robots perform better than the conventional RL approach.</p></div>","PeriodicalId":0,"journal":{"name":"","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.1007/s10015-023-00912-9","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
A multi-agent system is a collection of autonomous, interacting agents that share a common environment. These entities observe their environment using sensors and interact with the environment. A multi-agent system that develops cooperative strategies by reinforcement learning does not perform well, mostly because of the sparse reward problem. This study conducts a 3D environment in which robots play the beach volleyball game. This study combines imitation learning (IL) with reinforcement learning (RL) to solve the sparse reward problem. The results show that the proposed approach gets a higher score in the Elo rating system and robots perform better than the conventional RL approach.