{"title":"A multi-agent cooperation system based on a Layered Cooperation Model","authors":"Kao-Shing Hwang, Jin-Ling Lin, Hsuan-Pei Hsu","doi":"10.1109/ICSSE.2014.6887923","DOIUrl":null,"url":null,"abstract":"This paper proposes a reinforcement learning model for multi-agent cooperation based on agents' cooperation tendency. An agent learns rules of cooperation according to these recorded cooperation probability in a Layered Cooperation Model (LCM). In the LCM, a candidate policy engine is first used to filter out candidate action sets, which consider payoff is given for coalition. Then, agents use Nash Bargaining Solution (NBS) to generate candidate policies for themselves from these candidate action sets during the learning. The proposed approach could work for both transferable utility and non-transferable utility cooperation problem. From the simulation results, the proposed method shows its learning efficiency outperforms Win or Learning Fast Policy Hill-Climbing (WoLF-PHC) and Nash Bargaining Solution (NBS).","PeriodicalId":166215,"journal":{"name":"2014 IEEE International Conference on System Science and Engineering (ICSSE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE International Conference on System Science and Engineering (ICSSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSSE.2014.6887923","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4
Abstract
This paper proposes a reinforcement learning model for multi-agent cooperation based on agents' cooperation tendencies. An agent learns rules of cooperation from the cooperation probabilities recorded in a Layered Cooperation Model (LCM). In the LCM, a candidate policy engine first filters out candidate action sets, taking into account the payoff given to the coalition. Then, during learning, agents apply the Nash Bargaining Solution (NBS) to these candidate action sets to generate candidate policies for themselves. The proposed approach works for both transferable-utility and non-transferable-utility cooperation problems. Simulation results show that the proposed method's learning efficiency outperforms that of Win or Learn Fast Policy Hill-Climbing (WoLF-PHC) and the Nash Bargaining Solution (NBS).
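The abstract does not spell out how the NBS step selects a joint action, so the following is only a minimal sketch of the standard Nash Bargaining Solution applied to a discrete set of candidate joint actions: each agent has a disagreement payoff, and the joint action maximizing the product of the agents' gains over their disagreement points is chosen. The function name `nash_bargaining_select` and the toy payoffs are hypothetical, not from the paper.

```python
def nash_bargaining_select(candidate_actions, payoff, disagreement):
    """Pick the joint action maximizing the Nash product (illustrative only).

    candidate_actions: iterable of joint actions (one component per agent)
    payoff: callable mapping a joint action to a tuple of per-agent utilities
    disagreement: per-agent utilities if no agreement is reached
    """
    best_action, best_product = None, float("-inf")
    for action in candidate_actions:
        utils = payoff(action)
        # NBS requires a strict gain over the disagreement point for every agent
        gains = [u - d for u, d in zip(utils, disagreement)]
        if any(g <= 0 for g in gains):
            continue  # infeasible: some agent does no better than disagreeing
        nash_product = 1.0
        for g in gains:
            nash_product *= g
        if nash_product > best_product:
            best_action, best_product = action, nash_product
    return best_action

# Toy example: two agents choosing between two candidate joint actions
candidates = [("cooperate", "cooperate"), ("defect", "cooperate")]
payoffs = {("cooperate", "cooperate"): (3.0, 3.0),
           ("defect", "cooperate"): (5.0, 1.0)}
print(nash_bargaining_select(candidates, payoffs.get, disagreement=(1.0, 1.0)))
# -> ('cooperate', 'cooperate'): Nash product (3-1)(3-1)=4 beats the
#    infeasible (5-1)(1-1)=0 of the asymmetric action
```

In the paper's LCM, this selection would operate only on the action sets already filtered by the candidate policy engine, rather than on the full joint-action space.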