Automatically Discovering Hierarchies in Multi-agent Reinforcement Learning
Xiaobei Cheng, Jing Shen, Haibo Liu, Guochang Gu, Guoyin Zhang
2008 International Conference on Internet Computing in Science and Engineering, 2008-01-28
DOI: 10.1109/ICICSE.2008.32 (https://doi.org/10.1109/ICICSE.2008.32)
Citation count: 0
Abstract
It is difficult to automatically discover hierarchies in multi-agent reinforcement learning. We consider an immune clustering approach for automatically discovering hierarchies in the option learning framework. The leading agent generates an undirected, edge-weighted topological graph of the environment's state transitions, based on the environment information explored by all agents. An immune clustering algorithm is then used to partition the state space. A second immune response algorithm is used to update the clusters when a new state is encountered later. Local strategies for reaching the different parts of the space are learned in a distributed manner and added to the model in the form of options.
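As a rough illustration of the graph-building step only, the sketch below merges the state sequences explored by several agents into one undirected, edge-weighted graph. The abstract does not say how edges are weighted, so counting observed transitions is an assumption, and the function names `build_transition_graph` and `assign_new_state` are hypothetical. The second function is a simple weight-based placeholder for handling a newly encountered state; it is not the immune clustering or immune response algorithm of the paper, whose details are not given here.

```python
from collections import defaultdict


def build_transition_graph(trajectories):
    """Merge state sequences from all agents into one undirected graph.

    Assumption: an edge weight is the number of times the transition
    between two states was observed, in either direction.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for states in trajectories:
        for s, t in zip(states, states[1:]):
            if s == t:
                continue  # self-loops carry no structural information
            graph[s][t] += 1
            graph[t][s] += 1
    return graph


def assign_new_state(graph, clusters, state):
    """Placeholder update rule (NOT the paper's immune response algorithm).

    Returns the label of the cluster to which `state` is most strongly
    connected by total edge weight, or None if it has no clustered neighbor.
    `clusters` maps already-partitioned states to cluster labels.
    """
    totals = defaultdict(int)
    for neighbor, weight in graph.get(state, {}).items():
        if neighbor in clusters:
            totals[clusters[neighbor]] += weight
    if not totals:
        return None
    return max(totals, key=totals.get)


# Two agents explore two separate rooms; a later trajectory reaches "door".
trajectories = [
    ["a1", "a2", "a3", "a2", "a1"],
    ["b1", "b2", "b1", "door"],
    ["a3", "door", "a3"],
]
graph = build_transition_graph(trajectories)
clusters = {"a1": 0, "a2": 0, "a3": 0, "b1": 1, "b2": 1}
door_cluster = assign_new_state(graph, clusters, "door")
```

In the option framework, states on the border between two clusters (such as `door` in this toy example) would be natural subgoals for the local strategies, though how the paper selects them is not stated in the abstract.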