EASpace: Enhanced Action Space for Policy Transfer
Zheng Zhang, Qingrui Zhang, Bo Zhu, Xiaohan Wang, Tianjiang Hu
IEEE Transactions on Neural Networks and Learning Systems, published 2023-11-07
DOI: 10.1109/TNNLS.2023.3322591 (https://doi.org/10.1109/TNNLS.2023.3322591)
Abstract
Formulating expert policies as macro actions promises to alleviate the long-horizon issue via structured exploration and efficient credit assignment. However, traditional option-based multipolicy transfer methods suffer from inefficient exploration of macro actions' lengths and insufficient exploitation of useful long-duration macro actions. In this article, a novel algorithm named enhanced action space (EASpace) is proposed, which formulates macro actions in an alternative form to accelerate the learning process using multiple available suboptimal expert policies. Specifically, EASpace formulates each expert policy as multiple macro actions with different execution times. All the macro actions are then integrated directly into the primitive action space. An intrinsic reward, proportional to the execution time of macro actions, is introduced to encourage the exploitation of useful macro actions. A learning rule similar to intra-option Q-learning is employed to improve data efficiency. Theoretical analysis shows the convergence of the proposed learning rule. The efficiency of EASpace is illustrated by a grid-based game and a multiagent pursuit problem. The proposed algorithm is also implemented in physical systems to validate its effectiveness.
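The full article is not reproduced here, so the following is only a minimal Python sketch of the idea as described in the abstract: each expert policy is expanded into several macro actions with different execution times, these macro actions are appended to the primitive action space, and an intrinsic reward proportional to a macro's execution time encourages exploiting useful long macros. All names (build_enhanced_actions, DURATIONS, ETA), the toy environment interface (reset/step), and the SMDP-style Q-update are assumptions made for illustration; the article's intra-option-style learning rule and its convergence analysis are not reproduced.

```python
# Illustrative sketch based only on the abstract of EASpace; not the authors' code.
# Assumed environment interface: env.reset() -> state, env.step(state, action) -> (state, reward, done).
import random
from collections import defaultdict

N_PRIMITIVE = 4            # toy number of primitive actions (assumption)
DURATIONS = (2, 4, 8)      # candidate macro execution times (assumption)
ETA = 0.01                 # intrinsic-reward coefficient per executed step (assumption)
ALPHA, GAMMA, EPS = 0.1, 0.99, 0.1


def build_enhanced_actions(n_experts):
    """Enhanced action space: primitive actions plus one macro action per
    (expert policy, execution time) pair, all handled by a single Q-table."""
    actions = [("prim", a, 1) for a in range(N_PRIMITIVE)]
    actions += [("macro", j, d) for j in range(n_experts) for d in DURATIONS]
    return actions


def run_macro(env, state, expert, duration):
    """Follow one expert policy for up to `duration` steps, accumulating discounted reward."""
    total, disc, steps, done = 0.0, 1.0, 0, False
    for _ in range(duration):
        a = expert(state)
        state, r, done = env.step(state, a)
        total += disc * r
        disc *= GAMMA
        steps += 1
        if done:
            break
    return state, total, steps, done


def train(env, experts, episodes=500):
    """Tabular SMDP-style Q-learning over the enhanced action space."""
    actions = build_enhanced_actions(len(experts))
    Q = defaultdict(float)  # Q[(state, action_index)]; states must be hashable

    def greedy(s):
        return max(range(len(actions)), key=lambda i: Q[(s, i)])

    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            i = random.randrange(len(actions)) if random.random() < EPS else greedy(s)
            kind, idx, _dur = actions[i]
            if kind == "prim":
                s2, r, done = env.step(s, idx)
                steps = 1
            else:
                s2, r, steps, done = run_macro(env, s, experts[idx], _dur)
                r += ETA * steps  # intrinsic reward proportional to macro execution time
            target = r if done else r + (GAMMA ** steps) * Q[(s2, greedy(s2))]
            Q[(s, i)] += ALPHA * (target - Q[(s, i)])
            s = s2
    return Q
```

Per the abstract, the actual method additionally uses an intra-option-style learning rule, so a single macro execution would also update the values of related macro actions (e.g., shorter executions of the same expert) at intermediate states; that data-efficiency mechanism is omitted from this sketch.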
Journal description:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.