{"title":"Effective Search for Control Hierarchies Within the Policy Decomposition Framework","authors":"Ashwin Khadke;Hartmut Geyer","doi":"10.1109/LRA.2024.3483635","DOIUrl":null,"url":null,"abstract":"Policy decomposition is a novel framework for approximating optimal control policies of complex dynamical systems with a hierarchy of policies derived from smaller but tractable subsystems. It stands out amongst the class of hierarchical control methods by estimating \n<italic>a priori</i>\n how well the closed-loop behavior of different control hierarchies matches the optimal policy. However, the number of possible hierarchies grows prohibitively with the number of inputs and the dimension of the state-space of the system making it unrealistic to estimate the closed-loop performance for all hierarchies. Here, we present the development of two search methods based on Genetic Algorithm and Monte-Carlo Tree Search to tackle this combinatorial challenge, and demonstrate that it is indeed surmountable. We showcase the efficacy of our search methods and the generality of the framework by applying it towards finding hierarchies for control of three distinct robotic systems: a simplified biped, a planar manipulator, and a quadcopter. 
The discovered hierarchies, in comparison to heuristically designed ones, provide improved closed-loop performance or can be computed in minimal time with marginally worse control performance, and also exceed the control performance of policies obtained with popular deep reinforcement learning methods.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"9 12","pages":"11114-11121"},"PeriodicalIF":4.6000,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Robotics and Automation Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10721360/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ROBOTICS","Score":null,"Total":0}
Citation count: 0
Abstract
Policy decomposition is a novel framework for approximating optimal control policies of complex dynamical systems with a hierarchy of policies derived from smaller but tractable subsystems. It stands out amongst the class of hierarchical control methods by estimating a priori how well the closed-loop behavior of different control hierarchies matches the optimal policy. However, the number of possible hierarchies grows prohibitively with the number of inputs and the dimension of the state-space of the system, making it unrealistic to estimate the closed-loop performance for all hierarchies. Here, we present the development of two search methods based on Genetic Algorithm and Monte-Carlo Tree Search to tackle this combinatorial challenge, and demonstrate that it is indeed surmountable. We showcase the efficacy of our search methods and the generality of the framework by applying it towards finding hierarchies for control of three distinct robotic systems: a simplified biped, a planar manipulator, and a quadcopter. The discovered hierarchies, in comparison to heuristically designed ones, provide improved closed-loop performance or can be computed in minimal time with marginally worse control performance, and also exceed the control performance of policies obtained with popular deep reinforcement learning methods.
About the journal:
The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.