A Reinforcement Learning Method for Quadrotor Attitude Control Based on Expert Information
Authors: Yalu Zhu, Shi Lian, WenTao Zhong, Wei Meng
Published in: 2023 8th International Conference on Automation, Control and Robotics Engineering (CACRE), 2023-07-01
DOI: 10.1109/CACRE58689.2023.10208497 (https://doi.org/10.1109/CACRE58689.2023.10208497)
Citations: 0
Abstract
This paper proposes a model-free reinforcement learning (RL) method for training a nonlinear attitude controller for a quadrotor. To address the problem that the attitude controller is uncontrolled when trained directly by RL, the proposed method uses an expert to provide prior information, i.e., a judgement of the action and a suggested action, to guide the update process. To address the problem that the limitations of the expert can trap the policy in local optima, the proposed method maximizes the entropy of the policy to increase the exploratory behavior of the nonlinear attitude controller approximator. Based on this method, we employ the proximal policy optimization (PPO) algorithm as the RL model and a PID algorithm as the expert model to approximate an accurate attitude controller for a quadrotor. Finally, simulation experiments verify that the proposed method can train a genuinely nonlinear attitude controller that outperforms the expert.
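The abstract does not state how the expert's judgement and suggestion enter the update, so the following is only a minimal sketch of one way the three ingredients could be combined: the standard PPO clipped surrogate, an entropy bonus, and an expert term. The expert term (a squared distance between the policy mean and the PID output), the function names `guided_ppo_loss` and `gaussian_entropy`, the `PID` class, and all coefficients and gains are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


class PID:
    """Textbook discrete PID for one attitude axis (gains are placeholders)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def act(self, error):
        # Accumulate the integral and take a backward-difference derivative.
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def gaussian_entropy(log_std):
    # Differential entropy of a diagonal Gaussian, summed over action dimensions.
    return np.sum(log_std + 0.5 * np.log(2.0 * np.pi * np.e), axis=-1)


def guided_ppo_loss(ratio, advantage, policy_mean, log_std, expert_action,
                    clip_eps=0.2, entropy_coef=0.01, expert_coef=0.5):
    """Scalar loss to minimize: clipped surrogate + entropy bonus + expert term."""
    # Standard PPO clipped surrogate (a quantity to maximize, hence negated below).
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    surrogate = np.minimum(ratio * advantage, clipped * advantage)
    # Entropy bonus encourages exploration beyond what the expert suggests.
    entropy = gaussian_entropy(log_std)
    # ASSUMPTION: the expert suggestion is used as a squared-error pull on the
    # policy mean; the paper's actual guidance mechanism may differ.
    expert_penalty = np.sum((policy_mean - expert_action) ** 2, axis=-1)
    return np.mean(-surrogate - entropy_coef * entropy + expert_coef * expert_penalty)
```

In such a setup, `expert_action` would come from calling `PID.act` on the attitude error at each step, and `expert_coef` could be reduced over training so that the learned policy is eventually free to depart from the expert.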