Real-time Guidance for Powered Landing of Reusable Rockets via Deep Reinforcement Learning
Linfeng Su, Jinbo Wang, Zhenwei Ma, Hongbo Chen
2022 IEEE International Conference on Unmanned Systems (ICUS), published 2022-10-28
DOI: 10.1109/ICUS55513.2022.9986540 (https://doi.org/10.1109/ICUS55513.2022.9986540)
Citations: 1
Abstract
Powered landing of reusable rockets is an advanced technology for achieving pinpoint landing (the norm of the position error < 5 m and velocity error < 2 m/s) while satisfying a series of highly nonlinear constraints. A major challenge in solving the rocket powered landing problem is guaranteeing both fuel optimality and convergence. In this paper, a real-time feedback guidance algorithm based on deep reinforcement learning is developed. The proposed method maps the state directly to thrust control commands. The first contribution is a multi-stage reward function that eliminates the negative effects triggered by the designed guidance law, thereby significantly enhancing fuel-optimal performance. The second contribution is a model pre-training framework based on imitation learning, which improves model convergence by fitting optimal data. Numerical experiments show that the nearly fuel-optimal trajectories generated by the proposed algorithm achieve pinpoint landing from random initial states.
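The pinpoint-landing criterion and the idea of a reward that changes by flight stage can be illustrated with a minimal Python sketch. Only the 5 m and 2 m/s thresholds come from the abstract. Everything else is an assumption made for illustration: applying the Euclidean norm to the velocity error as well, the function names, the stage-switch distance, and all weights. The abstract does not give the paper's reward formula, so `staged_reward` is a hypothetical stand-in, not the authors' design.

```python
import math

POS_TOL_M = 5.0      # position-error threshold stated in the abstract
VEL_TOL_MPS = 2.0    # velocity-error threshold stated in the abstract


def norm(v):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))


def is_pinpoint_landing(pos_err, vel_err):
    """True if both terminal errors are strictly inside the tolerances.

    Assumption: the velocity error is also measured by its Euclidean norm.
    """
    return norm(pos_err) < POS_TOL_M and norm(vel_err) < VEL_TOL_MPS


def staged_reward(pos_err, vel_err, fuel_used, done,
                  switch_dist_m=100.0, w_fuel=0.01, bonus=10.0):
    """Hypothetical multi-stage reward (NOT the paper's formula).

    Stage 1 (farther than switch_dist_m from the target): penalize fuel only.
    Stage 2 (within switch_dist_m): also penalize position and velocity error.
    Terminal step: add a bonus if the pinpoint criterion is met.
    """
    reward = -w_fuel * fuel_used
    if norm(pos_err) < switch_dist_m:
        reward -= 0.001 * norm(pos_err) + 0.01 * norm(vel_err)
    if done and is_pinpoint_landing(pos_err, vel_err):
        reward += bonus
    return reward


if __name__ == "__main__":
    # Terminal errors of about 2.3 m and 1.2 m/s satisfy the criterion.
    print(is_pinpoint_landing((1.0, 2.0, 0.5), (0.5, 0.5, 1.0)))
    # A position error of exactly 5 m does not (strict inequality).
    print(is_pinpoint_landing((3.0, 0.0, 4.0), (0.0, 0.0, 0.0)))
```

In this sketch the far-field stage rewards fuel economy alone, while the near-field stage adds tracking penalties, which is one plausible way to keep terminal-accuracy terms from distorting the fuel-optimal portion of the trajectory.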