Real-time Guidance for Powered Landing of Reusable Rockets via Deep Reinforcement Learning
Authors: Linfeng Su, Jinbo Wang, Zhenwei Ma, Hongbo Chen
Venue: 2022 IEEE International Conference on Unmanned Systems (ICUS)
Publication date: 2022-10-28
DOI: https://doi.org/10.1109/ICUS55513.2022.9986540
Powered landing of reusable rockets is an advanced technology for achieving pinpoint landing (norm of the position error < 5 m and velocity error < 2 m/s) while satisfying a series of highly nonlinear constraints. A major challenge in solving the rocket powered landing problem is guaranteeing both fuel optimality and convergence. In this paper, a real-time feedback guidance algorithm based on deep reinforcement learning is developed. The proposed method maps the state directly to thrust control commands. The first contribution is a multi-stage reward function that eliminates the negative effects introduced by the designed guidance law, thereby significantly enhancing fuel-optimal performance. The second contribution is a model pre-training framework based on imitation learning, which improves model convergence by fitting optimal data. Numerical experiments show that the nearly fuel-optimal trajectories generated by the proposed algorithm successfully achieve pinpoint landing from random initial states.
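The pinpoint-landing criterion stated in the abstract can be written as a simple terminal check. The following is only a minimal sketch, not the authors' implementation: it assumes the errors are 3-D vectors measured relative to the landing target, that "norm" means the Euclidean norm, and that the norm applies to the velocity error as well. The function and constant names are hypothetical.

```python
import math

# Bounds taken from the abstract; the names are illustrative assumptions.
POS_TOL_M = 5.0     # position error norm must be below 5 m
VEL_TOL_MPS = 2.0   # velocity error norm must be below 2 m/s


def is_pinpoint_landing(pos_err, vel_err):
    """Return True if both error norms are strictly under the stated bounds.

    pos_err, vel_err: iterables of error components relative to the target
    (assumed Euclidean norm; the paper may define the error differently).
    """
    pos_norm = math.sqrt(sum(c * c for c in pos_err))
    vel_norm = math.sqrt(sum(c * c for c in vel_err))
    return pos_norm < POS_TOL_M and vel_norm < VEL_TOL_MPS
```

For example, a terminal state with position error (3, 3, 0) m and velocity error (1, 1, 1) m/s passes the check, while a position error of (3, 4, 0) m does not, because its norm is exactly 5 m and the bound is strict.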