Autonomous Guidance Between Quasiperiodic Orbits in Cislunar Space via Deep Reinforcement Learning

Journal of Spacecraft and Rockets (IF 1.3; Zone 4, Engineering & Technology; JCR Q2, Engineering, Aerospace) Pub Date: 2023-08-25 DOI: 10.2514/1.a35747
Lorenzo Federici, A. Scorsoglio, Alessandro Zavoli, R. Furfaro
Citations: 0

Abstract

This paper investigates the use of reinforcement learning for the fuel-optimal guidance of a spacecraft during a time-free low-thrust transfer between two libration point orbits in the cislunar environment. To this aim, a deep neural network is trained via proximal policy optimization to map any spacecraft state to the optimal control action. A general-purpose reward is used to guide the network toward a fuel-optimal control law, regardless of the specific pair of libration orbits considered and without the use of any ad hoc reward shaping technique. Finally, the learned control policies are compared with the optimal solutions provided by a direct method in two different mission scenarios, and Monte Carlo simulations are used to assess the policies’ robustness to navigation uncertainties.
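The Monte Carlo robustness assessment mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' setup: a 1-D double integrator stands in for the cislunar dynamics, the saturated PD law in `policy` stands in for the trained network, and the names and values of `nav_sigma`, `tol`, and the gains are hypothetical.

```python
import random


def policy(obs_pos, obs_vel, u_max=1.0):
    """Stand-in for a trained network (assumption): a saturated PD law."""
    u = -1.0 * obs_pos - 1.8 * obs_vel
    return max(-u_max, min(u_max, u))


def run_episode(rng, nav_sigma, steps=400, dt=0.05):
    """Fly one transfer on a toy 1-D double integrator (not cislunar dynamics)."""
    pos, vel, fuel = 1.0, 0.0, 0.0
    for _ in range(steps):
        # The policy only sees the true state corrupted by Gaussian navigation error.
        obs_pos = pos + rng.gauss(0.0, nav_sigma)
        obs_vel = vel + rng.gauss(0.0, nav_sigma)
        u = policy(obs_pos, obs_vel)
        vel += u * dt
        pos += vel * dt
        fuel += abs(u) * dt  # propellant proxy: time-integrated thrust magnitude
    return abs(pos), fuel  # terminal position error and fuel proxy


def monte_carlo(n_runs=200, nav_sigma=0.01, tol=0.05, seed=0):
    """Return (success rate, mean fuel proxy) over n_runs noisy episodes."""
    rng = random.Random(seed)
    results = [run_episode(rng, nav_sigma) for _ in range(n_runs)]
    success = sum(err <= tol for err, _ in results) / n_runs
    mean_fuel = sum(fuel for _, fuel in results) / n_runs
    return success, mean_fuel


if __name__ == "__main__":
    for sigma in (0.0, 0.01, 0.05):
        rate, fuel = monte_carlo(nav_sigma=sigma)
        print(f"nav_sigma={sigma}: success={rate:.2f}, mean fuel={fuel:.3f}")
```

Sweeping `nav_sigma` in this way shows how the terminal accuracy and propellant proxy of a fixed policy degrade as the navigation error grows, which is the kind of question such a campaign is meant to answer.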
Source journal
Journal of Spacecraft and Rockets (Engineering & Technology: Aerospace Engineering)
CiteScore: 3.60
Self-citation rate: 18.80%
Publication volume: 185
Review time: 4.5 months
About the journal: This journal, which started it all back in 1963, is devoted to the advancement of the science and technology of astronautics and aeronautics through the dissemination of original archival research papers disclosing new theoretical developments and/or experimental results. The topics include aeroacoustics, aerodynamics, combustion, fundamentals of propulsion, fluid mechanics and reacting flows, fundamental aspects of the aerospace environment, hydrodynamics, lasers and associated phenomena, plasmas, research instrumentation and facilities, structural mechanics and materials, optimization, and thermomechanics and thermochemistry. Review papers that intensively survey recent research developments on any of the topics listed above are also sought.