Robust solar sail trajectories using proximal policy optimization

IF 3.1 · Zone 2, Physics and Astrophysics · Q1 ENGINEERING, AEROSPACE · Acta Astronautica · Pub Date: 2024-11-09 · DOI: 10.1016/j.actaastro.2024.10.065
Christian Bianchi, Lorenzo Niccolai, Giovanni Mengali
Acta Astronautica, Volume 226, Pages 702-715. Source: https://www.sciencedirect.com/science/article/pii/S0094576524006398
Citations: 0

Abstract

Reinforcement learning is used to design minimum-time trajectories of solar sails subject to the typical sources of uncertainty associated with such a propulsion system, i.e., inaccurate knowledge of the sail’s optical properties and the presence of wrinkles on the sail membrane. A proximal policy optimization (PPO) algorithm is used to train the agent and derive the control policy that associates the optimal sail attitude with each dynamic state. First, the agent is trained assuming deterministic unperturbed dynamics, and the results are compared with optimal solutions found by an indirect optimization method, thus demonstrating the effectiveness of this approach. Next, two stochastic scenarios are analysed. In the first, the optical coefficients of the sail are assumed to be random variables with Gaussian distribution, which leads to random variations in the sail characteristic acceleration. In the second scenario, wrinkles on the sail membrane are taken into account, resulting in a misalignment of the thrust vector with respect to a perfectly smooth surface. Both phenomena are modelled based on experimental measurements available in the literature in order to perform realistic analyses. In the stochastic scenarios, Monte Carlo simulations are performed using the trained policies, demonstrating that the reinforcement learning approach is capable of finding near time-optimal solutions, while also being robust to the sources of uncertainty considered.
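The first stochastic scenario, in which Gaussian optical properties produce a random characteristic acceleration, can be illustrated with a small Monte Carlo sketch. This is not the authors' model: it assumes the common lumped-efficiency approximation a_c = 2ηP/σ, where P is the solar radiation pressure at 1 au, σ is the sail loading, and η is a single efficiency factor standing in for the individual optical coefficients. The function names and all numerical values (η mean 0.85, standard deviation 0.02, sail loading 0.02 kg/m²) are hypothetical choices for illustration only.

```python
import numpy as np

# Solar radiation pressure at 1 au, in N/m^2 (standard reference value).
SOLAR_PRESSURE_1AU = 4.56e-6


def characteristic_acceleration(efficiency, sail_loading):
    """Characteristic acceleration a_c = 2 * eta * P / sigma, in m/s^2.

    efficiency   -- lumped sail efficiency eta (dimensionless); an
                    illustrative stand-in for the optical coefficients,
                    NOT the optical force model used in the paper
    sail_loading -- total mass per unit sail area sigma, in kg/m^2
    """
    return 2.0 * efficiency * SOLAR_PRESSURE_1AU / sail_loading


def sample_characteristic_acceleration(n_samples, eta_mean, eta_std,
                                       sail_loading, seed=0):
    """Draw Gaussian efficiencies and map each one to an a_c sample."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(eta_mean, eta_std, size=n_samples)
    # Keep the efficiency inside its physically meaningful range [0, 1].
    eta = np.clip(eta, 0.0, 1.0)
    return characteristic_acceleration(eta, sail_loading)


if __name__ == "__main__":
    # Hypothetical values chosen only to make the example concrete.
    samples = sample_characteristic_acceleration(
        n_samples=10_000, eta_mean=0.85, eta_std=0.02, sail_loading=0.02)
    print(f"mean a_c = {samples.mean() * 1e3:.4f} mm/s^2")
    print(f"std  a_c = {samples.std() * 1e3:.4f} mm/s^2")
```

In a robustness study of the kind the abstract describes, each sampled value would parameterize one simulated transfer flown with the trained policy, and the spread of the resulting flight times and terminal errors would be examined across the whole batch.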
Source journal: Acta Astronautica (Engineering Technology, Engineering: Aerospace)
CiteScore: 7.20 · Self-citation rate: 22.90% · Articles published: 599 · Review time: 53 days
Journal description: Acta Astronautica is sponsored by the International Academy of Astronautics. Content is based on original contributions in all fields of basic, engineering, life and social space sciences and of space technology related to the peaceful scientific exploration of space; its exploitation for human welfare and progress; and the conception, design, development and operation of space-borne and Earth-based systems. In addition to regular issues, the journal publishes selected proceedings of the annual International Astronautical Congress (IAC), transactions of the IAA and special issues on topics of current interest, such as microgravity, space station technology, geostationary orbits, and space economics. Other subject areas include satellite technology, space transportation and communications, space energy, power and propulsion, astrodynamics, extraterrestrial intelligence and Earth observations.