Reinforcement learning-based satellite formation attitude control under multi-constraint

Advances in Space Research (IF 2.8; Region 3, Earth Sciences; JCR Q2, Astronomy & Astrophysics) Pub Date: 2024-08-03 DOI: 10.1016/j.asr.2024.07.084
Yingkai Cai, Kay-Soon Low, Zhaokui Wang
Publisher page: https://www.sciencedirect.com/science/article/pii/S0273117724008032
Citations: 0

Abstract

As the complexity of space missions increases, the constraints on satellite attitude control become more stringent, particularly for satellites working in orbit formation. This paper introduces a novel method, based on the categorization and modeling of different constraints, for attitude control of satellite formations under multiple constraints. The method employs the Phased Priority Reinforcement Learning (PPRL) approach, which utilizes Deep Deterministic Policy Gradient (DDPG) technology. Considering the complexity of constraints and the challenge posed by the high control dimensionality due to multi-satellite coordination, the method addresses these challenges through a two-step training strategy. The first step addresses the multi-constraint issue for individual satellites and increases the priority of single-satellite training experience data in the experience replay buffer of the second step to enhance data utilization efficiency. To address the issue of reward sparsity in complex high-dimensional constraint models, a detailed reward mechanism is proposed, incorporating both local and global constraints into the reward function, thereby achieving both efficient and effective attitude control. This approach not only meets dynamic, state, and performance constraints but also demonstrates adaptability and robustness through numerical simulations. Compared to traditional methods, this approach achieves significant improvements in control performance and constraint satisfaction, offering a novel solution pathway for high-dimensional control problems in multi-constraint satellite formations.
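
The abstract names two mechanisms without giving details: stage-one (single-satellite) experience is given higher priority in the stage-two replay buffer, and the reward combines local and global constraint terms. The Python sketch below illustrates one way these ideas could look. It is not the authors' implementation: the class `PhasedReplayBuffer`, the function `shaped_reward`, the priority value 3.0, the penalty sizes, the weights, and all argument names are assumptions made for illustration, and the DDPG actor-critic update itself is omitted.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PhasedReplayBuffer:
    """Hypothetical buffer that samples transitions in proportion to a priority."""
    capacity: int
    items: list = field(default_factory=list)
    priorities: list = field(default_factory=list)

    def add(self, transition, priority=1.0):
        # Drop the oldest transition once the buffer is full.
        if len(self.items) >= self.capacity:
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size, rng=random):
        # Sampling with replacement, weighted by the stored priorities.
        return rng.choices(self.items, weights=self.priorities, k=batch_size)


def shaped_reward(pointing_error, body_rate, keepout_margin, formation_error,
                  rate_limit=0.1, w_local=1.0, w_global=0.5):
    """Hypothetical dense reward mixing per-satellite and formation-level terms.

    pointing_error  : attitude error of one satellite (assumed local term)
    body_rate       : angular rate magnitude, penalised above rate_limit
    keepout_margin  : angular margin to a forbidden pointing zone (< 0 = violated)
    formation_error : relative-attitude error across the formation (assumed global term)
    """
    local = -pointing_error
    if body_rate > rate_limit:
        local -= 1.0  # assumed fixed penalty for breaking the rate limit
    if keepout_margin < 0.0:
        local -= 1.0  # assumed fixed penalty for entering a forbidden zone
    return w_local * local + w_global * (-formation_error)


if __name__ == "__main__":
    buffer = PhasedReplayBuffer(capacity=1000)
    # Stage 1: single-satellite transitions, seeded with a raised priority.
    for i in range(10):
        buffer.add(("single", i), priority=3.0)
    # Stage 2: formation transitions at the default priority.
    for i in range(10):
        buffer.add(("formation", i))
    print(buffer.sample(4, rng=random.Random(0)))
```

With equal numbers of stage-one and stage-two transitions and a 3:1 priority ratio, about three quarters of each sampled batch would come from single-satellite experience. A dense reward of this kind returns a graded signal on every step rather than only on success, which is one common way to reduce reward sparsity.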
Source journal
Advances in Space Research
Category: Geoscience and Astronomy, Earth Sciences (General)
CiteScore: 5.20
Self-citation rate: 11.50%
Publication volume: 800
Review time: 5.8 months
About the journal: The COSPAR publication Advances in Space Research (ASR) is an open journal covering all areas of space research, including space studies of the Earth's surface, meteorology, climate, the Earth-Moon system, planets and small bodies of the solar system, upper atmospheres, ionospheres and magnetospheres of the Earth and planets including reference atmospheres, space plasmas in the solar system, astrophysics from space, materials sciences in space, fundamental physics in space, space debris, space weather, Earth observations of space phenomena, etc. Note: manuscripts on space-related life sciences are no longer accepted for submission to Advances in Space Research and should instead be submitted to the COSPAR journal Life Sciences in Space Research (LSSR). All submissions are reviewed by two scientists in the field. COSPAR is an interdisciplinary scientific organization concerned with the progress of space research on an international scale. Operating under the rules of ICSU, COSPAR ignores political considerations and considers all questions solely from the scientific viewpoint.