Reinforcement learning-based satellite formation attitude control under multi-constraint

Impact Factor: 2.8 | CAS Zone 3 (Earth Science) | JCR Q2 (Astronomy & Astrophysics) | Advances in Space Research | Pub Date: 2024-12-01 | Epub Date: 2024-08-03 | DOI: 10.1016/j.asr.2024.07.084

Yingkai Cai , Kay-Soon Low , Zhaokui Wang

Advances in Space Research, Volume 74, Issue 11, Pages 5819-5836. https://www.sciencedirect.com/science/article/pii/S0273117724008032

Citation count: 0

Abstract

As the complexity of space missions increases, the constraints on satellite attitude control become more stringent, particularly for satellites working in orbit formation. This paper introduces a novel method, based on the categorization and modeling of different constraints, for attitude control of satellite formations under multiple constraints. The method employs the Phased Priority Reinforcement Learning (PPRL) approach, which utilizes Deep Deterministic Policy Gradient (DDPG) technology. Considering the complexity of constraints and the challenge posed by the high control dimensionality due to multi-satellite coordination, the method addresses these challenges through a two-step training strategy. The first step addresses the multi-constraint issue for individual satellites and increases the priority of single-satellite training experience data in the experience replay buffer of the second step to enhance data utilization efficiency. To address the issue of reward sparsity in complex high-dimensional constraint models, a detailed reward mechanism is proposed, incorporating both local and global constraints into the reward function, thereby achieving both efficient and effective attitude control. This approach not only meets dynamic, state, and performance constraints but also demonstrates adaptability and robustness through numerical simulations. Compared to traditional methods, this approach achieves significant improvements in control performance and constraint satisfaction, offering a novel solution pathway for high-dimensional control problems in multi-constraint satellite formations.
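
The abstract says that single-satellite experience from the first training step is placed in the second step's replay buffer with raised priority, but it gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' PPRL code. The class `PhasedPriorityBuffer`, the helper `seed_from_phase_one`, the `boost` factor of 5.0, and the tuple-shaped transitions are all hypothetical, and the sketch ignores how single-satellite states would be mapped into the formation's state space.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PhasedPriorityBuffer:
    """Replay buffer that samples items in proportion to a per-item priority.

    Hypothetical illustration; not taken from the paper.
    """
    capacity: int
    items: list = field(default_factory=list)
    priorities: list = field(default_factory=list)

    def add(self, transition, priority=1.0):
        if priority <= 0:
            raise ValueError("priority must be positive")
        if len(self.items) >= self.capacity:
            # Buffer full: evict the oldest transition (FIFO).
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(priority)

    def selection_probability(self, index):
        # Probability that one draw returns the item at `index`.
        return self.priorities[index] / sum(self.priorities)

    def sample(self, batch_size, rng=random):
        # Weighted draw with replacement.
        return rng.choices(self.items, weights=self.priorities, k=batch_size)


def seed_from_phase_one(buffer, single_sat_transitions, boost=5.0):
    """Copy phase-one experience into the phase-two buffer with boosted priority.

    `boost` is an assumed value chosen for illustration only.
    """
    for transition in single_sat_transitions:
        buffer.add(transition, priority=boost)


if __name__ == "__main__":
    buf = PhasedPriorityBuffer(capacity=1000)
    # Placeholder transitions; a real agent would store (s, a, r, s', done).
    seed_from_phase_one(buf, [("single", i) for i in range(100)])
    for i in range(100):
        buf.add(("formation", i))  # default priority 1.0
    batch = buf.sample(64, rng=random.Random(0))
    share = sum(1 for kind, _ in batch if kind == "single") / len(batch)
    print(f"share of single-satellite samples in batch: {share:.2f}")
```

With equal numbers of both kinds of transition, the boosted items are drawn about five times as often as the unboosted ones, which is one simple way to reuse phase-one data more heavily early in phase two. A DDPG learner would typically also update priorities over time; that is omitted here.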

Source journal

Advances in Space Research (Earth Science & Astronomy: Geosciences, Multidisciplinary)

CiteScore: 5.20

Self-citation rate: 11.50%

Articles published: 800

Review time: 5.8 months

About the journal: The COSPAR publication Advances in Space Research (ASR) is an open journal covering all areas of space research, including: space studies of the Earth's surface, meteorology, climate, the Earth-Moon system, planets and small bodies of the solar system, upper atmospheres, ionospheres and magnetospheres of the Earth and planets including reference atmospheres, space plasmas in the solar system, astrophysics from space, materials sciences in space, fundamental physics in space, space debris, space weather, Earth observations of space phenomena, etc. NB: Manuscripts on life sciences as related to space are no longer accepted for submission to Advances in Space Research. Such manuscripts should now be submitted to the new COSPAR journal Life Sciences in Space Research (LSSR). All submissions are reviewed by two scientists in the field. COSPAR is an interdisciplinary scientific organization concerned with the progress of space research on an international scale. Operating under the rules of ICSU, COSPAR ignores political considerations and considers all questions solely from the scientific viewpoint.