Single-Agent Reinforcement Learning for Scalable Earth-Observing Satellite Constellation Operations

IF 1.3 | Zone 4, Engineering and Technology | Q2, ENGINEERING, AEROSPACE | Journal of Spacecraft and Rockets | Pub Date: 2023-11-02 | DOI: 10.2514/1.a35736
Adam Herrmann, Mark A. Stephenson, Hanspeter Schaub
Citations: 0

Abstract

This work explores single-agent reinforcement learning for the multi-satellite agile Earth-observing scheduling problem. The objective of the problem is to maximize the weighted sum of imaging targets collected and downlinked while avoiding resource constraint violations on board the spacecraft. To avoid the computational complexity associated with multi-agent deep reinforcement learning while creating a robust and scalable solution, a policy is trained in a single satellite environment. This policy is then deployed on board each satellite in a Walker-delta constellation. A global set of targets is distributed to each satellite based on target access. The satellites communicate with one another to determine whether an imaging target is imaged or downlinked. Free communication, line-of-sight communication, and no communication are explored to determine how the communication assumptions and constellation design impact performance. Free communication is shown to produce the best performance, and no communication is shown to produce the worst performance. Line-of-sight communication performance is shown to depend heavily on the design of the constellation and how frequently the satellites can communicate with one another. To explore how higher-level coordination can impact performance, a centralized mixed-integer programming optimization approach to global target distribution is explored and compared to a decentralized approach. A genetic algorithm is also implemented for comparison purposes, and the proposed method is shown to achieve higher reward on average at a fraction of the computational cost.
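The effect of the three communication assumptions described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: a greedy "highest priority first" rule stands in for the trained policy, downlink and on-board resource constraints are ignored, and the names `can_share`, `run_constellation`, `ACCESS`, `PRIORITY`, and `LINKS` are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple


def can_share(a: str, b: str, comm: str, links: Set[FrozenSet[str]]) -> bool:
    """Return True if satellite a can tell satellite b what it has imaged."""
    if comm == "free":
        return True
    if comm == "los":
        # Line of sight: only pairs listed in `links` can talk (assumed static here).
        return frozenset((a, b)) in links
    return False  # "none": no information is exchanged


def run_constellation(
    access: Dict[str, Set[str]],
    priority: Dict[str, int],
    comm: str,
    links: Set[FrozenSet[str]] = frozenset(),
    steps: int = 1,
) -> Tuple[int, int]:
    """Run the same decision rule on every satellite.

    Returns (reward, duplicates): the summed priority of distinct targets
    imaged, and the number of redundant images taken.
    """
    # Targets each satellite believes have already been imaged.
    known_done: Dict[str, Set[str]] = {sat: set() for sat in access}
    imaged_count: Dict[str, int] = {}

    for _ in range(steps):
        for sat in sorted(access):
            candidates = access[sat] - known_done[sat]
            if not candidates:
                continue
            # Greedy stand-in for the trained policy (an assumption).
            target = max(sorted(candidates), key=lambda t: priority[t])
            imaged_count[target] = imaged_count.get(target, 0) + 1
            known_done[sat].add(target)
            # Share the update with every satellite that can be reached.
            for other in access:
                if other != sat and can_share(sat, other, comm, links):
                    known_done[other].add(target)

    reward = sum(priority[t] for t in imaged_count)
    duplicates = sum(n - 1 for n in imaged_count.values())
    return reward, duplicates


# Hypothetical scenario: each satellite receives the targets it can access.
ACCESS = {"A": {"t1", "t2"}, "B": {"t1", "t3"}, "C": {"t1", "t4"}}
PRIORITY = {"t1": 10, "t2": 5, "t3": 4, "t4": 3}
LINKS = {frozenset(("A", "B"))}  # only A and B are in line of sight

for mode in ("free", "los", "none"):
    print(mode, run_constellation(ACCESS, PRIORITY, mode, LINKS, steps=1))
```

In this small scenario, free communication removes all duplicate imaging, no communication lets every satellite image the same high-priority target, and line-of-sight communication falls in between. That ordering matches the qualitative result reported in the abstract, but the numbers themselves are only an artifact of the toy setup.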
Source journal
Journal of Spacecraft and Rockets (Engineering and Technology: Aerospace Engineering)
CiteScore: 3.60
Self-citation rate: 18.80%
Articles published: 185
Review time: 4.5 months
Journal description: This Journal, which started it all back in 1963, is devoted to the advancement of the science and technology of astronautics and aeronautics through the dissemination of original archival research papers disclosing new theoretical developments and/or experimental results. The topics include aeroacoustics, aerodynamics, combustion, fundamentals of propulsion, fluid mechanics and reacting flows, fundamental aspects of the aerospace environment, hydrodynamics, lasers and associated phenomena, plasmas, research instrumentation and facilities, structural mechanics and materials, optimization, and thermomechanics and thermochemistry. Papers are also sought that review in an intensive manner the results of recent research developments on any of the topics listed above.