Predictive air combat decision model with segmented reward allocation

Complex & Intelligent Systems · IF 5 · CAS Zone 2 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence) · Pub Date: 2024-07-22 · DOI: 10.1007/s40747-024-01556-3
Yundi Li, Yinlong Yuan, Yun Cheng, Liang Hua
Citations: 0

Abstract

In air combat missions, unmanned combat aerial vehicles (UCAVs) must take strategic actions to establish a combat advantage so that they can effectively track and attack enemy UCAVs. Many reinforcement learning algorithms have been applied to air combat missions for unmanned fighter aircraft. However, most of them select policies based only on the current state of both sides, so they cannot track and attack effectively when the enemy performs large-angle maneuvers. These algorithms also cannot adapt to different situations, which leaves the unmanned fighter aircraft at a disadvantage in some cases. To solve these problems, this paper proposes a predictive air combat decision model with segmented reward allocation for air combat tracking and attacking. Based on the air combat environment, we propose the prediction soft actor-critic (Pre-SAC) algorithm, which combines the predicted enemy states with the states of the UCAV for model training. This enables the UCAV to anticipate the enemy UCAV's next move and establish a greater air combat advantage. Furthermore, by combining a segmented reward allocation model with the Pre-SAC algorithm, we propose the segmented reward allocation soft actor-critic (Sra-SAC) algorithm, which addresses the inability of UCAVs to adapt to different situations. The results show that the prediction-based Sra-SAC algorithm outperforms the traditional soft actor-critic (SAC) algorithm in terms of overall reward, travel distance, and relative advantage.
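
The abstract names two ideas without giving their details: feeding a prediction of the opponent's state into the policy input, and switching the reward according to the situation. The sketch below only illustrates the general shape of those two ideas. Everything in it is an assumption made for illustration and is not taken from the paper: the constant-velocity predictor, the function names `predict_next_position`, `build_observation` and `segmented_reward`, the three situation labels, and the weight values.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


def predict_next_position(position: Sequence[float],
                          velocity: Sequence[float],
                          dt: float) -> List[float]:
    """Extrapolate a position one step ahead at constant velocity.

    Assumption: the paper's actual predictor is not described in the
    abstract; constant-velocity extrapolation is a stand-in.
    """
    return [p + v * dt for p, v in zip(position, velocity)]


def build_observation(own_state: Sequence[float],
                      other_position: Sequence[float],
                      other_velocity: Sequence[float],
                      dt: float) -> List[float]:
    """Concatenate own state, the other agent's current position, and its
    predicted next position into one observation vector for the policy."""
    predicted = predict_next_position(other_position, other_velocity, dt)
    return list(own_state) + list(other_position) + predicted


@dataclass(frozen=True)
class RewardWeights:
    angle: float
    distance: float


# Hypothetical segments and weights: one weight set per situation label.
SEGMENT_WEIGHTS: Dict[str, RewardWeights] = {
    "advantage": RewardWeights(angle=0.7, distance=0.3),
    "neutral": RewardWeights(angle=0.5, distance=0.5),
    "disadvantage": RewardWeights(angle=0.3, distance=0.7),
}


def segmented_reward(situation: str,
                     angle_term: float,
                     distance_term: float) -> float:
    """Weight the reward terms differently depending on the situation."""
    w = SEGMENT_WEIGHTS[situation]
    return w.angle * angle_term + w.distance * distance_term
```

In this sketch the observation returned by `build_observation` would be what an off-the-shelf SAC agent receives instead of the raw current state, and `segmented_reward` would replace a single fixed reward function. How the paper defines its segments and predictor has to be checked against the full text.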

Source journal
Complex & Intelligent Systems (Computer Science, Artificial Intelligence)
CiteScore: 9.60
Self-citation rate: 10.30%
Articles published: 297
Journal description: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.