Reinforcement Learning-Based 3-D Sliding Mode Interception Guidance via Proximal Policy Optimization

Jianguo Guo;Mengxuan Li;Zongyi Guo;Zhiyong She
IEEE Journal on Miniaturization for Air and Space Systems, vol. 4, no. 4, pp. 423-430. Journal article, published 2023-10-17.
DOI: 10.1109/JMASS.2023.3325054
https://ieeexplore.ieee.org/document/10287104/
Citations: 0

Abstract

This article proposes a novel 3-D sliding mode interception guidance law for maneuvering targets, using reinforcement learning (RL) to improve guidance accuracy and reduce chattering. The problem of intercepting a maneuvering target is formulated as a Markov decision process whose reward function accounts for the miss distance and the chattering of the line-of-sight angular rate. From this formulation, a reward function design framework suited to general RL-based guidance problems is proposed. The proximal policy optimization algorithm, which shows satisfactory training performance, is then used to learn an action policy that maps the observed engagement states to the sliding mode interception guidance law. Finally, numerical simulations and comparisons demonstrate the effectiveness of the proposed guidance law.
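The abstract names two ingredients, a reward that penalizes miss distance and line-of-sight (LOS) rate chattering, and proximal policy optimization (PPO). The sketch below illustrates both in plain Python. The function `ppo_clip_objective` follows the standard clipped surrogate objective of PPO. The function `step_reward`, its weights, and its argument names are hypothetical and are not taken from the paper, whose actual reward function is not given in the abstract.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    new_logp / old_logp are log-probabilities of the taken actions under
    the current and the data-collecting policy; clip_eps=0.2 is a common
    default, not a value reported by the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum removes any incentive to push the ratio
        # outside the interval [1 - clip_eps, 1 + clip_eps].
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def step_reward(los_rate, prev_los_rate, miss_distance=None,
                w_rate=1.0, w_chatter=0.5, w_miss=10.0):
    """Hypothetical shaped reward (assumption, not the paper's design).

    Penalizes the LOS angular rate magnitude, its step-to-step change
    (a simple proxy for chattering), and, on the terminal step only,
    the miss distance.
    """
    reward = -w_rate * abs(los_rate)
    reward -= w_chatter * abs(los_rate - prev_los_rate)
    if miss_distance is not None:  # only known when the episode ends
        reward -= w_miss * miss_distance
    return reward
```

In a training loop, `step_reward` would be evaluated at every simulation step of the engagement, and `ppo_clip_objective` would be maximized over mini-batches of collected transitions; both the loop and the engagement dynamics are left out here because the abstract does not specify them.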