三维空间双人零和博弈的增强型 LSTM-DQN 算法

Bo Lu, L. Ru, Maolong Lv, Shiguang Hu, Hongguo Zhang, Zilong Zhao
{"title":"三维空间双人零和博弈的增强型 LSTM-DQN 算法","authors":"Bo Lu, L. Ru, Maolong Lv, Shiguang Hu, Hongguo Zhang, Zilong Zhao","doi":"10.1049/cth2.12677","DOIUrl":null,"url":null,"abstract":"To tackle the challenges presented by the two‐player zero sum game (TZSG) in three‐dimensional space, this study introduces an enhanced deep Q‐learning (DQN) algorithm that utilizes long short term memory (LSTM) network. The primary objective of this algorithm is to enhance the temporal correlation of the TZSG in three‐dimensional space. Additionally, it incorporates the hindsight experience replay (HER) mechanism to improve the learning efficiency of the network and mitigate the issue of the “sparse reward” that arises from prolonged training of intelligence in solving the TZSG in the three‐dimensional. Furthermore, this method enhances the convergence and stability of the overall solution.An intelligent training environment centred around an airborne agent and its mutual pursuit interaction scenario was designed to proposed approach's effectiveness. The algorithm training and comparison results show that the LSTM‐DQN‐HER algorithm outperforms similar algorithm in solving the TZSG in three‐dimensional space. In conclusion, this paper presents an improved DQN algorithm based on LSTM and incorporates the HER mechanism to address the challenges posed by the TZSG in three‐dimensional space. The proposed algorithm enhances the solution's temporal correlation, learning efficiency, convergence, and stability. The simulation results confirm its superior performance in solving the TZSG in three‐dimensional space.","PeriodicalId":502998,"journal":{"name":"IET Control Theory & Applications","volume":"91 16","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhanced LSTM‐DQN algorithm for a two‐player zero‐sum game in three‐dimensional space\",\"authors\":\"Bo Lu, L. Ru, Maolong Lv, Shiguang Hu, Hongguo Zhang, Zilong Zhao\",\"doi\":\"10.1049/cth2.12677\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"To tackle the challenges presented by the two‐player zero sum game (TZSG) in three‐dimensional space, this study introduces an enhanced deep Q‐learning (DQN) algorithm that utilizes long short term memory (LSTM) network. The primary objective of this algorithm is to enhance the temporal correlation of the TZSG in three‐dimensional space. Additionally, it incorporates the hindsight experience replay (HER) mechanism to improve the learning efficiency of the network and mitigate the issue of the “sparse reward” that arises from prolonged training of intelligence in solving the TZSG in the three‐dimensional. Furthermore, this method enhances the convergence and stability of the overall solution.An intelligent training environment centred around an airborne agent and its mutual pursuit interaction scenario was designed to proposed approach's effectiveness. The algorithm training and comparison results show that the LSTM‐DQN‐HER algorithm outperforms similar algorithm in solving the TZSG in three‐dimensional space. In conclusion, this paper presents an improved DQN algorithm based on LSTM and incorporates the HER mechanism to address the challenges posed by the TZSG in three‐dimensional space. The proposed algorithm enhances the solution's temporal correlation, learning efficiency, convergence, and stability. The simulation results confirm its superior performance in solving the TZSG in three‐dimensional space.\",\"PeriodicalId\":502998,\"journal\":{\"name\":\"IET Control Theory & Applications\",\"volume\":\"91 16\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-05-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IET Control Theory & Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1049/cth2.12677\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Control Theory & Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1049/cth2.12677","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

为应对三维空间中的双人零和博弈(TZSG)所带来的挑战,本研究引入了一种利用长短期记忆(LSTM)网络的增强型深度 Q-learning (DQN)算法。该算法的主要目标是增强三维空间中 TZSG 的时间相关性。此外,它还结合了事后经验重放(HER)机制,以提高网络的学习效率,并缓解在解决三维空间中的 TZSG 时,由于长时间的智能训练而产生的 "奖励稀疏 "问题。此外,该方法还增强了整体求解的收敛性和稳定性。为了验证所提方法的有效性,我们设计了一个以机载代理及其相互追逐交互场景为中心的智能训练环境。算法训练和对比结果表明,LSTM-DQN-HER 算法在求解三维空间中的 TZSG 时优于同类算法。总之,本文提出了一种基于 LSTM 并结合 HER 机制的改进 DQN 算法,以解决三维空间中的 TZSG 所带来的挑战。所提出的算法增强了解的时间相关性、学习效率、收敛性和稳定性。仿真结果证实了该算法在求解三维空间中的 TZSG 时的卓越性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Enhanced LSTM‐DQN algorithm for a two‐player zero‐sum game in three‐dimensional space
To tackle the challenges presented by the two‐player zero sum game (TZSG) in three‐dimensional space, this study introduces an enhanced deep Q‐learning (DQN) algorithm that utilizes long short term memory (LSTM) network. The primary objective of this algorithm is to enhance the temporal correlation of the TZSG in three‐dimensional space. Additionally, it incorporates the hindsight experience replay (HER) mechanism to improve the learning efficiency of the network and mitigate the issue of the “sparse reward” that arises from prolonged training of intelligence in solving the TZSG in the three‐dimensional. Furthermore, this method enhances the convergence and stability of the overall solution.An intelligent training environment centred around an airborne agent and its mutual pursuit interaction scenario was designed to proposed approach's effectiveness. The algorithm training and comparison results show that the LSTM‐DQN‐HER algorithm outperforms similar algorithm in solving the TZSG in three‐dimensional space. In conclusion, this paper presents an improved DQN algorithm based on LSTM and incorporates the HER mechanism to address the challenges posed by the TZSG in three‐dimensional space. The proposed algorithm enhances the solution's temporal correlation, learning efficiency, convergence, and stability. The simulation results confirm its superior performance in solving the TZSG in three‐dimensional space.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Innovative hull cleaning robot design and control by Laguerre base model predictive control for impedance and vibration management A hybrid energy storage array group control strategy for wind power smoothing Optimal data injection attack design for spacecraft systems via a model free Q‐learning approach Smart line planning method for power transmission based on D3QN‐PER algorithm Lightweight environment sensing algorithm for intelligent driving based on improved YOLOv7
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1