Many-Versus-Many AUV Attack-Defense Game in 3-D Scenarios Using Hierarchical Multiagent Reinforcement Learning

IF 8.9 | Zone 1 (Computer Science) | JCR Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS) | IEEE Internet of Things Journal | Pub Date: 2025-03-17 | DOI: 10.1109/JIOT.2025.3552116
Wenhao Gan; Lei Qiao
IEEE Internet of Things Journal, vol. 12, no. 13, pp. 23479-23494. Journal Article. https://ieeexplore.ieee.org/document/10930426/
Citations: 0

Abstract

This article proposes a deep reinforcement learning (DRL)-based method for many-versus-many attack-defense games involving autonomous underwater vehicles (AUVs) in 3-D space, focusing on training a defense team to counter attackers. The attackers benefit from speed and unpredictability, while defenders leverage numerical superiority. The scenario includes irregular terrain, and AUVs are limited by low-frequency communication. First, a constrained Apollonius model considering AUV 3-D motion characteristics is developed to evaluate the repulsive effect of defenders on attackers. Second, a hybrid 3-D AUV maneuvering framework integrating end-to-velocity and hierarchical approaches is proposed to reduce the complexity of decision-making strategy learning, enabling AUVs to counter multiattacker threats and learn repulsion strategies across subteams. Third, a scalable learning architecture is designed to adapt to different game scales, with an improved update method to enhance advantage and credit estimation efficiency while ensuring convergence. The combination of population expansion-curriculum training and asynchronous parallel training strengthens the generalization of strategies across various environments. Finally, through comparative analysis with mainstream multiagent deep reinforcement learning-based methods, as well as ablation studies on the framework and rewards, our scheme demonstrates superior learning efficiency and generalization ability. Adversarial experiments across different game scales, along with specialized performance tests, indicate that the defense group exhibits strong robustness and adaptive characteristics.
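The abstract does not spell out the constrained Apollonius model, so the following is only a minimal sketch of the classical, unconstrained Apollonius sphere that such models typically build on. It assumes straight-line motion at constant speed and a defender slower than the attacker, and it ignores terrain and AUV kinematic limits. The function names `apollonius_sphere` and `defender_arrives_first` are hypothetical and are not taken from the paper. The sphere is the set of points that attacker and defender reach at the same time; inside it, the defender arrives first.

```python
import math


def apollonius_sphere(attacker, defender, v_attacker, v_defender):
    """Classical Apollonius sphere for one attacker and one slower defender.

    Points P on the sphere satisfy |P - D| / v_defender == |P - A| / v_attacker.
    Illustrative sketch only: no terrain, no turning or pitch constraints.
    """
    if not 0.0 < v_defender < v_attacker:
        raise ValueError("sketch assumes 0 < v_defender < v_attacker")
    k = v_defender / v_attacker          # speed ratio, 0 < k < 1
    denom = 1.0 - k * k
    # Center lies on the attacker-defender line, beyond the defender.
    center = tuple((d - k * k * a) / denom for a, d in zip(attacker, defender))
    radius = k * math.dist(attacker, defender) / denom
    return center, radius


def defender_arrives_first(point, attacker, defender, v_attacker, v_defender):
    """True if the defender reaches `point` strictly before the attacker."""
    t_defender = math.dist(point, defender) / v_defender
    t_attacker = math.dist(point, attacker) / v_attacker
    return t_defender < t_attacker


if __name__ == "__main__":
    # Attacker at the origin moving at 2 m/s, defender 1 m away moving at 1 m/s.
    center, radius = apollonius_sphere((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 2.0, 1.0)
    print(center, radius)
```

With the attacker at the origin, the defender at (1, 0, 0), and a speed ratio of 0.5, the sphere is centered at (4/3, 0, 0) with radius 2/3. A constrained variant would presumably shrink or reshape this region to respect 3-D motion limits, but the abstract gives no details on how.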
Source journal
IEEE Internet of Things Journal (Computer Science: Information Systems)
CiteScore: 17.60
Self-citation rate: 13.20%
Articles published: 1982
About the journal: The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impacts on sensor technologies, big data management, and future Internet design for applications like smart cities and smart homes. Fields of interest include IoT architecture, such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration, such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds, such as IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials and experiments; and IoT standardization activities and technology development in different standards development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.