DeepCQ+: Robust and Scalable Routing with Multi-Agent Deep Reinforcement Learning for Highly Dynamic Networks

MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM). Pub Date: 2021-11-29. DOI: 10.1109/MILCOM52596.2021.9652948

Saeed Kaviani, Bo Ryu, E. Ahmed, Kevin Larson, Anh-Ngoc Le, Alex Yahja, J. H. Kim


Citations: 6

Abstract



Highly dynamic mobile ad-hoc networks (MANETs) remain one of the most challenging environments in which to develop and deploy robust, efficient, and scalable routing protocols. In this paper, we present the DeepCQ+ routing protocol, which integrates emerging multi-agent deep reinforcement learning (MADRL) techniques into existing Q-learning-based routing protocols and their variants in a novel manner, and achieves persistently higher performance across a wide range of topology and mobility configurations. While keeping the overall structure of the Q-learning-based routing protocols, DeepCQ+ replaces statically configured parameterized thresholds and hand-written rules with carefully designed MADRL agents, so that no a priori configuration of such parameters is required. Extensive simulation shows that DeepCQ+ yields significantly higher end-to-end throughput with lower overhead and no apparent degradation in end-to-end delay (hop count) compared to its Q-learning-based counterparts. Qualitatively, and perhaps more significantly, DeepCQ+ maintains remarkably similar performance gains in many scenarios that it was not trained for, in terms of network size, mobility conditions, and traffic dynamics. To the best of our knowledge, this is the first successful application of the MADRL framework to the MANET routing problem that demonstrates a high degree of scalability and robustness even in environments outside the trained range of scenarios. This implies that our MADRL-based DeepCQ+ design significantly improves on the performance of the Q-learning-based CQ+ baseline and increases its practicality and explainability, because real-world MANET environments will likely vary outside the trained range of scenarios. Additional techniques to further increase the gains in performance and scalability are discussed.
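For readers unfamiliar with the Q-learning-based routing family that the abstract builds on, the following is a minimal sketch of a classic Q-routing table update. It is not the CQ+ or DeepCQ+ algorithm, whose details the abstract does not give. The class name `QRouter`, its methods, and the fixed learning rate `alpha` are all hypothetical names chosen for illustration; the fixed `alpha` is meant only as an example of the general kind of statically configured parameter the abstract mentions, and the abstract does not say which parameters DeepCQ+ actually replaces.

```python
from collections import defaultdict


class QRouter:
    """Hypothetical per-node table of estimated delivery times (classic Q-routing style)."""

    def __init__(self, neighbors, alpha=0.5):
        self.neighbors = list(neighbors)
        self.alpha = alpha  # fixed learning rate: an illustrative, hand-set parameter
        # q[dest][neighbor] = estimated time to deliver to dest when forwarding via neighbor
        self.q = defaultdict(lambda: {n: 0.0 for n in self.neighbors})

    def best_estimate(self, dest):
        """Lowest estimated delivery time to dest over all neighbors."""
        return min(self.q[dest].values())

    def next_hop(self, dest):
        """Neighbor with the lowest estimated delivery time to dest."""
        row = self.q[dest]
        return min(row, key=row.get)

    def update(self, dest, neighbor, hop_delay, neighbor_estimate):
        """Move the estimate toward hop_delay plus the neighbor's own best estimate."""
        old = self.q[dest][neighbor]
        target = hop_delay + neighbor_estimate
        self.q[dest][neighbor] = old + self.alpha * (target - old)
        return self.q[dest][neighbor]


router = QRouter(["B", "C"], alpha=0.5)
router.update("D", "B", hop_delay=1.0, neighbor_estimate=2.0)
router.update("D", "C", hop_delay=1.0, neighbor_estimate=0.0)
print(router.next_hop("D"), router.best_estimate("D"))
```

In this sketch each node keeps its own table and learns only from feedback reported by the neighbor it forwarded to, which is what makes the scheme distributed; how DeepCQ+ layers MADRL agents on top of such a structure is described in the paper itself, not here.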
