Robust and energy-efficient RPL optimization algorithm with scalable deep reinforcement learning for IIoT

IF 4.4 · Zone 2 (Computer Science) · JCR Q1 (Computer Science, Hardware & Architecture) · Computer Networks · Pub Date: 2024-11-10 · DOI: 10.1016/j.comnet.2024.110894

Ying Wang, Yuanyuan Li, Jianjun Lei, Fengjun Shang

Computer Networks, vol. 255, Article 110894 (Journal Article). Publisher page: https://www.sciencedirect.com/science/article/pii/S1389128624007266

Citations: 0

Abstract

The growing scale and complexity of the Industrial Internet of Things (IIoT) pose new challenges to the traditional routing protocol for low-power and lossy networks (RPL) in dynamic management, data transmission reliability, and energy efficiency. This paper proposes MADC, a scalable deep reinforcement learning (DRL) algorithm with a multi-attention actor double critic model for routing optimization, to meet IIoT requirements for efficient and intelligent routing decisions while improving data transmission reliability and energy efficiency. Specifically, MADC adopts the centralized training and decentralized execution (CTDE) paradigm to decouple model training from inference, which reduces the difficulty and computational cost of learning and improves training efficiency. In addition, MADC includes a lightweight actor network based on a multi-scale convolutional attention mechanism, which gives resource-constrained nodes intelligent, real-time decision-making at low computational and storage cost. Moreover, a scalable critic network built on multiple attention mechanisms is proposed. It suits dynamic network environments and evaluates local observation states more comprehensively and accurately, providing more accurate and efficient guidance for model optimization. Furthermore, MADC uses a double critic architecture to mitigate potential overestimation during training, improving the model's robustness and reliability. Simulation results show that MADC outperforms existing RPL optimization algorithms in energy efficiency, data transmission reliability, and adaptability.
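The abstract does not say how the two critics are combined. One common way a double critic limits overestimation is to bootstrap from the smaller of the two value estimates, as in clipped double Q-learning. The sketch below illustrates that idea only; the function name, the toy critics, and the discount value are hypothetical and are not taken from the paper.

```python
from typing import Callable, Sequence

# A critic maps (state, action) to a scalar value estimate.
Critic = Callable[[Sequence[float], int], float]


def double_critic_target(
    reward: float,
    next_state: Sequence[float],
    next_action: int,
    critic_a: Critic,
    critic_b: Critic,
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Temporal-difference target that bootstraps from the smaller estimate.

    Assumption: the minimum of two critics is used (clipped double
    Q-learning). The paper may combine its critics differently.
    """
    if done:
        # No bootstrapping past a terminal transition.
        return reward
    q_a = critic_a(next_state, next_action)
    q_b = critic_b(next_state, next_action)
    return reward + gamma * min(q_a, q_b)


# Toy critics standing in for trained networks (hypothetical).
def optimistic_critic(state: Sequence[float], action: int) -> float:
    return sum(state) + action + 1.0  # deliberately overestimates by 1.0


def cautious_critic(state: Sequence[float], action: int) -> float:
    return sum(state) + action


target = double_critic_target(
    reward=1.0,
    next_state=[0.5, 0.5],
    next_action=1,
    critic_a=optimistic_critic,
    critic_b=cautious_critic,
    gamma=0.5,
)
print(target)  # 2.0, versus 2.5 if only the optimistic critic were used
```

Taking the minimum biases the target downward, so a single critic's optimistic error is not propagated through successive updates.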




Source journal: Computer Networks (Engineering & Technology: Telecommunications)

CiteScore: 10.80

Self-citation rate: 3.60%

Articles published: 434

Review time: 8.6 months

About the journal: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.