Robust and energy-efficient RPL optimization algorithm with scalable deep reinforcement learning for IIoT

IF 4.4 · Zone 2 (Computer Science) · JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Computer Networks · Pub Date: 2024-11-10 · DOI: 10.1016/j.comnet.2024.110894
Ying Wang, Yuanyuan Li, Jianjun Lei, Fengjun Shang
Computer Networks, Volume 255, Article 110894. Journal Article. https://www.sciencedirect.com/science/article/pii/S1389128624007266
Citations: 0

Abstract

The growing scale and complexity of the Industrial Internet of Things (IIoT) pose new challenges for the traditional routing protocol for low-power and lossy networks (RPL) in dynamic management, data transmission reliability, and energy efficiency. This paper proposes MADC, a scalable deep reinforcement learning (DRL) algorithm for routing optimization built on a multi-attention actor double critic model, to meet the IIoT's need for efficient and intelligent routing decisions while improving data transmission reliability and energy efficiency. Specifically, MADC adopts the centralized training and decentralized execution (CTDE) paradigm to decouple the model's training from its inference, which reduces the difficulty and computational cost of learning and improves training efficiency. MADC also includes a lightweight actor network based on a multi-scale convolutional attention mechanism, which gives resource-constrained nodes intelligent, real-time decision-making at low computational and storage cost. In addition, a scalable critic network built on multiple attention mechanisms is proposed; it adapts to dynamic network environments and evaluates local observation states more comprehensively and accurately, providing more accurate and efficient guidance for model optimization. Furthermore, MADC uses a double critic network architecture to mitigate potential overestimation during training, thereby ensuring the model's robustness and reliability. Simulation results show that MADC outperforms existing RPL optimization algorithms in energy efficiency, data transmission reliability, and adaptability.
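
The abstract does not say how the two critics are combined. One common way a double critic can limit overestimation is the clipped double-Q target, which bootstraps from the smaller of the two critic estimates. The Python sketch below illustrates that general technique only; the function names, the discount factor `gamma`, and the sample values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def double_critic_target(reward: float, q1_next: float, q2_next: float,
                         gamma: float = 0.99, done: bool = False) -> float:
    """Return a TD target that bootstraps from the smaller critic estimate.

    Assumption: clipped double-Q learning, not necessarily MADC's own rule.
    """
    # Taking the minimum keeps one critic's optimistic error out of the target.
    bootstrap = 0.0 if done else gamma * min(q1_next, q2_next)
    return reward + bootstrap


def batch_targets(rewards: Sequence[float], q1_next: Sequence[float],
                  q2_next: Sequence[float], dones: Sequence[bool],
                  gamma: float = 0.99) -> List[float]:
    """Apply double_critic_target to each transition in a batch."""
    return [double_critic_target(r, a, b, gamma, d)
            for r, a, b, d in zip(rewards, q1_next, q2_next, dones)]


if __name__ == "__main__":
    # Hypothetical transitions: the second one ends its episode.
    print(batch_targets([1.0, 2.0], [5.0, 1.0], [3.0, 4.0],
                        [False, True], gamma=0.5))
```

Both critics would then be regressed toward the same target, so a single critic's overly optimistic estimate cannot inflate the value that the actor is trained against.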
Source journal
Computer Networks (Engineering & Technology: Telecommunications)
CiteScore: 10.80
Self-citation rate: 3.60%
Articles published: 434
Review time: 8.6 months
About the journal: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.