Multi-Agent Reinforcement Learning With Contribution-Based Assignment Online Routing In SDN

2022 19th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). Pub Date: 2022-12-16. DOI: 10.1109/ICCWAMTIP56608.2022.10016566

Xiaofeng Yue, Wu Lijun, Weiwei Duan


Citations: 0

Abstract

Emerging applications place stringent quality of service (QoS) requirements on the Internet, and networks must guarantee different QoS levels for the data flows of different Internet services. Advances in traffic classification, software-defined networking (SDN), and programmable network devices make it possible to identify user requirements quickly and to control routing at a fine granularity. In this paper, we propose CBR, an online routing algorithm based on multi-agent deep reinforcement learning. CBR uses a graph convolutional network (GCN) to extract topology features, designs a separate reward function for each type of traffic demand so that it learns an appropriate routing policy for each, and organizes agents to generate routes hop by hop. Because the agents share a reward value, it is hard to tell whether an individual agent's action was critical; CBR addresses this with a new baseline that indicates each agent's contribution level. To speed up training, we use pre-training on shortest-path rules to obtain initial parameters; to ensure reliability, we introduce a routing alternative mechanism that safeguards online routing. We conducted Mininet-based experiments on the Abilene and GEANT network topologies. The results show that CBR simultaneously meets the demands of different service types for their requested traffic and remains reliable under link failures.
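
The hop-by-hop route generation and the routing alternative mechanism mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how CBR's agents or its alternative mechanism are implemented, so everything below is an assumption made for illustration: each node's agent is stood in for by a fixed next-hop table (a real agent would be a learned policy), the alternative route is taken to be a Dijkstra shortest path, and the names shortest_path, route_hop_by_hop, and table_policy are hypothetical.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra over graph = {node: {neighbor: cost}}; returns a node list or None."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def route_hop_by_hop(graph, policies, src, dst):
    """Ask the agent at each node for the next hop.

    Returns (path, used_fallback). If an agent proposes a hop that is not a
    live neighbor, or one that would create a loop, the partial route is
    discarded and the shortest path is used instead (assumed fallback).
    """
    path, visited = [src], {src}
    while path[-1] != dst:
        node = path[-1]
        nxt = policies[node](node, dst)
        if nxt not in graph[node] or nxt in visited:
            return shortest_path(graph, src, dst), True
        path.append(nxt)
        visited.add(nxt)
    return path, False


def table_policy(table):
    """Hypothetical stand-in for a learned agent: a fixed next-hop table."""
    return lambda node, dst: table[node]


# Toy four-node topology with unit link costs (not Abilene or GEANT).
graph = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "D": 1},
    "D": {"B": 1, "C": 1},
}
next_hop = {"A": "B", "B": "D", "C": "D", "D": "D"}
policies = {n: table_policy(next_hop) for n in graph}

print(route_hop_by_hop(graph, policies, "A", "D"))
```

On this toy topology the agents' choices already form a valid route, so the fallback is not used. Removing the A-B link from the graph while leaving the policies unchanged makes the first hop invalid and triggers the shortest-path fallback, which is the kind of link-failure case the abstract says the mechanism is meant to cover.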



