Research on Routing Optimization of SDN Network Using Reinforcement Learning Method

2019 2nd International Conference on Safety Produce Informatization (IICSPI) | Pub Date: 2019-11-01 | DOI: 10.1109/IICSPI48186.2019.9095940

Zhengwu Yuan, Peng Zhou, Shanshan Wang, Xiaojian Zhang


Citations: 6



Research on Routing Optimization of SDN Network Using Reinforcement Learning Method

Computer networks are becoming increasingly complex and dynamic, and efficient packet routing in Software-Defined Networking (SDN) has become an active research area. SARSA-Learning is a typical reinforcement learning algorithm. By exploring and learning the network environment on-policy, it can derive the optimal decision in an unknown network environment, so that network data routing and forwarding can be completed effectively. This paper proposes a SARSA-Learning routing algorithm with a variable greedy function (Variable $\varepsilon$-Greedy function within SARSA-Learning Routing, V-S Routing). The V-S Routing algorithm preserves the efficiency of the SARSA-Learning framework and introduces a variable factor into the $\varepsilon$-greedy function. This factor can be calculated dynamically to represent the priority of the current state in the SDN network, and it is used to solve the optimal route selection problem, which avoids long packet waiting queues, reduces SDN network congestion, and improves link transmission speed. The variable $\varepsilon$-greedy function makes the algorithm better suited to the network environment and gives the V-S Routing algorithm better generalization ability. The experimental results verify the effectiveness of the algorithm.
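To make the idea concrete, here is a minimal sketch of SARSA-based next-hop selection with a state-dependent exploration rate. The abstract does not give the form of the variable factor, so the schedule in `variable_epsilon` (exploration shrinking with the number of visits to a node) is purely an assumption for illustration, as are the five-node topology, the link delays, the negative-delay reward, and every function name and hyperparameter. It should not be read as the authors' V-S Routing algorithm.

```python
import random

# Hypothetical topology: node -> {neighbor: link delay}. Not taken from the paper.
GRAPH = {
    0: {1: 1.0, 2: 4.0},
    1: {0: 1.0, 2: 1.0, 3: 5.0},
    2: {0: 4.0, 1: 1.0, 3: 1.0},
    3: {1: 5.0, 2: 1.0, 4: 1.0},
    4: {3: 1.0},
}
DEST = 4  # hypothetical destination node


def variable_epsilon(visits, eps_max=0.9, eps_min=0.05):
    """Assumed schedule: explore less at nodes that have been visited more often."""
    return max(eps_min, eps_max / (1.0 + visits))


def choose_next_hop(q, node, visits, rng):
    """Epsilon-greedy next-hop choice using the state-dependent epsilon."""
    neighbors = list(GRAPH[node])
    if rng.random() < variable_epsilon(visits[node]):
        return rng.choice(neighbors)
    return max(neighbors, key=lambda n: q[(node, n)])


def train(episodes=2000, alpha=0.1, gamma=0.9, max_hops=50, seed=0):
    """Tabular SARSA; the state is the current node, the action is the next hop."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in GRAPH for a in GRAPH[s]}
    visits = {s: 0 for s in GRAPH}
    for _ in range(episodes):
        state = rng.choice([s for s in GRAPH if s != DEST])
        action = choose_next_hop(q, state, visits, rng)
        for _ in range(max_hops):
            visits[state] += 1
            next_state = action
            reward = -GRAPH[state][action]  # assumed reward: negative link delay
            if next_state == DEST:
                # Terminal transition: no bootstrapped term.
                q[(state, action)] += alpha * (reward - q[(state, action)])
                break
            next_action = choose_next_hop(q, next_state, visits, rng)
            # On-policy update: bootstrap from the action actually chosen next.
            target = reward + gamma * q[(next_state, next_action)]
            q[(state, action)] += alpha * (target - q[(state, action)])
            state, action = next_state, next_action
    return q


def greedy_route(q, source, max_hops=10):
    """Follow the highest-valued next hop from source until DEST or the hop limit."""
    route = [source]
    node = source
    while node != DEST and len(route) <= max_hops:
        node = max(GRAPH[node], key=lambda n: q[(node, n)])
        route.append(node)
    return route


if __name__ == "__main__":
    q_table = train()
    print(greedy_route(q_table, source=0))
```

The on-policy character shows up in the update target, which uses the value of the next hop the policy actually selects rather than the best available one. Replacing the visit-count schedule with a factor derived from queue length or link load would be one way to tie exploration to congestion, but that too would be an assumption beyond what the abstract states.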

