Improving Multi-Band Elastic Optical Networks Performance using Behavior Induction on Deep Reinforcement Learning

Marcelo Gonzalez, Felipe Condon, P. Morales, N. Jara
DOI: 10.1109/LATINCOM56090.2022.10000531
Published in: 2022 IEEE Latin-American Conference on Communications (LATINCOM), 2022-11-30
Citations: 1

Abstract

Deep Reinforcement Learning (DRL) has shown considerable potential for enabling non-trivial solutions to resource allocation problems in optical networks. However, applying plain DRL does not guarantee better performance than the best currently known heuristic solutions. DRL demands a parameter-tuning process to improve its performance, and one tuning possibility is the design of the reward function. The reward function gives the agents feedback on whether the actions sent to the environment were successful. A transparent reward function simply returns whether the action succeeded, but a more elaborate reward function can induce desired behaviour and thereby improve DRL performance. Our work designs reward functions for multi-band elastic optical networks (MB-EON) to reduce the overall network blocking probability. A test environment was set up to analyze the performance of four reward functions at inducing a lower blocking probability. The proposed reward functions feed band usage, link compactness, spectrum availability and link fragmentation back to the agents as feedback information. The analysis was carried out using a DQN agent on the NSFNet network topology. Results show that reward function design reduces the blocking probability. The best-performing function uses the band availability criterion, decreasing the blocking probability by 22% on average compared to the baseline reward function, with a peak improvement of 63.67% for a 1000 Erlang traffic load scenario.
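The shaping idea described above — returning richer feedback than a bare success/failure signal so the agent is nudged toward allocations that leave spectrum free — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names and the exact shaping formula are assumptions.

```python
def transparent_reward(allocated: bool) -> float:
    """Baseline ("transparent") reward: only signals whether the
    connection request was successfully allocated."""
    return 1.0 if allocated else -1.0


def band_availability_reward(allocated: bool,
                             free_slots_in_band: int,
                             total_slots_in_band: int) -> float:
    """Shaped reward using a band-availability criterion: a successful
    allocation in a band with more remaining free spectrum earns a
    larger reward, inducing the agent to prefer choices that leave
    room for future connections and so lower blocking probability."""
    if not allocated:
        return -1.0
    # Fraction of the band still free after allocation, in [0, 1].
    availability = free_slots_in_band / total_slots_in_band
    return 1.0 + availability
```

A DQN agent trained against either function sees the same environment; only the feedback signal differs, which is what lets the reward design alone change the learned allocation behaviour.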