Multi-objective optimization for submarine optical cable route planning based on cross reinforcement learning

Journal of Optical Communications and Networking (IF 4; Zone 2, Computer Science; JCR Q1, Computer Science, Hardware & Architecture). Pub Date: 2024-09-25. DOI: 10.1364/JOCN.529175

Zanshan Zhao; Guanjun Gao; Weiming Gan; Jialiang Zhang; Zengfu Wang; Haoyu Wang; Yonggang Guo

Journal of Optical Communications and Networking, vol. 16, no. 10, pp. 1018-1033, published 2024-09-25. Full text: https://ieeexplore.ieee.org/document/10694705/

Citations: 0

Abstract

Submarine cables are crucial infrastructure for international communications, and their cost and survivability are two key factors that must be considered in the design phase. In this paper, we propose a machine-learning-assisted submarine cable route planning algorithm that minimizes the accumulated cost and risk of a route. The cost and risk distributions, together with the direction between the route’s starting point and endpoint, are used as prior data to initialize the state-action values of reinforcement learning (RL). We also propose a multi-agent cross reinforcement learning (MA-XRL) framework, composed of Q-learning and SARSA, to improve the global optimization capability of RL in multi-objective optimization. The results show that, compared to ant colony optimization (ACO), MA-XRL can reduce the accumulated cost by 26.87% at the same accumulated risk. The maximum accumulated cost among the Pareto solutions obtained by MA-XRL is lower than the minimum accumulated cost among those obtained by ACO. Meanwhile, the running time of MA-XRL is only 1.3‰ of that of ACO. Without cost and risk distribution initialization, the accumulated cost and risk of the best route obtained by MA-XRL are 1.84 times and 7.08 times, respectively, those obtained with this initialization. Direction initialization helps the agent find the endpoint of the route faster and doubles the search stability of MA-XRL. Compared to using Q-learning or SARSA alone, MA-XRL can reduce the accumulated risk by 71.81% and 39.51%, respectively, at the same accumulated cost, and can reduce the accumulated cost by 16.65% and 11.99%, respectively, at the same accumulated risk.
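The abstract names three building blocks without giving their details: prior-data initialization of the state-action values, the Q-learning and SARSA learners, and a Pareto set over accumulated cost and risk. The sketch below illustrates textbook versions of each on a 4-connected grid. It is not the authors' implementation: the grid model, the function names, the weights `w_cost`, `w_risk`, `w_dir`, and the form of the direction bonus are all assumptions made for illustration, and the "cross" coupling between the two agents is not described in the abstract, so it is not attempted here.

```python
import numpy as np

# Hypothetical 4-connected grid moves: up, down, left, right.
MOVES = np.array([(-1, 0), (1, 0), (0, -1), (0, 1)])


def init_q_from_prior(cost, risk, start, goal, w_cost=0.5, w_risk=0.5, w_dir=1.0):
    """Assumed prior initialization: penalize costly/risky next cells and
    reward moves aligned with the start-to-goal heading."""
    rows, cols = cost.shape
    Q = np.full((rows, cols, len(MOVES)), -np.inf)  # -inf marks off-grid moves
    heading = np.subtract(goal, start).astype(float)
    norm = np.linalg.norm(heading)
    if norm > 0:
        heading /= norm
    for r in range(rows):
        for c in range(cols):
            for a, (dr, dc) in enumerate(MOVES):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    penalty = w_cost * cost[nr, nc] + w_risk * risk[nr, nc]
                    alignment = dr * heading[0] + dc * heading[1]
                    Q[r, c, a] = -penalty + w_dir * alignment
    return Q


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r + gamma * np.max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


def pareto_front(points):
    """Return the non-dominated (cost, risk) pairs, both objectives minimized."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front
```

The only difference between the two update rules is the bootstrap term: Q-learning uses the maximum value in the next state, while SARSA uses the value of the action the agent actually takes next. In this sketch a state `s` can be an integer row index of a 2-D table or a `(row, col)` tuple into the 3-D table returned by `init_q_from_prior`.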




Source journal: Journal of Optical Communications and Networking (Engineering & Technology: Telecommunications)

CiteScore: 9.40

Self-citation rate: 16.00%

Articles published: 104

Review time: 4 months

About the journal: The scope of the Journal includes advances in the state-of-the-art of optical networking science, technology, and engineering. Both theoretical contributions (including new techniques, concepts, analyses, and economic studies) and practical contributions (including optical networking experiments, prototypes, and new applications) are encouraged. Subareas of interest include the architecture and design of optical networks, optical network survivability and security, software-defined optical networking, elastic optical networks, data and control plane advances, network management related innovation, and optical access networks. Enabling technologies and their applications are suitable topics only if the results are shown to directly impact optical networking beyond simple point-to-point networks.