Joel Reijonen, M. Opsenica, T. Kauppinen, M. Komu, Jimmy Kjällman, Tomas Mecklin, Eero Hiltunen, J. Arkko, Timo Simanainen, M. Elmusrati
Benchmarking Q-Learning Methods for Intelligent Network Orchestration in the Edge
DOI: 10.1109/6GSUMMIT49458.2020.9083745
Published in: 2020 2nd 6G Wireless Summit (6G SUMMIT), March 2020
Citations: 2
Abstract
We benchmark Q-learning methods with various action selection strategies for intelligent orchestration of the network edge. Q-learning is a reinforcement learning technique that aims to find optimal action policies by exploiting past experience, without using a model of the environment's dynamics. By experience, we refer to the observed causality between an action and its impact on the environment. In this paper, the environment for Q-learning consists of virtualized networking resources whose dynamics are monitored with Spindump, an in-network latency measurement tool with support for QUIC and TCP. We optimize the orchestration of these networking resources by introducing Q-learning as part of machine-learning-driven, intelligent orchestration applicable at the edge. Based on the benchmarking results, we identify which action selection strategies support network orchestration that provides low latency and low packet loss by considering network resource allocation at the edge.
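To make the abstract's terms concrete, the sketch below shows tabular Q-learning with an ε-greedy action selection strategy (one of the strategy families such a benchmark would compare). The environment here is a hypothetical toy resource-allocation loop invented for illustration, not the paper's Spindump-monitored virtualized network; the reward simply penalizes a modeled "load" standing in for latency.

```python
import random

# Hypothetical toy environment (illustration only, not the paper's setup):
# states 0..3 are resource-load levels; action 0 scales down, action 1
# scales up. Reward penalizes load, a crude stand-in for measured latency.
N_STATES, N_ACTIONS = 4, 2

def step(state, action):
    """Apply an action, return (next_state, reward)."""
    next_state = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    reward = -float(next_state)  # lower modeled "load" -> better reward
    return next_state, reward

def epsilon_greedy(q, state, epsilon):
    """Explore with probability epsilon, otherwise pick the greedy action."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: q[state][a])

def train(episodes=200, steps=10, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Model-free Q-learning: update Q from observed (s, a, r, s') tuples."""
    random.seed(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = random.randrange(N_STATES)
        for _ in range(steps):
            action = epsilon_greedy(q, state, epsilon)
            next_state, reward = step(state, action)
            # Standard Q-update: no model of the environment's dynamics is
            # used, only the experienced transition and reward.
            q[state][action] += alpha * (
                reward + gamma * max(q[next_state]) - q[state][action]
            )
            state = next_state
    return q

q = train()
```

Swapping `epsilon_greedy` for another selection rule (e.g. softmax over Q-values) while keeping the update unchanged is exactly the kind of comparison the benchmarking in this paper performs, with latency and packet loss taking the place of the toy reward.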