Horizontal Auto-Scaling for Multi-Access Edge Computing Using Safe Reinforcement Learning

Kaustabha Ray, A. Banerjee
{"title":"Horizontal Auto-Scaling for Multi-Access Edge Computing Using Safe Reinforcement Learning","authors":"Kaustabha Ray, A. Banerjee","doi":"10.1145/3475991","DOIUrl":null,"url":null,"abstract":"Multi-Access Edge Computing (MEC) has emerged as a promising new paradigm allowing low latency access to services deployed on edge servers to avert network latencies often encountered in accessing cloud services. A key component of the MEC environment is an auto-scaling policy which is used to decide the overall management and scaling of container instances corresponding to individual services deployed on MEC servers to cater to traffic fluctuations. In this work, we propose a Safe Reinforcement Learning (RL)-based auto-scaling policy agent that can efficiently adapt to traffic variations to ensure adherence to service specific latency requirements. We model the MEC environment using a Markov Decision Process (MDP). We demonstrate how latency requirements can be formally expressed in Linear Temporal Logic (LTL). The LTL specification acts as a guide to the policy agent to automatically learn auto-scaling decisions that maximize the probability of satisfying the LTL formula. We introduce a quantitative reward mechanism based on the LTL formula to tailor service specific latency requirements. We prove that our reward mechanism ensures convergence of standard Safe-RL approaches. We present experimental results in practical scenarios on a test-bed setup with real-world benchmark applications to show the effectiveness of our approach in comparison to other state-of-the-art methods in literature. Furthermore, we perform extensive simulated experiments to demonstrate the effectiveness of our approach in large scale scenarios.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Trans. Embed. Comput. Syst.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3475991","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

Multi-Access Edge Computing (MEC) has emerged as a promising new paradigm allowing low-latency access to services deployed on edge servers to avert the network latencies often encountered in accessing cloud services. A key component of the MEC environment is an auto-scaling policy, which decides the overall management and scaling of container instances corresponding to individual services deployed on MEC servers to cater to traffic fluctuations. In this work, we propose a Safe Reinforcement Learning (RL)-based auto-scaling policy agent that can efficiently adapt to traffic variations to ensure adherence to service-specific latency requirements. We model the MEC environment using a Markov Decision Process (MDP). We demonstrate how latency requirements can be formally expressed in Linear Temporal Logic (LTL). The LTL specification acts as a guide to the policy agent to automatically learn auto-scaling decisions that maximize the probability of satisfying the LTL formula. We introduce a quantitative reward mechanism based on the LTL formula to tailor service-specific latency requirements. We prove that our reward mechanism ensures convergence of standard Safe-RL approaches. We present experimental results in practical scenarios on a test-bed setup with real-world benchmark applications to show the effectiveness of our approach in comparison to other state-of-the-art methods in the literature. Furthermore, we perform extensive simulated experiments to demonstrate the effectiveness of our approach in large-scale scenarios.
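To make the idea concrete, the sketch below is a minimal illustration (not the authors' implementation) of an RL auto-scaler whose reward encodes a simple LTL-style latency requirement, roughly G(latency <= L_MAX), i.e., "the observed service latency always stays below a bound." The replica limit, latency model, load process, and all constants are assumptions introduced only for illustration; the paper's actual MDP, LTL specification, and Safe-RL algorithm are defined in the full text.

```python
# Hypothetical sketch: tabular Q-learning auto-scaler with an LTL-style
# latency-bound reward. All names and constants are illustrative.
import random

N_MAX = 10             # maximum container replicas (assumed)
L_MAX = 100.0          # latency bound in ms from a hypothetical SLA
ACTIONS = [-1, 0, +1]  # scale down, hold, scale up

def observed_latency(replicas, load):
    """Toy latency model: latency grows with per-replica load."""
    return 20.0 + 60.0 * load / max(replicas, 1)

def reward(latency, replicas):
    """Reward the LTL bound (latency <= L_MAX) holding, penalize violations,
    and charge a small cost per replica to discourage over-provisioning."""
    return (1.0 if latency <= L_MAX else -5.0) - 0.05 * replicas

q = {}  # Q-table: (replicas, load, action_index) -> value
alpha, gamma, epsilon = 0.1, 0.95, 0.1

def choose(state):
    """Epsilon-greedy action selection over the Q-table."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q.get((*state, a), 0.0))

replicas, load = 1, 4
for step in range(10_000):
    state = (replicas, load)
    a = choose(state)
    replicas = min(max(replicas + ACTIONS[a], 1), N_MAX)
    lat = observed_latency(replicas, load)
    r = reward(lat, replicas)
    load = random.randint(1, 8)  # next step's request load (toy process)
    next_state = (replicas, load)
    best_next = max(q.get((*next_state, b), 0.0) for b in range(len(ACTIONS)))
    key = (*state, a)
    q[key] = q.get(key, 0.0) + alpha * (r + gamma * best_next - q.get(key, 0.0))
```

The sketch only illustrates how a latency bound can be folded into the reward signal; the paper's contribution additionally includes a quantitative reward derived from the LTL formula itself and a convergence proof for standard Safe-RL approaches under that reward.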