A Service-Aware Autoscaling Strategy for Container Orchestration Platforms with Soft Resource Isolation
F. Tonini, C. Natalino, D. Temesgene, Z. Ghebretensae, L. Wosinska, P. Monti
DOI: 10.1109/EuCNC/6GSummit58263.2023.10188268
Published: 2023-06-06, pp. 454-459
Citations: 1
Abstract
Container orchestration platforms such as Kubernetes (K8s) simplify the deployment and management of cloud-native services. When deploying a service, providers must request an appropriate amount of resources from K8s so that the desired Quality of Service (QoS) for their users is maintained. To cope with varying user traffic, they can rely on the K8s Horizontal Pod Autoscaling (HPA) mechanism. To ensure that enough resources are available when needed, the standard HPA mechanism relies on resource overprovisioning. The required QoS is thus achieved most or all of the time, but at the expense of additional resources that are allocated (and charged for) yet may stay idle for significant periods. The soft resource isolation of K8s offers a way to reduce overprovisioning: it lets services compensate for a temporary lack of resources with shared resources, i.e., idle resources of the machines on which the services run. During traffic spikes, however, these idle resources may not be enough to serve the whole demand, which degrades the QoS. The HPA, which is unaware of how much demand went unserved, cannot always estimate the required additional resources correctly, further degrading the QoS. To overcome this, service providers must fall back on overprovisioning, which limits the use of shared resources. In this paper, we propose a novel mechanism for autoscaling resources in K8s that relies on service-related data to avoid the additional degradation introduced by the HPA. The proposed strategy also offers a way to tune overprovisioning and the use of shared resources. Simulation results show that our approach can reduce idle resources by up to 60% compared with the HPA mechanism.
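The underestimation described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' mechanism. It combines the documented K8s HPA scaling rule (desired replicas = ceil(current replicas × current metric / target metric), ignoring the HPA tolerance band and stabilization window) with a simplified saturation model. The function names, the saturation model, and all numbers are assumptions made for illustration only.

```python
import math


def hpa_desired_replicas(current_replicas, current_utilization, target_utilization):
    """Documented K8s HPA rule: ceil(currentReplicas * currentMetric / targetMetric).

    Tolerance and stabilization behaviour are deliberately left out of this sketch.
    """
    return math.ceil(current_replicas * current_utilization / target_utilization)


def observed_utilization(demand, replicas, request_per_pod, shared_per_pod):
    """Utilization relative to requested resources (hypothetical model).

    Each pod can use its own request plus some idle shared resources.
    Demand beyond that capacity is simply not served, so the measured
    utilization saturates and hides the unserved part.
    """
    usable = replicas * (request_per_pod + shared_per_pod)
    used = min(demand, usable)
    return used / (replicas * request_per_pod)


def demand_aware_replicas(demand, request_per_pod, target_utilization):
    """Hypothetical service-aware sizing based on the offered demand itself."""
    return math.ceil(demand / (request_per_pod * target_utilization))


# Illustrative spike (all values assumed): 2 pods, 1.0 CPU requested per pod,
# 0.5 CPU of idle shared resources per pod, 80% target, 6.0 CPU of demand.
utilization = observed_utilization(6.0, 2, 1.0, 0.5)   # saturates at 1.5
hpa_estimate = hpa_desired_replicas(2, utilization, 0.8)
aware_estimate = demand_aware_replicas(6.0, 1.0, 0.8)
```

In this made-up scenario the utilization-based rule asks for 4 replicas while the offered demand would need 8, because the measured utilization cannot exceed what the pods can physically consume. A scaler that sees service-level data such as the offered load does not suffer from this saturation, which is the intuition behind using service-related data for autoscaling.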