{"title":"Learning-driven hybrid scaling for multi-type services in cloud","authors":"Haitao Zhang, Tongyu Guo, Wei Tian, Huadong Ma","doi":"10.1016/j.jpdc.2024.104880","DOIUrl":null,"url":null,"abstract":"<div><p>In order to deal with the fast changing requirements of container based services in clouds, auto-scaling is used as an essential mechanism for adapting the number of provisioned resources with the variable service workloads. However, the latest auto-scaling approaches lack the comprehensive consideration of variable workloads and hybrid auto-scaling for multi-type services. Firstly, the historical data based proactive approaches are widely used to handle complex and variable workloads in advance. The decision-making accuracy of proactive approaches depends on the prediction algorithm, which is affected by the anomalies, missing values and errors in the historical workload data, and the unexpected workload cannot be handled. Secondly, the trigger based reactive approaches are seriously affected by workload fluctuation which causes the frequent invalid scaling of service resources. Besides, due to the existence of scaling time, there are different completion delays of different scaling actions. Thirdly, the latest approaches also ignore the different scaling time of hybrid scaling for multi-type services including stateful services and stateless services. Especially, when the stateful services are scaled horizontally, the neglected long scaling time causes the untimely supply and withdrawal of resources. Consequently, all three issues above can lead to the degradation of Quality of Services (QoS) and the inefficient utilization of resources. This paper proposes a new hybrid auto-scaling approach for multi-type services to resolve the impact of service scaling time on decision making. We combine the proactive scaling strategy with the reactive anomaly detection and correction mechanism. For making a proactive decision, the ensemble learning model with the structure improved deep network is designed to predict the future workload. On the basis of the predicted results and the scaling time of different types of services, the auto-scaling decisions are made by a Deep Reinforcement Learning (DRL) model with heterogeneous action space, which integrates horizontal and vertical scaling actions. Meanwhile, with the anomaly detection and correction mechanism, the workload fluctuation and unexpected workload can be detected and handled. We evaluate our approach against three different proactive and reactive auto-scaling approaches in the cloud environment, and the experimental results show the proposed approach can achieve the better scaling behavior compared to state-of-the-art approaches.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":"189 ","pages":"Article 104880"},"PeriodicalIF":3.4000,"publicationDate":"2024-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Parallel and Distributed Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0743731524000443","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
In order to deal with the fast changing requirements of container based services in clouds, auto-scaling is used as an essential mechanism for adapting the number of provisioned resources with the variable service workloads. However, the latest auto-scaling approaches lack the comprehensive consideration of variable workloads and hybrid auto-scaling for multi-type services. Firstly, the historical data based proactive approaches are widely used to handle complex and variable workloads in advance. The decision-making accuracy of proactive approaches depends on the prediction algorithm, which is affected by the anomalies, missing values and errors in the historical workload data, and the unexpected workload cannot be handled. Secondly, the trigger based reactive approaches are seriously affected by workload fluctuation which causes the frequent invalid scaling of service resources. Besides, due to the existence of scaling time, there are different completion delays of different scaling actions. Thirdly, the latest approaches also ignore the different scaling time of hybrid scaling for multi-type services including stateful services and stateless services. Especially, when the stateful services are scaled horizontally, the neglected long scaling time causes the untimely supply and withdrawal of resources. Consequently, all three issues above can lead to the degradation of Quality of Services (QoS) and the inefficient utilization of resources. This paper proposes a new hybrid auto-scaling approach for multi-type services to resolve the impact of service scaling time on decision making. We combine the proactive scaling strategy with the reactive anomaly detection and correction mechanism. For making a proactive decision, the ensemble learning model with the structure improved deep network is designed to predict the future workload. On the basis of the predicted results and the scaling time of different types of services, the auto-scaling decisions are made by a Deep Reinforcement Learning (DRL) model with heterogeneous action space, which integrates horizontal and vertical scaling actions. Meanwhile, with the anomaly detection and correction mechanism, the workload fluctuation and unexpected workload can be detected and handled. We evaluate our approach against three different proactive and reactive auto-scaling approaches in the cloud environment, and the experimental results show the proposed approach can achieve the better scaling behavior compared to state-of-the-art approaches.
期刊介绍:
This international journal is directed to researchers, engineers, educators, managers, programmers, and users of computers who have particular interests in parallel processing and/or distributed computing.
The Journal of Parallel and Distributed Computing publishes original research papers and timely review articles on the theory, design, evaluation, and use of parallel and/or distributed computing systems. The journal also features special issues on these topics; again covering the full range from the design to the use of our targeted systems.