Raseena M. Haris, Mahmoud Barhamgi, Armstrong Nhlabatsi, Khaled M. Khan
{"title":"利用基于机器学习的预测模型优化云计算中的预复制实时虚拟机迁移","authors":"Raseena M. Haris, Mahmoud Barhamgi, Armstrong Nhlabatsi, Khaled M. Khan","doi":"10.1007/s00607-024-01318-6","DOIUrl":null,"url":null,"abstract":"<p>One of the preconditions for efficient cloud computing services is the continuous availability of services to clients. However, there are various reasons for temporary service unavailability due to routine maintenance, load balancing, cyber-attacks, power management, fault tolerance, emergency incident response, and resource usage. Live Virtual Machine Migration (LVM) is an option to address service unavailability by moving virtual machines between hosts without disrupting running services. Pre-copy memory migration is a common LVM approach used in cloud systems, but it faces challenges due to the high rate of frequently updated memory pages known as dirty pages. Transferring these dirty pages during pre-copy migration prolongs the overall migration time. If there are large numbers of remaining memory pages after a predefined iteration of page transfer, the stop-and-copy phase is initiated, which significantly increases downtime and negatively impacts service availability. To mitigate this issue, we introduce a prediction-based approach that optimizes the migration process by dynamically halting the iteration phase when the predicted downtime falls below a predefined threshold. Our proposed machine learning method was rigorously evaluated through experiments conducted on a dedicated testbed using KVM/QEMU technology, involving different VM sizes and memory-intensive workloads. A comparative analysis against proposed pre-copy methods and default migration approach reveals a remarkable improvement, with an average 64.91% reduction in downtime for different RAM configurations in high-write-intensive workloads, along with an average reduction in total migration time of approximately 85.81%. These findings underscore the practical advantages of our method in reducing service disruptions during live virtual machine migration in cloud systems.</p>","PeriodicalId":10718,"journal":{"name":"Computing","volume":"40 1","pages":""},"PeriodicalIF":3.3000,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Optimizing pre-copy live virtual machine migration in cloud computing using machine learning-based prediction model\",\"authors\":\"Raseena M. Haris, Mahmoud Barhamgi, Armstrong Nhlabatsi, Khaled M. Khan\",\"doi\":\"10.1007/s00607-024-01318-6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>One of the preconditions for efficient cloud computing services is the continuous availability of services to clients. However, there are various reasons for temporary service unavailability due to routine maintenance, load balancing, cyber-attacks, power management, fault tolerance, emergency incident response, and resource usage. Live Virtual Machine Migration (LVM) is an option to address service unavailability by moving virtual machines between hosts without disrupting running services. Pre-copy memory migration is a common LVM approach used in cloud systems, but it faces challenges due to the high rate of frequently updated memory pages known as dirty pages. Transferring these dirty pages during pre-copy migration prolongs the overall migration time. If there are large numbers of remaining memory pages after a predefined iteration of page transfer, the stop-and-copy phase is initiated, which significantly increases downtime and negatively impacts service availability. To mitigate this issue, we introduce a prediction-based approach that optimizes the migration process by dynamically halting the iteration phase when the predicted downtime falls below a predefined threshold. Our proposed machine learning method was rigorously evaluated through experiments conducted on a dedicated testbed using KVM/QEMU technology, involving different VM sizes and memory-intensive workloads. A comparative analysis against proposed pre-copy methods and default migration approach reveals a remarkable improvement, with an average 64.91% reduction in downtime for different RAM configurations in high-write-intensive workloads, along with an average reduction in total migration time of approximately 85.81%. These findings underscore the practical advantages of our method in reducing service disruptions during live virtual machine migration in cloud systems.</p>\",\"PeriodicalId\":10718,\"journal\":{\"name\":\"Computing\",\"volume\":\"40 1\",\"pages\":\"\"},\"PeriodicalIF\":3.3000,\"publicationDate\":\"2024-07-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s00607-024-01318-6\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00607-024-01318-6","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Optimizing pre-copy live virtual machine migration in cloud computing using machine learning-based prediction model
One of the preconditions for efficient cloud computing services is the continuous availability of services to clients. However, there are various reasons for temporary service unavailability due to routine maintenance, load balancing, cyber-attacks, power management, fault tolerance, emergency incident response, and resource usage. Live Virtual Machine Migration (LVM) is an option to address service unavailability by moving virtual machines between hosts without disrupting running services. Pre-copy memory migration is a common LVM approach used in cloud systems, but it faces challenges due to the high rate of frequently updated memory pages known as dirty pages. Transferring these dirty pages during pre-copy migration prolongs the overall migration time. If there are large numbers of remaining memory pages after a predefined iteration of page transfer, the stop-and-copy phase is initiated, which significantly increases downtime and negatively impacts service availability. To mitigate this issue, we introduce a prediction-based approach that optimizes the migration process by dynamically halting the iteration phase when the predicted downtime falls below a predefined threshold. Our proposed machine learning method was rigorously evaluated through experiments conducted on a dedicated testbed using KVM/QEMU technology, involving different VM sizes and memory-intensive workloads. A comparative analysis against proposed pre-copy methods and default migration approach reveals a remarkable improvement, with an average 64.91% reduction in downtime for different RAM configurations in high-write-intensive workloads, along with an average reduction in total migration time of approximately 85.81%. These findings underscore the practical advantages of our method in reducing service disruptions during live virtual machine migration in cloud systems.
期刊介绍:
Computing publishes original papers, short communications and surveys on all fields of computing. The contributions should be written in English and may be of theoretical or applied nature, the essential criteria are computational relevance and systematic foundation of results.