OFP-TM: an online VM failure prediction and tolerance model towards high availability of cloud computing environments.

IF 2.5; Zone 3, Computer Science; Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Journal of Supercomputing; Pub Date: 2022-01-01; Epub Date: 2022-01-06; DOI: 10.1007/s11227-021-04235-z
Deepika Saxena, Ashutosh Kumar Singh
Volume 78, issue 6, pages 8003-8024. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8731188/pdf/
Citations: 13

Abstract

The indispensable role of cloud computing in every digital service has increased its resource usage exponentially. The ever-growing demand for cloud resources undermines service availability, leading to critical challenges such as cloud outages, SLA violations, and excessive power consumption. Previous approaches have addressed this problem by using multiple cloud platforms or running multiple replicas of a virtual machine (VM), which results in high operational cost. This paper addresses the problem from a different perspective by proposing a novel Online virtual machine Failure Prediction and Tolerance Model (OFP-TM) with high-availability awareness embedded in physical machines as well as virtual machines. Failure-prone VMs are estimated in real time from their predicted future resource usage by an ensemble-based resource predictor. These VMs are assigned to a failure tolerance unit comprising a resource provision matrix and a Selection Box (S-Box) mechanism, which triggers the migration of failure-prone VMs and handles any outage beforehand while maintaining the desired level of availability for cloud users. The proposed model is evaluated and compared against existing related approaches by simulating a cloud environment and running several experiments on the real-world Google Cluster workload dataset. The results indicate that OFP-TM improves availability by up to 33.5% and reduces the number of live VM migrations by up to 83.3%, compared with operation without OFP-TM.
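
The abstract does not specify how the ensemble predictor or the failure test is built, so the following is only a minimal illustrative sketch of the general idea: forecast each VM's next utilization with a small ensemble and flag VMs whose forecast exceeds a capacity threshold. The three base predictors, the names predict_next and failure_prone, and the 0.9 threshold are all assumptions made for illustration, not the authors' method.

```python
from statistics import mean

def predict_next(history):
    """Hypothetical ensemble forecast: the mean of three naive base predictors."""
    last = history[-1]                 # persistence: repeat the last observation
    window = mean(history[-3:])        # moving average over up to three samples
    if len(history) >= 2:
        trend = last + (history[-1] - history[-2])  # one-step linear extrapolation
    else:
        trend = last                   # not enough data for a trend
    return mean([last, window, trend])

def failure_prone(vm_usage, threshold=0.9):
    """Return sorted IDs of VMs whose forecast utilization exceeds the
    (assumed) threshold; these would be candidates for proactive migration."""
    return sorted(
        vm for vm, history in vm_usage.items()
        if predict_next(history) > threshold
    )

# Utilization histories as fractions of allocated capacity (made-up values).
usage = {
    "vm-a": [0.50, 0.52, 0.51],   # stable load
    "vm-b": [0.75, 0.85, 0.95],   # steadily rising load
}
print(failure_prone(usage))
```

In this toy input only vm-b is flagged, because its rising trend pushes the forecast above the threshold. A real system would replace the naive base predictors with trained models and feed the flagged VMs to a placement or migration policy.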

Source journal: Journal of Supercomputing (Engineering & Technology; Engineering: Electrical & Electronic)
CiteScore: 6.30
Self-citation rate: 12.10%
Articles published: 734
Review time: 13 months
About the journal: The Journal of Supercomputing publishes papers on the technology, architecture and systems, algorithms, languages and programs, performance measures and methods, and applications of all aspects of Supercomputing. Tutorial and survey papers are intended for workers and students in the fields associated with and employing advanced computer systems. The journal also publishes letters to the editor, especially in areas relating to policy, succinct statements of paradoxes, intuitively puzzling results, partial results and real needs. Published theoretical and practical papers are advanced, in-depth treatments describing new developments and new ideas. Each includes an introduction summarizing prior, directly pertinent work that is useful for the reader to understand, in order to appreciate the advances being described.