Energy-aware dynamic response and efficient consolidation strategies for disaster survivability of cloud microservices architecture

IF 3.3 3区 计算机科学 Q2 COMPUTER SCIENCE, THEORY & METHODS Computing Pub Date : 2024-06-17 DOI:10.1007/s00607-024-01305-x
Iure Fé, Tuan Anh Nguyen, Mario Di Mauro, Fabio Postiglione, Alex Ramos, André Soares, Eunmi Choi, Dugki Min, Jae Woo Lee, Francisco Airton Silva
{"title":"Energy-aware dynamic response and efficient consolidation strategies for disaster survivability of cloud microservices architecture","authors":"Iure Fé, Tuan Anh Nguyen, Mario Di Mauro, Fabio Postiglione, Alex Ramos, André Soares, Eunmi Choi, Dugki Min, Jae Woo Lee, Francisco Airton Silva","doi":"10.1007/s00607-024-01305-x","DOIUrl":null,"url":null,"abstract":"<p>Computer system resilience refers to the ability of a computer system to continue functioning even in the face of unexpected events or disruptions. These disruptions can be caused by a variety of factors, such as hardware failures, software glitches, cyber attacks, or even natural disasters. Modern computational environments need applications that can recover quickly from major disruptions while also being environmentally sustainable. Balancing system resilience with energy efficiency is challenging, as efforts to improve one can harm the other. This paper presents a method to enhance disaster survivability in microservice architectures, particularly those using Kubernetes in cloud-based environments, focusing on optimizing electrical energy use. Aiming to save energy, our work adopt the consolidation strategy that means grouping multiple microservices on a single host. Our aproach uses a widely adopted analytical model, the Generalized Stochastic Petri Net (GSPN). GSPN are a powerful modeling technique that is widely used in various fields, including engineering, computer science, and operations research. One of the primary advantages of GSPN is its ability to model complex systems with a high degree of accuracy. Additionally, GSPN allows for the modeling of both logical and stochastic behavior, making it ideal for systems that involve a combination of both. Our GSPN models compute a number of metrics such as: recovery time, system availability, reliability, Mean Time to Failure, and the configuration of cloud-based microservices. We compared our approach against others focusing on survivability or efficiency. Our approach aligns with Recovery Time Objectives during sudden disasters and offers the fastest recovery, requiring 9% less warning time to fully recover in cases of disaster with alert when compared to strategies with similar electrical consumption. It also saves about 27% energy compared to low consolidation strategies and 5% against high consolidation under static conditions.</p>","PeriodicalId":10718,"journal":{"name":"Computing","volume":"9 1","pages":""},"PeriodicalIF":3.3000,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00607-024-01305-x","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

Computer system resilience refers to the ability of a computer system to continue functioning even in the face of unexpected events or disruptions. These disruptions can be caused by a variety of factors, such as hardware failures, software glitches, cyber attacks, or even natural disasters. Modern computational environments need applications that can recover quickly from major disruptions while also being environmentally sustainable. Balancing system resilience with energy efficiency is challenging, as efforts to improve one can harm the other. This paper presents a method to enhance disaster survivability in microservice architectures, particularly those using Kubernetes in cloud-based environments, focusing on optimizing electrical energy use. Aiming to save energy, our work adopt the consolidation strategy that means grouping multiple microservices on a single host. Our aproach uses a widely adopted analytical model, the Generalized Stochastic Petri Net (GSPN). GSPN are a powerful modeling technique that is widely used in various fields, including engineering, computer science, and operations research. One of the primary advantages of GSPN is its ability to model complex systems with a high degree of accuracy. Additionally, GSPN allows for the modeling of both logical and stochastic behavior, making it ideal for systems that involve a combination of both. Our GSPN models compute a number of metrics such as: recovery time, system availability, reliability, Mean Time to Failure, and the configuration of cloud-based microservices. We compared our approach against others focusing on survivability or efficiency. Our approach aligns with Recovery Time Objectives during sudden disasters and offers the fastest recovery, requiring 9% less warning time to fully recover in cases of disaster with alert when compared to strategies with similar electrical consumption. It also saves about 27% energy compared to low consolidation strategies and 5% against high consolidation under static conditions.

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
面向云微服务架构灾难生存能力的能源感知动态响应和高效整合策略
计算机系统恢复能力是指计算机系统在面对突发事件或中断时仍能继续运行的能力。这些中断可能由多种因素造成,如硬件故障、软件故障、网络攻击甚至自然灾害。现代计算环境需要既能从重大中断中快速恢复,又具有环境可持续性的应用程序。在系统恢复能力和能源效率之间取得平衡具有挑战性,因为改善其中一个方面的努力可能会损害另一个方面。本文介绍了一种提高微服务架构(尤其是在基于云的环境中使用 Kubernetes 的微服务架构)灾难生存能力的方法,重点是优化电能使用。为了节约能源,我们的工作采用了整合策略,即在单个主机上组合多个微服务。我们的方法采用了一种广泛采用的分析模型--广义随机 Petri 网(GSPN)。GSPN 是一种强大的建模技术,广泛应用于工程、计算机科学和运筹学等多个领域。GSPN 的主要优势之一是能够对复杂系统进行高精度建模。此外,GSPN 还能对逻辑行为和随机行为进行建模,因此非常适合涉及逻辑行为和随机行为的系统。我们的 GSPN 模型可以计算一系列指标,例如:恢复时间、系统可用性、可靠性、平均故障时间以及基于云的微服务的配置。我们将我们的方法与其他专注于生存性或效率的方法进行了比较。我们的方法符合突发性灾难期间的恢复时间目标,并能提供最快的恢复速度,与耗电量相似的策略相比,在有警报的灾难情况下,完全恢复所需的预警时间减少了 9%。在静态条件下,与低整合策略相比,它还能节省约 27% 的能源,与高整合策略相比,能节省 5% 的能源。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Computing
Computing 工程技术-计算机:理论方法
CiteScore
8.20
自引率
2.70%
发文量
107
审稿时长
3 months
期刊介绍: Computing publishes original papers, short communications and surveys on all fields of computing. The contributions should be written in English and may be of theoretical or applied nature, the essential criteria are computational relevance and systematic foundation of results.
期刊最新文献
Mapping and just-in-time traffic congestion mitigation for emergency vehicles in smart cities Fog intelligence for energy efficient management in smart street lamps Contextual authentication of users and devices using machine learning Multi-objective service composition optimization problem in IoT for agriculture 4.0 Robust evaluation of GPU compute instances for HPC and AI in the cloud: a TOPSIS approach with sensitivity, bootstrapping, and non-parametric analysis
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1