Energy-aware scheduling and two-tier coordinated load balancing for streaming applications in Apache Flink

IF 6.2 | Zone 2 (Computer Science) | Q1 Computer Science, Theory & Methods | Future Generation Computer Systems-The International Journal of Escience | Pub Date: 2025-01-01 | DOI: 10.1016/j.future.2024.107681
Hongjian Li, Junlin Li, Xiaolin Duan, Jianglin Xia
Volume 166, Article 107681. Publisher page: https://www.sciencedirect.com/science/article/pii/S0167739X24006459
Citations: 0

Abstract

Apache Flink has become a highly regarded stream computing framework thanks to its high throughput, low latency, and high reliability. However, its default task scheduling policy follows the first-come-first-served principle, which does not fully account for differences in energy efficiency and resource load among nodes in heterogeneous clusters and may lead to high energy consumption and uneven load distribution when executing jobs. To address this problem, this paper proposes a two-tier coordinated load balancing and energy-saving scheduling optimization strategy. First, we construct an energy efficiency model based on Service Level Agreements (SLA) and, based on this model, design an Energy-Saving Scheduling Algorithm (ESSA) that aims to reduce the energy consumption of Flink clusters when executing jobs. ESSA jointly considers the effects of two SLA performance metrics, node response time and throughput, on node energy consumption, as well as the differences in energy efficiency among nodes in heterogeneous clusters. Second, to address the load imbalance that Flink's default scheduling policy may cause, we propose an Energy-Aware Two-Tier Coordinated Load Balancing algorithm (TTCLB-EA), which optimizes cluster load at both the inter-node and intra-node levels, handling tasks according to energy-efficiency priorities. Experimental results show that, compared with the default scheduling strategy, the round-robin scheduling strategy, and St-Stream, the proposed algorithm improves load balancing by about 14.59%, 12.75%, and 7.32% and reduces energy consumption by about 14.52%, 10.54%, and 7.58%, respectively. The proposed algorithms not only enhance the performance of the Flink cluster but also help reduce energy consumption and achieve more efficient resource utilization.
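The contrast the abstract draws between first-come-first-served placement and energy-aware, load-balanced placement can be illustrated with a toy sketch. This is not the paper's ESSA or TTCLB-EA, whose models are not given here. The `Node` class, the `records_per_joule` score, the `max_gap` rule, and all numbers are assumptions invented for illustration, and nothing below calls the Flink API.

```python
from dataclasses import dataclass
from statistics import pstdev


@dataclass
class Node:
    name: str
    slots: int                # total task slots (hypothetical)
    records_per_joule: float  # assumed energy-efficiency score, higher is better
    used: int = 0

    @property
    def utilization(self) -> float:
        return self.used / self.slots


def first_come_first_served(nodes, n_tasks):
    """Fill nodes in listed order, ignoring energy efficiency and load."""
    for _ in range(n_tasks):
        node = next(n for n in nodes if n.used < n.slots)
        node.used += 1


def energy_aware(nodes, n_tasks, max_gap=0.25):
    """Prefer efficient nodes, but skip any node whose utilization is more
    than max_gap above the least-utilized free node (illustrative rule)."""
    for _ in range(n_tasks):
        free = [n for n in nodes if n.used < n.slots]
        floor = min(n.utilization for n in free)
        eligible = [n for n in free if n.utilization - floor <= max_gap]
        best = max(eligible, key=lambda n: n.records_per_joule)
        best.used += 1


def imbalance(nodes):
    """Population standard deviation of slot utilization (0 means even)."""
    return pstdev([n.utilization for n in nodes])


def make_cluster():
    # Made-up heterogeneous cluster: same slot count, different efficiency.
    return [Node("a", 4, 1.0), Node("b", 4, 3.0), Node("c", 4, 2.0)]


fcfs, aware = make_cluster(), make_cluster()
first_come_first_served(fcfs, 6)
energy_aware(aware, 6)
print([n.used for n in fcfs], round(imbalance(fcfs), 3))
print([n.used for n in aware], round(imbalance(aware), 3))
```

With these made-up inputs, the energy-aware rule places more tasks on the more efficient nodes while keeping utilization closer to even than the first-come-first-served baseline. The `max_gap` threshold is one simple way to trade efficiency preference against balance; the paper's actual priorities and two-tier coordination may differ.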
Source journal
CiteScore: 19.90
Self-citation rate: 2.70%
Articles published: 376
Review time: 10.6 months
About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.
Latest articles from the journal
Self-sovereign identity framework with user-friendly private key generation and rule table
Accelerating complex graph queries by summary-based hybrid partitioning for discovering vulnerabilities of distribution equipment
DNA: Dual-radio Dual-constraint Node Activation scheduling for energy-efficient data dissemination in IoT
Blending lossy and lossless data compression methods to support health data streaming in smart cities
Energy–time modelling of distributed multi-population genetic algorithms with dynamic workload in HPC clusters