An Energy-Aware Resource Management Strategy Based on Spark and YARN in Heterogeneous Environments

IF 5.3 | Zone 2 (Computer Science) | JCR Q1 (Telecommunications) | IEEE Transactions on Green Communications and Networking | Pub Date: 2023-12-26 | DOI: 10.1109/TGCN.2023.3347276
Fatemeh Shabestari;Nima Jafari Navimipour
URL: https://ieeexplore.ieee.org/document/10374225/
Citations: 0

Abstract

Apache Spark is a popular framework for processing big data. Running Spark on Hadoop YARN allows Spark workloads to be scheduled alongside other data-processing frameworks on Hadoop. When an application is deployed in a YARN cluster, its resources are allocated without considering energy efficiency. Furthermore, there is no way to enforce user-specified deadline constraints. To address these issues, we propose a new deadline-aware resource management system and a scheduling algorithm that minimize the total energy consumption of Spark on YARN in heterogeneous clusters. First, a deadline-aware, energy-efficient model of the problem is proposed. Then, executors are assigned to applications using a locality-aware method. This algorithm sorts the nodes by the performance per watt (PPW) metric, the number of application data blocks stored on each node, and rack locality. It also offers three ways to choose executors from different machines: greedy, random, and Pareto-based. Finally, the proposed heuristic task scheduler assigns tasks to executors so as to minimize total energy and tardiness. We evaluated the proposed algorithm with respect to energy efficiency and satisfaction of the Service Level Agreement (SLA). The results showed that the method outperforms popular algorithms in both energy consumption and meeting deadlines.
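The node-ordering step and the greedy executor choice described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names the three sort criteria but not their precedence, so the lexicographic order used here (PPW first, then local block count, then rack locality) is an assumption, as are the Node fields, the free_slots capacity, and the function names rank_nodes and greedy_select.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    rack: str
    ppw: float          # performance per watt (higher is better)
    local_blocks: int   # application data blocks stored on this node
    free_slots: int     # executors the node can still host (hypothetical field)


def rank_nodes(nodes, data_racks):
    """Sort nodes by PPW, then local blocks, then rack locality.

    The precedence of the three criteria is an assumption of this sketch.
    """
    def key(node):
        rack_local = 1 if node.rack in data_racks else 0
        # Negate so that larger values sort first; the name breaks ties.
        return (-node.ppw, -node.local_blocks, -rack_local, node.name)

    return sorted(nodes, key=key)


def greedy_select(nodes, data_racks, executors_needed):
    """Take executors from the best-ranked nodes until the request is met.

    Returns the placement (node name -> executor count) and the number of
    executors that could not be placed.
    """
    placement = {}
    remaining = executors_needed
    for node in rank_nodes(nodes, data_racks):
        if remaining == 0:
            break
        take = min(node.free_slots, remaining)
        if take > 0:
            placement[node.name] = take
            remaining -= take
    return placement, remaining


if __name__ == "__main__":
    cluster = [
        Node("n1", "r1", ppw=2.0, local_blocks=3, free_slots=2),
        Node("n2", "r1", ppw=3.5, local_blocks=0, free_slots=1),
        Node("n3", "r2", ppw=3.5, local_blocks=4, free_slots=2),
        Node("n4", "r3", ppw=1.0, local_blocks=0, free_slots=4),
    ]
    print(greedy_select(cluster, data_racks={"r1", "r2"}, executors_needed=4))
```

In this toy cluster, n3 and n2 tie on PPW, so the local block count decides between them. The random and Pareto-based variants mentioned in the abstract would replace only the selection loop; they are not sketched here because the abstract does not define their objectives.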
Source journal
IEEE Transactions on Green Communications and Networking (Computer Science: Computer Networks and Communications)
CiteScore: 9.30
Self-citation rate: 6.20%
Articles published: 181