Energy–time modelling of distributed multi-population genetic algorithms with dynamic workload in HPC clusters

IF 6.2 · Zone 2 (Computer Science) · Q1 (Computer Science, Theory & Methods) · Future Generation Computer Systems-The International Journal of Escience · Pub Date: 2025-02-10 · DOI: 10.1016/j.future.2025.107753
Juan José Escobar, Pablo Sánchez-Cuevas, Beatriz Prieto, Rukiye Savran Kızıltepe, Fernando Díaz-del-Río, Dragi Kimovski
Volume 167, Article 107753 · Publication type: Journal Article · Open access: no · Source: https://www.sciencedirect.com/science/article/pii/S0167739X25000482
Citations: 0

Abstract

Time and energy efficiency are highly relevant objectives in high-performance computing systems, where executing tasks is costly. Among these tasks, evolutionary algorithms deserve particular attention because of their inherent parallel scalability and their usually costly fitness evaluation functions. Accordingly, several scheduling strategies for workload balancing in heterogeneous systems have been proposed in the literature, aiming to reduce runtime and energy consumption. Our hypothesis is that a dynamic workload distribution can be fitted with greater precision using metaheuristics, such as genetic algorithms, than using linear regression. This paper therefore proposes a new mathematical model to predict the energy–time behaviour of applications based on multi-population genetic algorithms that dynamically distribute the evaluation of individuals among the CPU–GPU devices of heterogeneous clusters. An accurate predictor would save time and energy by making it possible to select the best resource set before running such applications. The workload distributed to each device is estimated by simulation, while the model parameters are fitted in a two-phase run by another genetic algorithm that takes the experimental energy–time values of the target application as input. Compared with a baseline model based on linear regression, the proposed model is significantly more accurate, with normalised prediction errors of 0.081 for runtime and 0.091 for energy consumption, against 0.213 and 0.256 for the baseline.
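
The abstract states that the workload assigned to each device is estimated by simulation but gives no details of the simulator. The following is only a minimal sketch of what such a simulation could look like, not the authors' method. It assumes a simple pull-based scheme in which whichever device becomes free first takes the next chunk of individuals. The `Device` fields, the chunk size, and all numeric values are hypothetical and chosen purely for illustration.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    secs_per_individual: float  # hypothetical cost of evaluating one individual
    active_watts: float         # hypothetical power draw while evaluating
    idle_watts: float           # hypothetical power draw while waiting


def simulate_dynamic_distribution(devices, n_individuals, chunk_size):
    """Pull-based simulation: the first device to become free takes the next chunk.

    Returns (individuals per device, makespan in seconds, energy in joules).
    """
    # Min-heap of (time at which the device becomes free, device index).
    free_at = [(0.0, i) for i in range(len(devices))]
    heapq.heapify(free_at)
    workload = [0] * len(devices)
    busy_time = [0.0] * len(devices)
    remaining = n_individuals
    while remaining > 0:
        t, i = heapq.heappop(free_at)
        chunk = min(chunk_size, remaining)
        remaining -= chunk
        duration = chunk * devices[i].secs_per_individual
        workload[i] += chunk
        busy_time[i] += duration
        heapq.heappush(free_at, (t + duration, i))
    # The run ends when the last device finishes its final chunk.
    makespan = max(t for t, _ in free_at)
    # Each device draws active power while busy and idle power until the end.
    energy = sum(
        d.active_watts * b + d.idle_watts * (makespan - b)
        for d, b in zip(devices, busy_time)
    )
    return workload, makespan, energy


# Illustrative values only: a slow CPU and a GPU that is four times faster.
devices = [
    Device("cpu", secs_per_individual=0.5, active_watts=100.0, idle_watts=30.0),
    Device("gpu", secs_per_individual=0.125, active_watts=200.0, idle_watts=50.0),
]
workload, makespan, energy = simulate_dynamic_distribution(devices, 100, 10)
print(workload, makespan, energy)  # [20, 80] 10.0 3000.0
```

In this toy setting the faster device ends up with four times as many individuals as the slower one. Per-device workloads of this kind are what an energy–time model could take as input before its parameters are fitted.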
Source journal
CiteScore: 19.90 · Self-citation rate: 2.70% · Articles published: 376 · Review time: 10.6 months
Journal description: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.