Energy-aware scheduling for reliability-oriented real-time parallel applications allocation on heterogeneous computing systems
Authors: Rui She, Yuting Wu, Enfang Cui
Journal: Future Generation Computer Systems-The International Journal of Escience, vol. 168, Article 107738
Published: 2025-02-18
DOI: 10.1016/j.future.2025.107738
URL: https://www.sciencedirect.com/science/article/pii/S0167739X25000330
Impact factor: 6.2 (JCR Q1, Computer Science, Theory & Methods)
Citations: 0
Abstract
Heterogeneous computing systems (HCSs) have developed rapidly and are widely deployed because of their high performance and low cost. However, HCSs face trade-offs and conflicts among three core indicators: energy consumption, reliability, and schedule length. Balancing these indicators to achieve optimal performance is the central challenge for HCSs. In this paper, we propose an energy-aware scheduling model for reliability-oriented real-time parallel applications on heterogeneous computing systems and study the minimum system-centric energy efficiency problem. To solve it, we first propose the minimum schedule time length (MSTL) algorithm, which provides a baseline for assessing feasibility and ensuring compliance with both response-time and reliability criteria. To further enhance reliability, we consider both transient and permanent faults and propose the primary–secondary backup (PSB) algorithm to improve fault tolerance, combined with dynamic power management (DPM) and dynamic voltage and frequency scaling (DVFS) to reduce energy consumption. Furthermore, a DVFS algorithm is proposed that, within the deadline, redistributes tasks that have not yet been executed on failed processors, reducing the energy consumed by excessively long redundant backups. Extensive experiments on real-world and randomly generated applications demonstrate the effectiveness of the proposed algorithms under various conditions.
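The tension between energy, reliability, and schedule length described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's power or fault models, so the code below assumes a power and transient-fault model of a kind commonly used in the DVFS scheduling literature: power of the form P_ind + C_ef·f^m and a fault rate that grows exponentially as frequency drops. All function names and parameter values (`p_ind`, `c_ef`, `m`, `lam`, `d`, `f_min`) are hypothetical and chosen only for illustration; this is not the MSTL or PSB algorithm.

```python
import math


def energy(wcet, f, p_ind=0.05, c_ef=1.0, m=3.0):
    """Energy of one task at normalized frequency f (0 < f <= 1).

    Assumed model: power = p_ind + c_ef * f**m, execution time = wcet / f.
    """
    exec_time = wcet / f
    return (p_ind + c_ef * f ** m) * exec_time


def reliability(wcet, f, lam=1e-5, d=2.0, f_min=0.3):
    """Probability that no transient fault hits the task.

    Assumed model: the fault rate rises exponentially as f is lowered,
    and faults arrive as a Poisson process during execution.
    """
    rate = lam * 10 ** (d * (1.0 - f) / (1.0 - f_min))
    return math.exp(-rate * wcet / f)


def lowest_energy_frequency(wcet, deadline, r_goal, freqs):
    """Pick the frequency with the least energy that still meets both
    the deadline and the reliability goal; None if no level qualifies."""
    feasible = [
        f for f in freqs
        if wcet / f <= deadline and reliability(wcet, f) >= r_goal
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda f: energy(wcet, f))


if __name__ == "__main__":
    levels = [0.4, 0.5, 0.6, 0.8, 1.0]
    print(lowest_energy_frequency(10.0, 30.0, 0.999, levels))
```

Under these assumed parameters, lowering the frequency saves energy but lengthens execution and lowers reliability, so the reliability goal rather than the deadline ends up limiting how far the frequency can be scaled down.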
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.