Louis-Claude Canon, Anthony Dugois, Mohamad El Sayah, Pierre-Cyrille Héam
MCMC generation of cost matrices for scheduling performance evaluation
Journal: Future Generation Computer Systems-The International Journal of Escience, volume 168, Article 107758
Published: 2025-02-15
DOI: 10.1016/j.future.2025.107758
URL: https://www.sciencedirect.com/science/article/pii/S0167739X25000536
Citations: 0
Abstract
In high performance computing, scheduling and allocating tasks to machines has long been a critical challenge, especially when dealing with heterogeneous execution costs. Many approaches have been proposed to design efficient algorithms and then assess their performance. Among them, simulation can be performed on a large variety of environments and application models. However, this technique is known to be sensitive to bias when it relies on random instances with an uncontrolled distribution. In this article, instead of designing a new optimization method, we focus on generating cost matrices to improve the empirical evaluation methodology. In particular, we use methods from the literature to provide formal guarantees on how cost matrices are distributed: we ensure a uniform distribution among the cost matrices with given task and machine heterogeneities. Although the use of randomly generated matrices has often been criticized, this new generation procedure is the first that is proven to prevent biased generation by ensuring uniform generation with given properties. This method is relevant for assessing the performance of scheduling heuristics, in particular when characterizing the parameter values for which a given approach performs better than others. When applied to a makespan minimization problem, the methodology reveals when each of three efficient heuristics performs better depending on the instance heterogeneity.
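As a rough illustration of the idea of uniform generation by a Markov chain, the sketch below samples a non-negative integer matrix with prescribed row and column sums. This is not the authors' procedure: the abstract does not say how task and machine heterogeneity are encoded, so treating them as fixed row and column sums is an assumption made here, and the names `initial_matrix`, `mcmc_step`, and `sample_cost_matrix` are hypothetical. The move used is the classic symmetric 2x2 swap on contingency tables, which keeps every row and column sum unchanged.

```python
import random


def initial_matrix(row_sums, col_sums):
    """Build one feasible matrix with the northwest-corner rule."""
    if sum(row_sums) != sum(col_sums):
        raise ValueError("row and column sums must have the same total")
    rows, cols = list(row_sums), list(col_sums)
    mat = [[0] * len(cols) for _ in rows]
    i = j = 0
    while i < len(rows) and j < len(cols):
        x = min(rows[i], cols[j])
        mat[i][j] = x
        rows[i] -= x
        cols[j] -= x
        if rows[i] == 0:
            i += 1
        else:
            j += 1
    return mat


def mcmc_step(mat, rng):
    """Apply one symmetric move that preserves all row and column sums.

    Assumes at least two rows and two columns.
    """
    i, k = rng.sample(range(len(mat)), 2)
    j, l = rng.sample(range(len(mat[0])), 2)
    # Add 1 on (i, j) and (k, l), subtract 1 on (i, l) and (k, j).
    # If the move would create a negative entry, stay in place.
    if mat[i][l] > 0 and mat[k][j] > 0:
        mat[i][j] += 1
        mat[k][l] += 1
        mat[i][l] -= 1
        mat[k][j] -= 1


def sample_cost_matrix(row_sums, col_sums, steps, seed=0):
    """Run the chain for `steps` moves from a deterministic start."""
    rng = random.Random(seed)
    mat = initial_matrix(row_sums, col_sums)
    for _ in range(steps):
        mcmc_step(mat, rng)
    return mat


if __name__ == "__main__":
    for row in sample_cost_matrix([10, 20, 30], [15, 15, 30], steps=10_000):
        print(row)
```

Because each move and its reverse are proposed with the same probability, a chain of this kind has a uniform stationary distribution over the matrices it can reach. How many steps are needed before samples are close to uniform is a separate question that this sketch does not address.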
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.