{"title":"Game-Theoretic Approach for Grace-Period Policy in Supercomputers","authors":"Fei He, N. Rao, Chris Y. T. Ma","doi":"10.23919/fusion49465.2021.9626952","DOIUrl":null,"url":null,"abstract":"Job scheduling at supercomputing facilities is important for achieving high utilization of these valuable resources while ensuring effective execution of jobs submitted by users. The jobs are scheduled according to their specified resource demands such as expected job completion times, and the available resources based on allocations. Jobs that overrun their allocated times are terminated, for example, after a grace-period. It is non-trivial and often very complex for users to accurately estimate the completion times of their jobs, and consequently they face a dilemma: underestimate the job time to have a higher priority and risk job termination due to overrun, or overestimate it to ensure its completion and risk its delayed execution. In this paper, we investigate whether providing grace-period can benefit facility performance by developing a game- theoretic model between a facility provider and multiple users for a simplified scheduling scenario based on job execution times. We present closed-form expressions for the provider’s and user’s best-response strategies to maximize their respective utility functions. 
We describe conditions under which offering a grace-period is advantageous to both facility provider and users by deriving the Nash equilibrium of the game.","PeriodicalId":226850,"journal":{"name":"2021 IEEE 24th International Conference on Information Fusion (FUSION)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 24th International Conference on Information Fusion (FUSION)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/fusion49465.2021.9626952","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Job scheduling at supercomputing facilities is important for achieving high utilization of these valuable resources while ensuring effective execution of the jobs submitted by users. Jobs are scheduled according to their specified resource demands, such as expected completion times, and the resources available under allocations. Jobs that overrun their allocated times are terminated, for example, after a grace period. It is non-trivial and often very complex for users to accurately estimate the completion times of their jobs, and consequently they face a dilemma: underestimate the job time to gain higher priority and risk termination due to overrun, or overestimate it to ensure completion and risk delayed execution. In this paper, we investigate whether providing a grace period can benefit facility performance by developing a game-theoretic model between a facility provider and multiple users for a simplified scheduling scenario based on job execution times. We present closed-form expressions for the provider’s and users’ best-response strategies that maximize their respective utility functions. By deriving the Nash equilibrium of the game, we describe conditions under which offering a grace period is advantageous to both the facility provider and the users.