The Dynamics of Backfilling: Solving the Mystery of Why Increased Inaccuracy May Help

Dan Tsafrir, D. Feitelson
2006 IEEE International Symposium on Workload Characterization (IISWC), October 2006
DOI: 10.1109/IISWC.2006.302737
Citations: 53

Abstract

Parallel job scheduling with backfilling requires users to provide runtime estimates, which the scheduler uses to pack jobs more tightly. Studies of the impact of such estimates on performance have modeled them using a "badness factor" f ≥ 0 intended to capture their inaccuracy: given a runtime r, the estimate is uniformly distributed in [r, (f + 1) · r]. Surprisingly, inaccurate estimates (f > 0) yielded better performance than accurate ones (f = 0). We explain this by a "heel and toe" dynamic that, with f > 0, causes backfilling to approximate shortest-job-first scheduling. We further find that the effect of systematically increasing f is V-shaped: average wait time and slowdown initially drop, only to rise later on. This happens because higher values of f create bigger "holes" in the schedule (longer jobs can backfill) and increase the randomness (more long jobs appear short), eventually overshadowing the initial heel-and-toe preference for shorter jobs. The bottom line is that the artificial inaccuracy generated by multiplying (real or perfect) estimates by a factor is (1) just a scheduling technique that trades off fairness for performance, and (2) ill-suited for studying the effect of real inaccuracy. Real estimates are modal (90% of the jobs use the same 20 estimates) and bounded by a maximum (usually the most popular estimate). Therefore, when performing an evaluation, "increased inaccuracy" should translate to increased modality. Unlike multiplying, this indeed worsens performance, as one would intuitively expect.
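The badness-factor model described above is simple to state in code. The sketch below is a minimal illustration of that model, not code from the paper's simulator; the function name and the use of Python's `random` module are my own choices.

```python
import random

def badness_estimate(runtime, f, rng=random):
    """Draw a runtime estimate under the badness-factor model:
    given a true runtime r and a badness factor f >= 0, the
    estimate is uniformly distributed in [r, (f + 1) * r].
    With f = 0 the estimate equals the true runtime exactly."""
    if runtime <= 0 or f < 0:
        raise ValueError("need runtime > 0 and f >= 0")
    return rng.uniform(runtime, (f + 1) * runtime)
```

Note that an estimate drawn this way is never smaller than the true runtime, so a job is never killed for exceeding its estimate; the inaccuracy only changes which scheduling "holes" the job appears to fit into.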