Approximation Proofs of a Fast and Efficient List Scheduling Algorithm for Task-Based Runtime Systems on Multicores and GPUs

Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar
{"title":"Approximation Proofs of a Fast and Efficient List Scheduling Algorithm for Task-Based Runtime Systems on Multicores and GPUs","authors":"Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar","doi":"10.1109/IPDPS.2017.71","DOIUrl":null,"url":null,"abstract":"In High Performance Computing, heterogeneity is now the normwith specialized accelerators like GPUs providing efficientcomputational power. The added complexity has led to the developmentof task-based runtime systems, which allow complex computations to beexpressed as task graphs, and rely on scheduling algorithms to performload balancing between all resources of the platforms. Developing goodscheduling algorithms, even on a single node, and analyzing them canthus have a very high impact on the performance of current HPCsystems. The special case of two types of resources (namely CPUs andGPUs) is of practical interest. HeteroPrio is such an algorithm whichhas been proposed in the context of fast multipole computations, andthen extended to general task graphs with very interesting results. Inthis paper, we provide a theoretical insight on the performance ofHeteroPrio, by proving approximation bounds compared to the optimalschedule in the case where all tasks are independent and for differentplatform sizes. Interestingly, this shows that spoliation allows toprove approximation ratios for a list scheduling algorithm on twounrelated resources, which is not possible otherwise. We also establishthat almost all our bounds are tight. Additionally, we provide anexperimental evaluation of HeteroPrio on real task graphs from denselinear algebra computation, which highlights the reasons explainingits good practical performance.","PeriodicalId":209524,"journal":{"name":"2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2017.71","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 21

Abstract

In High Performance Computing, heterogeneity is now the normwith specialized accelerators like GPUs providing efficientcomputational power. The added complexity has led to the developmentof task-based runtime systems, which allow complex computations to beexpressed as task graphs, and rely on scheduling algorithms to performload balancing between all resources of the platforms. Developing goodscheduling algorithms, even on a single node, and analyzing them canthus have a very high impact on the performance of current HPCsystems. The special case of two types of resources (namely CPUs andGPUs) is of practical interest. HeteroPrio is such an algorithm whichhas been proposed in the context of fast multipole computations, andthen extended to general task graphs with very interesting results. Inthis paper, we provide a theoretical insight on the performance ofHeteroPrio, by proving approximation bounds compared to the optimalschedule in the case where all tasks are independent and for differentplatform sizes. Interestingly, this shows that spoliation allows toprove approximation ratios for a list scheduling algorithm on twounrelated resources, which is not possible otherwise. We also establishthat almost all our bounds are tight. Additionally, we provide anexperimental evaluation of HeteroPrio on real task graphs from denselinear algebra computation, which highlights the reasons explainingits good practical performance.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
多核和gpu上基于任务的运行时系统快速高效列表调度算法的逼近证明
在高性能计算中,异构现在是标准,像gpu这样的专用加速器提供了高效的计算能力。增加的复杂性导致了基于任务的运行时系统的发展,它允许将复杂的计算表示为任务图,并依赖于调度算法在平台的所有资源之间执行负载平衡。开发好的调度算法,即使是在单个节点上,并对其进行分析,对当前高性能计算系统的性能有很大的影响。两种类型的资源(即cpu和gpu)的特殊情况具有实际意义。HeteroPrio就是这样一种算法,它是在快速多极计算的背景下提出的,然后扩展到一般的任务图,得到了非常有趣的结果。在本文中,我们通过证明在所有任务独立且不同平台大小的情况下与最优调度相比的近似界,提供了对heteroprio性能的理论见解。有趣的是,这表明掠夺允许在两个相关资源上证明列表调度算法的近似比率,这在其他情况下是不可能的。我们还确定了几乎所有的边界都是紧的。此外,我们还从密集线性代数计算的实际任务图上对HeteroPrio进行了实验评估,这突出了解释其良好实际性能的原因。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Capability Models for Manycore Memory Systems: A Case-Study with Xeon Phi KNL Toucan — A Translator for Communication Tolerant MPI Applications Production Hardware Overprovisioning: Real-World Performance Optimization Using an Extensible Power-Aware Resource Management Framework Approximation Proofs of a Fast and Efficient List Scheduling Algorithm for Task-Based Runtime Systems on Multicores and GPUs Dynamic Memory-Aware Task-Tree Scheduling
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1