Simulating HEP Workflows on Heterogeneous Architectures

C. Leggett, I. Shapoval
{"title":"Simulating HEP Workflows on Heterogeneous Architectures","authors":"C. Leggett, I. Shapoval","doi":"10.1109/eScience.2018.00087","DOIUrl":null,"url":null,"abstract":"The next generation of supercomputing facilities, such as Oak Ridge's Summit and Lawrence Livermore's Sierra, show an increasing use of GPGPUs and other accelerators in order to achieve their high FLOP counts. This trend will only grow with exascale facilities. In general, High Energy Physics computing workflows have made little use of GPUs due to the relatively small fraction of kernels that run efficiently on GPUs, and the expense of rewriting code for rapidly evolving GPU hardware. However, the computing requirements for high-luminosity LHC are enormous, and it will become essential to be able to make use of supercomputing facilities that rely heavily on GPUs and other accelerator technologies. ATLAS has already developed an extension to AthenaMT, its multithreaded event processing framework, that enables the non-intrusive offloading of computations to external accelerator resources, and is developing strategies to schedule the offloading efficiently. Before investing heavily in writing many kernels, we need to better understand the performance metrics and throughput bounds of the workflows with various accelerator configurations. This can be done by simulating the workflows, using real metrics for task interdependencies and timing, as we vary fractions of offloaded tasks, latencies, data conversion speeds, memory bandwidths, and accelerator offloading parameters such as CPU/GPU ratios and speeds. We present the results of these studies, which will be instrumental in directing effort to make the ATLAS framework, kernels and workflows run efficiently on exascale facilities.","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"6 1","pages":"343-343"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 14th International Conference on e-Science (e-Science)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/eScience.2018.00087","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

The next generation of supercomputing facilities, such as Oak Ridge's Summit and Lawrence Livermore's Sierra, show an increasing use of GPGPUs and other accelerators in order to achieve their high FLOP counts. This trend will only grow with exascale facilities. In general, High Energy Physics computing workflows have made little use of GPUs due to the relatively small fraction of kernels that run efficiently on GPUs, and the expense of rewriting code for rapidly evolving GPU hardware. However, the computing requirements for high-luminosity LHC are enormous, and it will become essential to be able to make use of supercomputing facilities that rely heavily on GPUs and other accelerator technologies. ATLAS has already developed an extension to AthenaMT, its multithreaded event processing framework, that enables the non-intrusive offloading of computations to external accelerator resources, and is developing strategies to schedule the offloading efficiently. Before investing heavily in writing many kernels, we need to better understand the performance metrics and throughput bounds of the workflows with various accelerator configurations. This can be done by simulating the workflows, using real metrics for task interdependencies and timing, as we vary fractions of offloaded tasks, latencies, data conversion speeds, memory bandwidths, and accelerator offloading parameters such as CPU/GPU ratios and speeds. We present the results of these studies, which will be instrumental in directing effort to make the ATLAS framework, kernels and workflows run efficiently on exascale facilities.
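As a rough illustration of the kind of parameter sweep the abstract describes (varying the fraction of offloaded tasks, offload latencies, and the CPU/GPU ratio to find throughput bounds), the Python sketch below models a bag of independent tasks placed greedily on CPU cores or GPUs. It is a minimal, hypothetical example: the function name, task model, timings, and parameters (simulate, offload_fraction, gpu_speedup, offload_latency, ...) are invented here for illustration and are not taken from the ATLAS/AthenaMT workflow simulator described in the paper, which uses real metrics for task interdependencies and timing.

```python
# Toy throughput model for CPU/GPU offloading -- an illustrative sketch only.
# The task model, timings, and parameter names are invented for this example
# and are not part of the ATLAS/AthenaMT simulator described in the paper.

import heapq
import random

def simulate(n_tasks=10_000, n_cpu=48, n_gpu=4,
             offload_fraction=0.3, cpu_time=1.0,
             gpu_speedup=10.0, offload_latency=0.2, seed=0):
    """Return the makespan (in arbitrary time units) for a bag of independent tasks.

    A task runs either on a CPU core for `cpu_time`, or on a GPU for
    `cpu_time / gpu_speedup` plus a fixed `offload_latency` covering data
    conversion and transfer. Each resource is modeled by the time it next
    becomes free, kept in a min-heap; every task is placed greedily on the
    earliest available unit of its assigned kind.
    """
    rng = random.Random(seed)
    cpu_free = [0.0] * n_cpu   # earliest time each CPU core becomes free
    gpu_free = [0.0] * n_gpu   # earliest time each GPU becomes free
    heapq.heapify(cpu_free)
    heapq.heapify(gpu_free)

    finish = 0.0
    for _ in range(n_tasks):
        if rng.random() < offload_fraction:
            start = heapq.heappop(gpu_free)
            end = start + offload_latency + cpu_time / gpu_speedup
            heapq.heappush(gpu_free, end)
        else:
            start = heapq.heappop(cpu_free)
            end = start + cpu_time
            heapq.heappush(cpu_free, end)
        finish = max(finish, end)
    return finish

if __name__ == "__main__":
    # Sweep the offloaded fraction to see where throughput saturates.
    for f in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
        makespan = simulate(offload_fraction=f)
        print(f"offload fraction {f:.1f}: makespan {makespan:10.1f}")
```

Even this toy sweep shows the qualitative behaviour the study quantifies: beyond some offloaded fraction the GPUs (or the fixed offload latency) become the bottleneck and further offloading no longer improves throughput. The paper's simulations replace these invented numbers with measured task dependencies, timings, data-conversion speeds, and memory bandwidths.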