Minos计算库:用于极端异构系统的高效并行编程

R. Gioiosa, B. O. Mutlu, Seyong Lee, J. Vetter, Giulio Picierro, M. Cesati
{"title":"Minos计算库:用于极端异构系统的高效并行编程","authors":"R. Gioiosa, B. O. Mutlu, Seyong Lee, J. Vetter, Giulio Picierro, M. Cesati","doi":"10.1145/3366428.3380770","DOIUrl":null,"url":null,"abstract":"Hardware specialization has become the silver bullet to achieve efficient high performance, from Systems-on-Chip systems, where hardware specialization can be \"extreme\", to large-scale HPC systems. As the complexity of the systems increases, so does the complexity of programming such architectures in a portable way. This work introduces the Minos Computing Library (MCL), as system software, programming model, and programming model runtime that facilitate programming extremely heterogeneous systems. MCL supports the execution of several multi-threaded applications within the same compute node, performs asynchronous execution of application tasks, efficiently balances computation across hardware resources, and provides performance portability. We show that code developed on a personal desktop automatically scales up to fully utilize powerful workstations with 8 GPUs and down to power-efficient embedded systems. MCL provides up to 17.5x speedup over OpenCL on NVIDIA DGX-1 systems and up to 1.88x speedup on single-GPU systems. In multi-application workloads, MCL's dynamic resource allocation provides up to 2.43x performance improvement over manual, static resources allocation.","PeriodicalId":266831,"journal":{"name":"Proceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"The Minos Computing Library: efficient parallel programming for extremely heterogeneous systems\",\"authors\":\"R. Gioiosa, B. O. Mutlu, Seyong Lee, J. Vetter, Giulio Picierro, M. Cesati\",\"doi\":\"10.1145/3366428.3380770\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Hardware specialization has become the silver bullet to achieve efficient high performance, from Systems-on-Chip systems, where hardware specialization can be \\\"extreme\\\", to large-scale HPC systems. As the complexity of the systems increases, so does the complexity of programming such architectures in a portable way. This work introduces the Minos Computing Library (MCL), as system software, programming model, and programming model runtime that facilitate programming extremely heterogeneous systems. MCL supports the execution of several multi-threaded applications within the same compute node, performs asynchronous execution of application tasks, efficiently balances computation across hardware resources, and provides performance portability. We show that code developed on a personal desktop automatically scales up to fully utilize powerful workstations with 8 GPUs and down to power-efficient embedded systems. MCL provides up to 17.5x speedup over OpenCL on NVIDIA DGX-1 systems and up to 1.88x speedup on single-GPU systems. In multi-application workloads, MCL's dynamic resource allocation provides up to 2.43x performance improvement over manual, static resources allocation.\",\"PeriodicalId\":266831,\"journal\":{\"name\":\"Proceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-02-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3366428.3380770\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3366428.3380770","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

摘要

硬件专门化已经成为实现高效高性能的灵丹妙药,从硬件专门化可以做到“极致”的片上系统,到大规模高性能计算系统。随着系统复杂性的增加,以可移植的方式对这种体系结构进行编程的复杂性也在增加。本工作介绍了Minos Computing Library (MCL),作为系统软件、编程模型和编程模型运行时,它促进了对极端异构系统的编程。MCL支持在同一计算节点内执行多个多线程应用程序,执行应用程序任务的异步执行,有效地平衡硬件资源之间的计算,并提供性能可移植性。我们展示了在个人桌面上开发的代码可以自动扩展到充分利用具有8个gpu的强大工作站,并向下扩展到节能的嵌入式系统。MCL在NVIDIA DGX-1系统上提供高达17.5倍的OpenCL加速,在单gpu系统上提供高达1.88倍的加速。在多应用程序工作负载中,MCL的动态资源分配比手动静态资源分配提供了高达2.43倍的性能改进。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
The Minos Computing Library: efficient parallel programming for extremely heterogeneous systems
Hardware specialization has become the silver bullet to achieve efficient high performance, from Systems-on-Chip systems, where hardware specialization can be "extreme", to large-scale HPC systems. As the complexity of the systems increases, so does the complexity of programming such architectures in a portable way. This work introduces the Minos Computing Library (MCL), as system software, programming model, and programming model runtime that facilitate programming extremely heterogeneous systems. MCL supports the execution of several multi-threaded applications within the same compute node, performs asynchronous execution of application tasks, efficiently balances computation across hardware resources, and provides performance portability. We show that code developed on a personal desktop automatically scales up to fully utilize powerful workstations with 8 GPUs and down to power-efficient embedded systems. MCL provides up to 17.5x speedup over OpenCL on NVIDIA DGX-1 systems and up to 1.88x speedup on single-GPU systems. In multi-application workloads, MCL's dynamic resource allocation provides up to 2.43x performance improvement over manual, static resources allocation.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
The Minos Computing Library: efficient parallel programming for extremely heterogeneous systems GPGPU performance estimation for frequency scaling using cross-benchmarking Unveiling kernel concurrency in multiresolution filters on GPUs with an image processing DSL Automatic generation of specialized direct convolutions for mobile GPUs High-level hardware feature extraction for GPU performance prediction of stencils
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1