Compiler-aided nd-range parallel-for implementations on CPU in hipSYCL

Joachim Meyer, Aksel Alpay, H. Fröning, V. Heuveline
{"title":"Compiler-aided nd-range parallel-for implementations on CPU in hipSYCL","authors":"Joachim Meyer, Aksel Alpay, H. Fröning, V. Heuveline","doi":"10.1145/3529538.3530216","DOIUrl":null,"url":null,"abstract":"With heterogeneous programming continuously on the rise, performance portability is still to be improved. SYCL provides the nd-range parallel-for paradigm for writing data-parallel kernels. This model allows barriers for group-local synchronization, similar to CUDA and OpenCL kernels. GPUs provide efficient means to model this, but on CPUs the necessary forward-progress guarantees require the use of many (lightweight) threads in library-only SYCL implementations, rendering the nd-range parallel-for unacceptably inefficient. By adopting two compiler-based approaches solving this, the present work improves the performance of the nd-range parallel-for in hipSYCL for CPUs by up to multiple orders of magnitude on various CPU architectures. The two alternatives are compared with regard to their functional correctness and performance. By upstreaming one of the variants, hipSYCL is the first SYCL implementation to provide a well performing nd-range parallel-for on CPU, without requiring an available OpenCL runtime.","PeriodicalId":73497,"journal":{"name":"International Workshop on OpenCL","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Workshop on OpenCL","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3529538.3530216","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

With heterogeneous programming continuously on the rise, performance portability is still to be improved. SYCL provides the nd-range parallel-for paradigm for writing data-parallel kernels. This model allows barriers for group-local synchronization, similar to CUDA and OpenCL kernels. GPUs provide efficient means to model this, but on CPUs the necessary forward-progress guarantees require the use of many (lightweight) threads in library-only SYCL implementations, rendering the nd-range parallel-for unacceptably inefficient. By adopting two compiler-based approaches solving this, the present work improves the performance of the nd-range parallel-for in hipSYCL for CPUs by up to multiple orders of magnitude on various CPU architectures. The two alternatives are compared with regard to their functional correctness and performance. By upstreaming one of the variants, hipSYCL is the first SYCL implementation to provide a well performing nd-range parallel-for on CPU, without requiring an available OpenCL runtime.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
在hipSYCL中,编译器辅助的和范围并行的CPU实现
随着异构编程的不断兴起,性能可移植性仍有待改进。SYCL为编写数据并行内核提供了极差的parallel-for范型。这个模型允许组本地同步的障碍,类似于CUDA和OpenCL内核。gpu提供了有效的建模方法,但是在cpu上,必要的前向进度保证需要在仅库的SYCL实现中使用许多(轻量级)线程,从而使范围内的并行效率低得令人无法接受。通过采用两种基于编译器的方法来解决这个问题,本研究在不同的CPU体系结构上将用于CPU的hipSYCL的近程并行性能提高了多个数量级。比较了这两种备选方案的功能正确性和性能。通过对其中一个变体进行上传到,hipSYCL是第一个在CPU上提供性能良好的远程并行处理的SYCL实现,而不需要可用的OpenCL运行时。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Improving Performance Portability of the Procedurally Generated High Energy Physics Event Generator MadGraph Using SYCL Acceleration of Quantum Transport Simulations with OpenCL CodePin: An Instrumentation-Based Debug Tool of SYCLomatic An Efficient Approach to Resolving Stack Overflow of SYCL Kernel on Intel® CPUs Ray Tracer based lidar simulation using SYCL
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1