Using interoperability mode in SYCL 2020

International Workshop on OpenCL. Pub Date: 2022-05-10. DOI: 10.1145/3529538.3529997

Aksel Alpay, T. Applencourt, Gordon Brown, R. Keryell, G. Lueck


Citations: 1

Abstract

SYCL is a programming standard targeting hardware platforms in which a host is connected to various heterogeneous accelerators. Both the host and accelerator parts of the computation are expressed in a single-source modern C++ program. While previous versions of the SYCL standard were built only on top of the OpenCL standard to control the accelerators, starting with SYCL 2020 the standard is independent of OpenCL and can target different APIs, described with the concept of a backend. Some SYCL implementations can thus target various lower-level APIs today, such as OpenCL, CUDA, Level0, HIP, XRT, Vulkan, etc., possibly with different backends used at the same time in the same application. Even though the SYCL standard strives to abstract the generic principles of heterogeneous programming with C++ classes and functions, real applications often need to use specific details of a given architecture to benefit fully from an accelerator, or need to be integrated into a wider framework that includes parts implemented in other languages and other APIs for heterogeneous computing. This is possible in SYCL through the lesser-known but powerful concept of interoperability, which is introduced at different levels. On the one hand, by accessing native backend objects from SYCL objects, a SYCL program can use the native API, for example to call existing optimized libraries (mathematical libraries, machine learning, video codecs, etc.), which simplifies application development and helps reach maximum performance. In that case it is possible, for example, to obtain from a sycl::queue the native backend queue and use it to enqueue a library function.
On the other hand, a part of the application written in SYCL can be used from another part of the application that uses another API: SYCL interoperability functions construct SYCL objects such as sycl::device or sycl::queue from the equivalent native objects of the lower-level API backend used in the main part of the program. Another feature of SYCL 2020 interoperability is the ability to schedule backend API operations within the SYCL task DAG using host task interoperability. In SYCL, host tasks allow the user to enqueue an arbitrary C++ function within the SYCL DAG, and host tasks have an optional interoperability handle that provides access to the native backend queue, device and memory objects at that point in the DAG. This feature is very powerful, as it allows a SYCL application to interoperate with backend-specific libraries such as BLAS or DNN libraries. Finally, SYCL interoperability allows kernel functions written in a backend kernel language such as OpenCL or CUDA to be brought in via backend-specific functions when generating a kernel_bundle, and then invoked via a SYCL queue. Some implementations also go beyond the standard and provide native functions that are directly callable from a plain SYCL kernel. SYCL can also be used as a higher-level C++ wrapper to simplify the direct use of a lower-level API, removing much of the boilerplate code that API otherwise requires. Since the interoperability mode can be used with sycl::buffer and sycl::accessor, code using the native API can benefit from the implicit data-dependency task graph and the automatic overlap of computation and implicit communication provided by the SYCL programming model. Having all these interoperability modes in SYCL makes it possible to leverage other existing interoperability modes and to build complex interoperability paths between several frameworks or standards in a single application.
For example, in HPC a SYCL application can interoperate with an OpenMP library through a common backend to use parallelism cooperatively, or it could use the OpenCL backend to reach Vulkan through OpenCL-Vulkan interoperability for high-performance graphics rendering. A multimedia application could use a SYCL-OpenCL-OpenGL-DX12 path to do image processing of native images.
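
The implicit data-dependency task graph mentioned in the abstract can be pictured with a small toy model. The sketch below is plain C++17 and deliberately does not use SYCL: `ToyGraph`, `ToyTask`, and `Access` are hypothetical names invented for illustration, and the dependency rules are an assumption about the general idea (a task waits for the last writer of each buffer it touches, and a writer also waits for earlier readers), not a description of how any SYCL implementation works internally. The `body` callable stands in for a host task that would call a native library.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model only: none of these names are SYCL API.
enum class Access { read, write };

struct ToyTask {
    std::string name;
    std::vector<std::pair<int, Access>> accesses;  // (buffer id, access mode)
    std::function<void()> body;                    // stands in for a host task
};

class ToyGraph {
public:
    // Record a task and infer its dependencies from the buffers it touches.
    std::size_t submit(ToyTask task) {
        const std::size_t id = tasks_.size();
        std::vector<std::size_t> deps;
        // Add an edge once, and never from a task to itself.
        auto depend_on = [&](std::size_t other) {
            if (other != id &&
                std::find(deps.begin(), deps.end(), other) == deps.end())
                deps.push_back(other);
        };
        for (const auto& [buffer, mode] : task.accesses) {
            BufferState& state = buffers_[buffer];
            if (state.has_writer) depend_on(state.last_writer);  // after last write
            if (mode == Access::write) {
                for (std::size_t reader : state.readers) depend_on(reader);  // after reads
                state.readers.clear();
                state.has_writer = true;
                state.last_writer = id;
            } else {
                state.readers.push_back(id);
            }
        }
        tasks_.push_back(std::move(task));
        deps_.push_back(std::move(deps));
        return id;
    }

    const std::vector<std::size_t>& dependencies(std::size_t id) const {
        return deps_.at(id);
    }

    // Run every task. Submission order is already a valid topological order,
    // because a task can only depend on tasks submitted before it.
    std::vector<std::string> run() {
        std::vector<std::string> order;
        for (ToyTask& task : tasks_) {
            if (task.body) task.body();
            order.push_back(task.name);
        }
        return order;
    }

private:
    struct BufferState {
        bool has_writer = false;
        std::size_t last_writer = 0;
        std::vector<std::size_t> readers;
    };
    std::vector<ToyTask> tasks_;
    std::vector<std::vector<std::size_t>> deps_;
    std::map<int, BufferState> buffers_;
};
```

In this model, two tasks that only read the same buffer depend on its producer but not on each other, which is the property that lets a runtime overlap them. In actual SYCL 2020 code the corresponding pieces are sycl::buffer and sycl::accessor for declaring accesses, and a host task taking a sycl::interop_handle for reaching the native backend objects.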
