{"title":"Evaluating the Performance of Integer Sum Reduction on an Intel GPU","authors":"Zheming Jin, J. Vetter","doi":"10.1109/IPDPSW52791.2021.00099","DOIUrl":null,"url":null,"abstract":"Sum reduction is a primitive operation in parallel computing while SYCL is a promising heterogeneous programming language. In this paper, we describe the SYCL implementations of integer sum reduction using atomic functions, shared local memory, vectorized memory accesses, and parameterized workload sizes. Evaluating the reduction kernels shows that we can achieve 1.4X speedup over the open-source implementations of sum reduction for a sufficiently large number of integers on an Intel integrated GPU.","PeriodicalId":170832,"journal":{"name":"2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPSW52791.2021.00099","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Sum reduction is a primitive operation in parallel computing while SYCL is a promising heterogeneous programming language. In this paper, we describe the SYCL implementations of integer sum reduction using atomic functions, shared local memory, vectorized memory accesses, and parameterized workload sizes. Evaluating the reduction kernels shows that we can achieve 1.4X speedup over the open-source implementations of sum reduction for a sufficiently large number of integers on an Intel integrated GPU.