International Workshop on OpenCL: Latest Papers

Pattern matching in OpenCL: GPU vs CPU energy consumption on two mobile chipsets
Pub Date : 2014-05-12 DOI: 10.1145/2664666.2664671
Elena Aragon, J. M. Jiménez, Arian Maghazeh, J. Rasmusson, Unmesh D. Bordoloi
Adaptations of the Aho-Corasick (AC) algorithm on high performance graphics processors (also called GPUs) have garnered increasing attention in recent years. However, no results have been reported regarding their implementations on mobile GPUs. In this paper, we show that implementing a state-of-the-art Aho-Corasick parallel algorithm on a mobile GPU delivers significant speedups. We study a few implementation optimizations, some of which may seem counter-intuitive compared with standard optimizations for high-end GPUs. More importantly, we focus on measuring the energy consumed by different components of the OpenCL application rather than reporting the aggregate. We show that there are considerable energy savings compared to the CPU implementation of the AC algorithm.
Citations: 8
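For readers unfamiliar with the Aho-Corasick algorithm named in the abstract above, the following is a minimal sequential sketch in Python. It is not the authors' parallel OpenCL implementation, and it says nothing about their GPU optimizations or energy measurements; the function names `build_automaton` and `search` and the table layout are illustrative assumptions. It only shows the general idea: build a trie of the patterns, add failure links by breadth-first traversal, then scan the text once.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure, and output tables for a set of patterns."""
    goto = [{}]   # goto[state][char] -> next state (state 0 is the root)
    fail = [0]    # failure link for each state
    out = [[]]    # patterns recognised on reaching each state

    # Phase 1: insert every pattern into a trie.
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)

    # Phase 2: breadth-first traversal to compute failure links.
    # Depth-1 states keep the root as their failure target.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        # Follow failure links until a transition on ch exists.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches


if __name__ == "__main__":
    automaton = build_automaton(["he", "she", "his", "hers"])
    print(search("ushers", automaton))
```

In this sketch each input character is consumed exactly once, so the scan time does not grow with the number of patterns beyond the cost of reporting matches. Parallel variants such as the one the paper studies typically split the input across many work-items, which this sequential version does not attempt.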
Porting a commercial application to OpenCL: a case study
Pub Date : 2014-05-12 DOI: 10.1145/2664666.2664669
S. Krige, M. Mackey, Simon McIntosh-Smith, R. Sessions
The use of virtual screening to find new drug hits and leads has become commonplace within the pharmaceutical industry. 2D methods have largely been replaced by 3D ligand-based methods and by structure-based methods (docking) where a reliable protein structure is available. However, the computational cost of calculating 3D molecular similarities is much higher than that for 2D similarity methods, meaning that large amounts of computing power are needed to screen a reasonable number of virtual compounds in a useful time scale. In recent years, the popularity of graphical processing units (GPUs) has increased in the area of high performance computing, mainly for their attractive cost-to-performance ratio and the appearance of stable GPU coding frameworks. They are a promising solution for computationally intense problems such as virtual screening. In collaboration, the University of Bristol and Cresset have ported the blazeV10 virtual screening commercial application to OpenCL, a framework for writing programs that execute across heterogeneous platforms using both CPUs and GPUs. We present results showing that our OpenCL port of blazeV10 can provide up to a 20-fold speedup when run on a recent off-the-shelf GPU, compared to a contemporary multi-core CPU. This not only reduces the time required to obtain results but also saves hardware cost and space. We discuss some of the difficulties encountered in porting this commercial application to work well across a range of CPUs and GPUs, present hardware comparisons, and give guidance on how to maximize performance while retaining full precision.
Citations: 3
Accelerating Lagrangian particle dispersion in the atmosphere with OpenCL across multiple platforms
Pub Date : 2014-05-12 DOI: 10.1145/2664666.2664672
P. Harvey, S. Hameed, W. Vanderbauwhede
FLEXPART is a popular simulator that models the transport and diffusion of air pollutants, based on the Lagrangian approach. It is capable of regional and global simulation and supports both forward and backward runs. A complex model like this contains many calculations suitable for parallelisation. Recently, a GPU-accelerated version of the simulator (FLEXCPP) has been written in C++/CUDA. As CUDA is only supported on NVIDIA GPUs, such an implementation is tied to a single hardware vendor, and is not able to take advantage of other hardware acceleration platforms. This paper presents an OpenCL/C++ version of FLEXCPP, called FlexOcl. This simulator provides all the functionality of FLEXCPP, and has been extended to include modelling of the decay of radioactive particles. A performance comparison between the two simulators has been performed on GPU, and the performance of FlexOcl has also been evaluated on the Intel Xeon Phi, as well as a number of other hardware platforms. Our results show that the OpenCL code performs better than CUDA code on GPUs, and that equivalent performance is seen on the Xeon Phi for this type of application.
Citations: 4