Python for Development of OpenMP and CUDA Kernels for Multidimensional Data

B. Vacaliuc, D. Patlolla, E. D'Azevedo, G. Davidson, John K. Munro Jr, T. Evans, W. Joubert, Z. Bell
{"title":"Python for Development of OpenMP and CUDA Kernels for Multidimensional Data","authors":"B. Vacaliuc, D. Patlolla, E. D'Azevedo, G. Davidson, John K. Munro Jr, T. Evans, W. Joubert, Z. Bell","doi":"10.1109/SAAHPC.2011.26","DOIUrl":null,"url":null,"abstract":"Design of data structures for high performance computing (HPC) is one of the principal challenges facing researchers looking to utilize heterogeneous computing machinery. Heterogeneous systems derive cost, power, and speed efficiency by being composed of the appropriate hardware for the task. Yet, each type of processor requires a specific organization of the application state in order to achieve peak performance. Discovering this and refactoring the code can be a challenging and time-consuming task for the researcher, as the data structures and the computational model must be co-designed. We present a methodology that uses Python as the environment for which to explore tradeoffs in both the data structure design as well as the code executing on the computation accelerator. Our method enables multi-dimensional arrays to be used effectively in any target environment. We have chosen to focus on OpenMP and CUDA environments, thus exploring the development of optimized kernels for the two most common classes of computing hardware available today: multi-core CPU and GPU. Python's large palette of file and network access routines, its associative indexing syntax and support for common HPC environments makes it relevant for diverse hardware ranging from laptops through computing clusters to the highest performance supercomputers. Our work enables researchers to accelerate the development of their codes on the computing hardware of their choice.","PeriodicalId":331604,"journal":{"name":"2011 Symposium on Application Accelerators in High-Performance Computing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2011-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 Symposium on Application Accelerators in High-Performance Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SAAHPC.2011.26","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Design of data structures for high performance computing (HPC) is one of the principal challenges facing researchers looking to utilize heterogeneous computing machinery. Heterogeneous systems derive cost, power, and speed efficiency by being composed of the appropriate hardware for the task. Yet, each type of processor requires a specific organization of the application state in order to achieve peak performance. Discovering this and refactoring the code can be a challenging and time-consuming task for the researcher, as the data structures and the computational model must be co-designed. We present a methodology that uses Python as the environment for which to explore tradeoffs in both the data structure design as well as the code executing on the computation accelerator. Our method enables multi-dimensional arrays to be used effectively in any target environment. We have chosen to focus on OpenMP and CUDA environments, thus exploring the development of optimized kernels for the two most common classes of computing hardware available today: multi-core CPU and GPU. Python's large palette of file and network access routines, its associative indexing syntax and support for common HPC environments makes it relevant for diverse hardware ranging from laptops through computing clusters to the highest performance supercomputers. Our work enables researchers to accelerate the development of their codes on the computing hardware of their choice.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
用于多维数据的OpenMP和CUDA内核开发的Python
高性能计算(HPC)的数据结构设计是研究人员利用异构计算机器所面临的主要挑战之一。异构系统通过由适合任务的硬件组成来获得成本、功率和速度效率。然而,每种类型的处理器都需要对应用程序状态进行特定的组织,以便实现峰值性能。对于研究人员来说,发现这一点并重构代码可能是一项具有挑战性且耗时的任务,因为数据结构和计算模型必须共同设计。我们提出了一种使用Python作为环境的方法,用于探索数据结构设计以及在计算加速器上执行的代码的权衡。我们的方法可以在任何目标环境中有效地使用多维数组。我们选择专注于OpenMP和CUDA环境,从而为当今两种最常见的计算硬件(多核CPU和GPU)探索优化内核的开发。Python的大量文件和网络访问例程,其关联索引语法和对通用HPC环境的支持使其适用于从笔记本电脑到计算集群再到高性能超级计算机的各种硬件。我们的工作使研究人员能够在他们选择的计算硬件上加速他们的代码开发。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Experience Applying Fortran GPU Compilers to Numerical Weather Prediction Implications of Memory-Efficiency on Sparse Matrix-Vector Multiplication Application of Graphics Processing Units (GPUs) to the Study of Non-linear Dynamics of the Exciton Bose-Einstein Condensate in a Semiconductor Quantum Well A Class of Hybrid LAPACK Algorithms for Multicore and GPU Architectures Evaluation of GPU Architectures Using Spiking Neural Networks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1