Twin Peaks: A Software Platform for Heterogeneous Computing on General-Purpose and Graphics Processors

J. Gummaraju, L. Morichetti, Michael Houston, B. Sander, Benedict R. Gaster, Bixia Zheng
Published in: 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT)
Publication date: 2010-09-11
DOI: 10.1145/1854273.1854302 (https://doi.org/10.1145/1854273.1854302)
Citations: 94

Abstract

Modern processors are evolving into hybrid, heterogeneous processors with both CPU and GPU cores used for general-purpose computation. Several languages such as Brook, CUDA, and more recently OpenCL are being developed to fully harness the potential of these processors. These languages typically involve the control code running on the CPU and the performance-critical, data-parallel kernel code running on the GPUs. In this paper, we present Twin Peaks, a software platform for heterogeneous computing that executes code originally targeted for GPUs efficiently on CPUs as well. This permits a more balanced execution between the CPU and GPU, and enables portability of code between these architectures and to CPU-only environments. We propose several techniques in the runtime system to efficiently utilize the caches and functional units present in CPUs. Using OpenCL as a canonical language for heterogeneous computing, and running several experiments on real hardware, we show that our techniques enable GPGPU-style code to execute efficiently on multicore CPUs with minimal runtime overhead. These results also show that for maximum performance, it is beneficial for applications to utilize both CPUs and GPUs as accelerator targets.

Categories and Subject Descriptors: D.1.3 [Programming Techniques]: Concurrent Programming

General Terms: Design, Experimentation, Performance.

Keywords: GPGPU, Multicore, OpenCL, Programmability, Runtime.
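The abstract does not spell out the runtime techniques, so the following is only a minimal sketch of the general idea of running GPU-style kernel code on a CPU, not the Twin Peaks implementation. It assumes a one-dimensional index space split into work-groups, with each work-group executed as a plain loop on one CPU thread. All names (`vector_add_kernel`, `run_work_group`, `run_ndrange_on_cpu`) are hypothetical and chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def vector_add_kernel(global_id, a, b, out):
    """Body of a GPU-style kernel: one call handles one work-item."""
    out[global_id] = a[global_id] + b[global_id]


def run_work_group(kernel, group_id, local_size, global_size, args):
    """Run all work-items of one work-group back to back on one CPU thread."""
    start = group_id * local_size
    end = min(start + local_size, global_size)  # last group may be partial
    for global_id in range(start, end):
        kernel(global_id, *args)


def run_ndrange_on_cpu(kernel, global_size, local_size, args, num_threads=4):
    """Distribute the work-groups of a 1-D index space over a CPU thread pool."""
    num_groups = (global_size + local_size - 1) // local_size  # ceiling division
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [
            pool.submit(run_work_group, kernel, g, local_size, global_size, args)
            for g in range(num_groups)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker


a = list(range(10))
b = [10 * x for x in range(10)]
out = [0] * 10
run_ndrange_on_cpu(vector_add_kernel, global_size=10, local_size=4, args=(a, b, out))
print(out)
```

Running a whole work-group as one loop on one thread keeps neighbouring work-items on the same core, which is one plausible way to make use of CPU caches. The sketch only illustrates the mapping from work-items to threads; it makes no claim about performance, and Python threads in particular will not show real parallel speedup for this kind of work.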