Task parallel assembly language for uncompromising parallelism

Mike Rainey, Kyle C. Hale, Nikos Hardavellas, Simone Campanoni, Umut A. Acar
{"title":"Task parallel assembly language for uncompromising parallelism","authors":"Mike Rainey, Kyle C. Hale, Nikos Hardavellas, Simone Campanoni, Umut A. Acar","doi":"10.1145/3453483.3460969","DOIUrl":null,"url":null,"abstract":"Achieving parallel performance and scalability involves making compromises between parallel and sequential computation. If not contained, the overheads of parallelism can easily outweigh its benefits, sometimes by orders of magnitude. Today, we expect programmers to implement this compromise by optimizing their code manually. This process is labor intensive, requires deep expertise, and reduces code quality. Recent work on heartbeat scheduling shows a promising approach that manifests the potentially vast amounts of available, latent parallelism, at a regular rate, based on even beats in time. The idea is to amortize the overheads of parallelism over the useful work performed between the beats. Heartbeat scheduling is promising in theory, but the reality is complicated: it has no known practical implementation. In this paper, we propose a practical approach to heartbeat scheduling that involves equipping the assembly language with a small set of primitives. These primitives leverage existing kernel and hardware support for interrupts to allow parallelism to remain latent, until a heartbeat, when it can be manifested with low cost. Our Task Parallel Assembly Language (TPAL) is a compact, RISC-like assembly language. We specify TPAL through an abstract machine and implement the abstract machine as compiler transformations for C/C++ code and a specialized run-time system. We present an evaluation on both the Linux and the Nautilus kernels, considering a range of heartbeat interrupt mechanisms. The evaluation shows that TPAL can dramatically reduce the overheads of parallelism without compromising scalability.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"os-48 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3453483.3460969","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

Achieving parallel performance and scalability involves making compromises between parallel and sequential computation. If not contained, the overheads of parallelism can easily outweigh its benefits, sometimes by orders of magnitude. Today, we expect programmers to implement this compromise by optimizing their code manually. This process is labor intensive, requires deep expertise, and reduces code quality. Recent work on heartbeat scheduling shows a promising approach that manifests the potentially vast amounts of available, latent parallelism, at a regular rate, based on even beats in time. The idea is to amortize the overheads of parallelism over the useful work performed between the beats. Heartbeat scheduling is promising in theory, but the reality is complicated: it has no known practical implementation. In this paper, we propose a practical approach to heartbeat scheduling that involves equipping the assembly language with a small set of primitives. These primitives leverage existing kernel and hardware support for interrupts to allow parallelism to remain latent, until a heartbeat, when it can be manifested with low cost. Our Task Parallel Assembly Language (TPAL) is a compact, RISC-like assembly language. We specify TPAL through an abstract machine and implement the abstract machine as compiler transformations for C/C++ code and a specialized run-time system. We present an evaluation on both the Linux and the Nautilus kernels, considering a range of heartbeat interrupt mechanisms. The evaluation shows that TPAL can dramatically reduce the overheads of parallelism without compromising scalability.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
任务并行汇编语言的不妥协的并行性
实现并行性能和可伸缩性需要在并行计算和顺序计算之间做出妥协。如果不加以控制,并行性的开销很容易超过它的好处,有时甚至超过它的数量级。今天,我们期望程序员通过手动优化他们的代码来实现这种折衷。这个过程是劳动密集型的,需要深厚的专业知识,并且降低了代码质量。最近关于心跳调度的工作显示了一种很有前途的方法,该方法显示了潜在的大量可用性,潜在的并行性,以规律的速率,基于时间上的均匀心跳。其思想是将并行性的开销分摊到节拍之间执行的有用工作上。心跳调度在理论上很有希望,但现实很复杂:它没有已知的实际实现。在本文中,我们提出了一种实用的心跳调度方法,该方法包括为汇编语言配备一小组原语。这些原语利用现有的内核和硬件对中断的支持,使并行性保持潜伏状态,直到出现心跳时,才能以低成本表现出来。我们的任务并行汇编语言(TPAL)是一种紧凑的、类似risc的汇编语言。我们通过一个抽象机器来指定TPAL,并将抽象机器实现为C/ c++代码的编译器转换和一个专门的运行时系统。我们对Linux和Nautilus内核进行了评估,考虑了一系列心跳中断机制。评估表明,TPAL可以在不影响可伸缩性的情况下显著降低并行性的开销。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Learning to find naming issues with big code and small supervision Cyclic program synthesis Fluid: a framework for approximate concurrency via controlled dependency relaxation Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models Phased synthesis of divide and conquer programs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1