
Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation: Latest Articles

Polyhedral auto-transformation with no integer linear programming
Aravind Acharya, Uday Bondhugula, Albert Cohen
State-of-the-art algorithms used in automatic polyhedral transformation for parallelization and locality optimization typically rely on Integer Linear Programming (ILP). This poses a scalability issue when scaling to tens or hundreds of statements, and may be disconcerting in production compiler settings. In this work, we consider relaxing integrality in the ILP formulation of the Pluto algorithm, a popular algorithm used to find good affine transformations. We show that the rational solutions obtained from the relaxed LP formulation can easily be scaled to valid integral ones to obtain desired solutions, although with some caveats. We first present formal results connecting the solution of the relaxed LP to the original Pluto ILP. We then show that there are difficulties in realizing the above theoretical results in practice, and propose an alternate approach to overcome those while still leveraging linear programming. Our new approach obtains dramatic compile-time speedups for a range of large benchmarks. While achieving these compile-time improvements, we show that the performance of the transformed code is not sacrificed. Our approach to automatic transformation provides a mean compilation time improvement of 5.6× over state-of-the-art on relevant challenging benchmarks from the NAS PB, SPEC CPU 2006, and PolyBench suites. We also came across situations where prior frameworks failed to find a transformation in a reasonable amount of time, while our new approach did so instantaneously.
DOI: https://doi.org/10.1145/3192366.3192401 (published 2018-06-11)
Citations: 13
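
The scaling step mentioned in the abstract above (turning a rational solution of the relaxed LP into an integral one) can be illustrated with a minimal sketch. It assumes the rational solution is already available as exact fractions; the function name `scale_to_integral` and the sample coefficients are hypothetical, and this is not the authors' implementation.

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9+


def scale_to_integral(solution):
    """Scale a rational vector by the LCM of its denominators.

    The result points in the same direction as the input, so any
    homogeneous linear inequality satisfied by the input still holds.
    """
    factor = lcm(*(x.denominator for x in solution))
    return [int(x * factor) for x in solution]


# Hypothetical rational coefficients, as an LP solver might return them.
rational = [Fraction(1, 2), Fraction(1, 3), Fraction(0)]
print(scale_to_integral(rational))  # [3, 2, 0]
```

Positive scaling only preserves constraints in this simple way when they are homogeneous; the sketch says nothing about the caveats the abstract alludes to.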
Crellvm: verified credible compilation for LLVM
Jeehoon Kang, Yoonseung Kim, Youngju Song, Juneyoung Lee, Sanghoon Park, Mark Dongyeon Shin, Y. Kim, Sungkeun Cho, Joonwon Choi, C. Hur, K. Yi
Production compilers such as GCC and LLVM are large complex software systems, for which achieving a high level of reliability is hard. Although testing is an effective method for finding bugs, it alone cannot guarantee a high level of reliability. To provide a higher level of reliability, many approaches that examine compilers' internal logics have been proposed. However, none of them have been successfully applied to major optimizations of production compilers. This paper presents Crellvm: a verified credible compilation framework for LLVM, which can be used as a systematic way of providing a high level of reliability for major optimizations in LLVM. Specifically, we augment an LLVM optimizer to generate translation results together with their correctness proofs, which can then be checked by a proof checker formally verified in Coq. As case studies, we applied our approach to two major optimizations of LLVM: register promotion mem2reg and global value numbering gvn, having found four new miscompilation bugs (two in each).
DOI: https://doi.org/10.1145/3192366.3192377 (published 2018-06-11)
Citations: 27
Spatial: a language and compiler for application accelerators
D. Koeplinger, Matthew Feldman, R. Prabhakar, Yaqi Zhang, Stefan Hadjis, Ruben Fiszel, Tian Zhao, Luigi Nardi, A. Pedram, C. Kozyrakis, K. Olukotun
Industry is increasingly turning to reconfigurable architectures like FPGAs and CGRAs for improved performance and energy efficiency. Unfortunately, adoption of these architectures has been limited by their programming models. HDLs lack abstractions for productivity and are difficult to target from higher level languages. HLS tools are more productive, but offer an ad-hoc mix of software and hardware abstractions which make performance optimizations difficult. In this work, we describe a new domain-specific language and compiler called Spatial for higher level descriptions of application accelerators. We describe Spatial's hardware-centric abstractions for both programmer productivity and design performance, and summarize the compiler passes required to support these abstractions, including pipeline scheduling, automatic memory banking, and automated design tuning driven by active machine learning. We demonstrate the language's ability to target FPGAs and CGRAs from common source code. We show that applications written in Spatial are, on average, 42% shorter and achieve a mean speedup of 2.9x over SDAccel HLS when targeting a Xilinx UltraScale+ VU9P FPGA on an Amazon EC2 F1 instance.
DOI: https://doi.org/10.1145/3192366.3192379 (published 2018-06-11)
Citations: 155
Bayonet: probabilistic inference for networks
Timon Gehr, Sasa Misailovic, Petar Tsankov, L. Vanbever, Pascal Wiesmann, Martin T. Vechev
Network operators often need to ensure that important probabilistic properties are met, such as that the probability of network congestion is below a certain threshold. Ensuring such properties is challenging and requires both a suitable language for probabilistic networks and an automated procedure for answering probabilistic inference queries. We present Bayonet, a novel approach that consists of: (i) a probabilistic network programming language and (ii) a system that performs probabilistic inference on Bayonet programs. The key insight behind Bayonet is to phrase the problem of probabilistic network reasoning as inference in existing probabilistic languages. As a result, Bayonet directly leverages existing probabilistic inference systems and offers a flexible and expressive interface to operators. We present a detailed evaluation of Bayonet on common network scenarios, such as network congestion, reliability of packet delivery, and others. Our results indicate that Bayonet can express such practical scenarios and answer queries for realistic topology sizes (with up to 30 nodes).
DOI: https://doi.org/10.1145/3192366.3192400 (published 2018-06-11)
Citations: 21
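
As a rough illustration of the kind of query the Bayonet abstract describes (reliability of packet delivery), the sketch below computes an exact delivery probability on a toy network. Everything here is an assumption made for illustration: the topology, the node names, the probabilities, and the function `delivery_probability`. It is plain Python, not Bayonet syntax, and it only handles acyclic forwarding.

```python
from fractions import Fraction

# Hypothetical topology: each node maps to a list of (next_hop, probability).
# S0 picks one of two paths uniformly at random; the link out of S2 is lossy.
NETWORK = {
    "H0": [("S0", Fraction(1))],
    "S0": [("S1", Fraction(1, 2)), ("S2", Fraction(1, 2))],
    "S1": [("H1", Fraction(1))],
    "S2": [("H1", Fraction(999, 1000)), ("DROP", Fraction(1, 1000))],
}


def delivery_probability(node, target="H1"):
    """Exact probability that a packet at `node` reaches `target`.

    Assumes the forwarding graph has no cycles; nodes without an entry
    in NETWORK (such as "DROP") are treated as dead ends.
    """
    if node == target:
        return Fraction(1)
    if node not in NETWORK:
        return Fraction(0)
    return sum(
        (p * delivery_probability(nxt, target) for nxt, p in NETWORK[node]),
        Fraction(0),
    )


print(delivery_probability("H0"))  # 1999/2000
```

An operator-style check would then compare the result against a threshold, for example `delivery_probability("H0") >= Fraction(99, 100)`.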
HHVM JIT: a profile-guided, region-based compiler for PHP and Hack
Guilherme Ottoni
Dynamic languages such as PHP, JavaScript, Python, and Ruby have been gaining popularity over the last two decades. A very popular domain for these languages is web development, including server-side development of large-scale websites. As a result, improving the performance of these languages has become more important. Efficiently compiling programs in these languages is challenging, and many popular dynamic languages still lack efficient production-quality implementations. This paper describes the design of the second generation of the HHVM JIT and how it addresses the challenges to efficiently execute PHP and Hack programs. This new design uses profiling to build an aggressive region-based JIT compiler. We discuss the benefits of this approach compared to the more popular method-based and trace-based approaches to compile dynamic languages. Our evaluation running a very large PHP-based code base, the Facebook website, demonstrates the effectiveness of the new JIT design.
DOI: https://doi.org/10.1145/3192366.3192374 (published 2018-06-11)
Citations: 41
Probabilistic programming with programmable inference
Vikash K. Mansinghka, Ulrich Schaechtle, Shivam Handa, Alexey Radul, Yutian Chen, M. Rinard
We introduce inference metaprogramming for probabilistic programming languages, including new language constructs, a formalism, and the first demonstration of effectiveness in practice. Instead of relying on rigid black-box inference algorithms hard-coded into the language implementation as in previous probabilistic programming languages, inference metaprogramming enables developers to 1) dynamically decompose inference problems into subproblems, 2) apply inference tactics to subproblems, 3) alternate between incorporating new data and performing inference over existing data, and 4) explore multiple execution traces of the probabilistic program at once. Implemented tactics include gradient-based optimization, Markov chain Monte Carlo, variational inference, and sequential Monte Carlo techniques. Inference metaprogramming enables the concise expression of probabilistic models and inference algorithms across diverse fields, such as computer vision, data science, and robotics, within a single probabilistic programming language.
DOI: https://doi.org/10.1145/3192366.3192409 (published 2018-06-11)
Citations: 41
CURD: a dynamic CUDA race detector
Yuanfeng Peng, Vinod Grover, Joseph Devietti
As GPUs have become an integral part of nearly every processor, GPU programming has become increasingly popular. GPU programming requires a combination of extreme levels of parallelism and low-level programming, making it easy for concurrency bugs such as data races to arise. These concurrency bugs can be extremely subtle and difficult to debug due to the massive numbers of threads running concurrently on a modern GPU. While some tools exist to detect data races in GPU programs, they are often prohibitively slow or focused only on a small class of data races in shared memory. Compared to prior work, our race detector, CURD, can detect data races precisely on both shared and global memory, selects an appropriate race detection algorithm based on the synchronization used in a program, and utilizes efficient compiler instrumentation to reduce performance overheads. Across 53 benchmarks, we find that using CURD incurs an average slowdown of just 2.88x over native execution. CURD is 2.1x faster than Nvidia’s CUDA-Racecheck race detector, despite detecting a much broader class of races. CURD finds 35 races across our benchmarks, including bugs in established benchmark suites and in sample programs from Nvidia.
DOI: https://doi.org/10.1145/3192366.3192368 (published 2018-06-11)
Citations: 23
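
To make the notion of a data race in the CURD abstract concrete, here is a minimal sketch of a barrier-interval check over a recorded access trace. It is not CURD's algorithm: the trace format, the function name `find_races`, and the assumption that every thread takes part in every barrier (as with a block-wide barrier inside a single thread block) are all hypothetical simplifications.

```python
from collections import defaultdict


def find_races(trace):
    """Return the sorted addresses that have a conflicting access pair.

    trace: list of events, each either ("barrier",) or
    (thread_id, "r" | "w", address). Two accesses conflict when they hit
    the same address from different threads, at least one is a write,
    and no barrier separates them.
    """
    races = set()
    readers = defaultdict(set)  # address -> threads that read it this interval
    writers = defaultdict(set)  # address -> threads that wrote it this interval
    for event in trace:
        if event[0] == "barrier":
            # Accesses before the barrier are ordered before those after it.
            readers.clear()
            writers.clear()
            continue
        tid, op, addr = event
        other_writers = writers[addr] - {tid}
        other_readers = readers[addr] - {tid}
        if other_writers or (op == "w" and other_readers):
            races.add(addr)
        (writers if op == "w" else readers)[addr].add(tid)
    return sorted(races)


trace = [
    (0, "w", 0x10), (1, "r", 0x10),  # unsynchronized write/read: a race
    ("barrier",),
    (0, "w", 0x20),
    ("barrier",),
    (1, "r", 0x20),                  # ordered by the barrier: no race
]
print(find_races(trace))  # [16]
```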
Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation
DOI: https://doi.org/10.1145/3192366 (published 2018-06-11)
Citations: 3
Bounding data races in space and time
Stephen Dolan, K. Sivaramakrishnan, Anil Madhavapeddy
We propose a new semantics for shared-memory parallel programs that gives strong guarantees even in the presence of data races. Our local data race freedom property guarantees that all data-race-free portions of programs exhibit sequential semantics. We provide a straightforward operational semantics and an equivalent axiomatic model, and evaluate an implementation for the OCaml programming language. Our evaluation demonstrates that it is possible to balance a comprehensible memory model with a reasonable (no overhead on x86, ~0.6% on ARM) sequential performance trade-off in a mainstream programming language.
DOI: https://doi.org/10.1145/3192366.3192421 (published 2018-06-11)
Citations: 39
Inferring crypto API rules from code changes
Rumen Paletov, Petar Tsankov, Veselin Raychev, Martin T. Vechev
Creating and maintaining an up-to-date set of security rules that match misuses of crypto APIs is challenging, as crypto APIs constantly evolve over time with new cryptographic primitives and settings, making existing ones obsolete. To address this challenge, we present a new approach to extract security fixes from thousands of code changes. Our approach consists of: (i) identifying code changes, which often capture security fixes, (ii) an abstraction that filters irrelevant code changes (such as refactorings), and (iii) a clustering analysis that reveals commonalities between semantic code changes and helps in eliciting security rules. We applied our approach to the Java Crypto API and showed that it is effective: (i) our abstraction effectively filters non-semantic code changes (over 99% of all changes) without removing security fixes, and (ii) over 80% of the code changes are security fixes identifying security rules. Based on our results, we identified 13 rules, including new ones not supported by existing security checkers.
DOI: https://doi.org/10.1145/3192366.3192403 (published 2018-06-11)
Citations: 38
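
Step (ii) of the approach in the abstract above, an abstraction that filters out irrelevant changes such as refactorings, can be sketched very loosely as follows. The abstraction used here (the set of string arguments passed to `Cipher.getInstance`, found with a regular expression) is a hypothetical stand-in chosen for brevity, not the abstraction from the paper, and the Java snippets are invented examples.

```python
import re

# Hypothetical abstraction: only the transformation strings passed to
# Cipher.getInstance are kept; names, layout, and comments are ignored.
CALL = re.compile(r'Cipher\.getInstance\(\s*"([^"]*)"')


def abstract(source):
    """Map Java source text to the set of cipher transformations it uses."""
    return frozenset(CALL.findall(source))


def is_semantic_change(old_source, new_source):
    """A change is kept only if it alters the abstracted crypto usage."""
    return abstract(old_source) != abstract(new_source)


old = 'Cipher c = Cipher.getInstance("AES/ECB/PKCS5Padding");'
renamed = 'Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");'
changed = 'Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");'

print(is_semantic_change(old, renamed))      # False: a rename is filtered out
print(is_semantic_change(renamed, changed))  # True: the cipher mode changed
```

Changes that survive such a filter would then be the input to the clustering step the abstract describes.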