
Latest publications from the Proceedings of the 26th International Conference on Compiler Construction

Static optimization in PHP 7
Pub Date : 2017-02-05 DOI: 10.1145/3033019.3033026
N. Popov, Biagio Cosenza, B. Juurlink, Dmitry Stogov
PHP is a dynamically typed programming language commonly used for the server-side implementation of web applications. Approachability and ease of deployment have made PHP one of the most widely used scripting languages for the web, powering important web applications such as WordPress, Wikipedia, and Facebook. PHP's highly dynamic nature, while providing useful language features, also makes it hard to optimize statically. This paper reports on the implementation of purely static bytecode optimizations for PHP 7, the latest major version of PHP. We discuss the challenge of integrating classical compiler optimizations, which have been developed in the context of statically-typed languages, into a programming language that is dynamically and weakly typed, and supports a plethora of dynamic language features. Based on a careful analysis of language semantics, we adapt static single assignment (SSA) form for use in PHP. Combined with type inference, this allows type-based specialization of instructions, as well as the application of various classical SSA-enabled compiler optimizations such as constant propagation or dead code elimination. We evaluate the impact of the proposed static optimizations on a wide collection of programs, including micro-benchmarks, libraries and web frameworks. Despite the dynamic nature of PHP, our approach achieves an average speedup of 50% on micro-benchmarks, 13% on computationally intensive libraries, as well as 1.1% (MediaWiki) and 3.5% (WordPress) on web applications.
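The SSA-enabled constant propagation the abstract mentions can be illustrated with a small sketch (not the authors' PHP implementation): because each SSA variable is assigned exactly once, a single forward pass can fold operations whose operands are all known constants. The instruction format below is invented for illustration.

```python
# Illustrative sketch of constant propagation over a toy SSA-form
# instruction list. Each instruction is (dest, op, args); args are
# SSA variable names (str) or integer literals.

def propagate_constants(instrs):
    """Fold operations whose operands are all known constants."""
    known = {}   # SSA variable -> constant value
    out = []
    for dest, op, args in instrs:
        # Substitute operands already known to be constant.
        vals = [known.get(a, a) if isinstance(a, str) else a for a in args]
        if op == "const":
            known[dest] = vals[0]
            out.append((dest, "const", vals))
        elif op == "add" and all(isinstance(v, int) for v in vals):
            known[dest] = vals[0] + vals[1]          # fold at compile time
            out.append((dest, "const", [known[dest]]))
        else:
            out.append((dest, op, vals))
    return out

prog = [
    ("a1", "const", [2]),
    ("b1", "const", [3]),
    ("c1", "add", ["a1", "b1"]),   # folds to const 5
    ("d1", "add", ["c1", "x0"]),   # x0 unknown: only c1 is substituted
]
print(propagate_constants(prog))
```

In a real compiler this pass would run on the SSA graph together with type inference; the single-assignment property is what makes the one-pass `known` map sound.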
Citations: 3
From functional programs to pipelined dataflow circuits
Pub Date : 2017-02-05 DOI: 10.1145/3033019.3033027
Richard Townsend, Martha A. Kim, S. Edwards
We present a translation from programs expressed in a functional IR into dataflow networks as an intermediate step within a Haskell-to-Hardware compiler. Our networks exploit pipeline parallelism, particularly across multiple tail-recursive calls, via non-strict function evaluation. To handle the long-latency memory operations common to our target applications, we employ a latency-insensitive methodology that ensures arbitrary delays do not change the functionality of the circuit. We present empirical results comparing our networks against their strict counterparts, showing that non-strictness can mitigate small increases in memory latency and improve overall performance by up to 2×.
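The pipeline parallelism described here can be sketched in software with lazy streams standing in for hardware dataflow stages (the real system generates circuits, not Python): each stage consumes tokens as they arrive, so later tokens enter stage one while earlier tokens are still in stage two. The stage functions are invented for illustration.

```python
# Minimal dataflow-pipeline sketch using generators: tokens flow
# through the stages one at a time rather than strictly, mirroring
# non-strict evaluation across pipeline stages.

def stage(fn, upstream):
    for token in upstream:
        yield fn(token)

def pipeline(source, *fns):
    """Chain stages so each one pulls tokens lazily from the previous."""
    stream = iter(source)
    for fn in fns:
        stream = stage(fn, stream)
    return stream

# Two stages: square, then increment.
out = list(pipeline(range(4), lambda x: x * x, lambda x: x + 1))
print(out)  # [1, 2, 5, 10]
```

In hardware the same structure is latency-insensitive: a handshake between stages lets each tolerate arbitrary upstream delay, which is what the paper's methodology guarantees.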
Citations: 30
Let it recover: multiparty protocol-induced recovery
Pub Date : 2017-02-05 DOI: 10.1145/3033019.3033031
R. Neykova, N. Yoshida
Fault-tolerant communication systems rely on recovery strategies which are often error-prone (e.g. a programmer manually specifies recovery strategies) or inefficient (e.g. the whole system is restarted from the beginning). This paper proposes a static analysis based on multiparty session types that can efficiently compute a safe global state from which a system of interacting processes should be recovered. We statically analyse the communication flow of a program, given as a multiparty protocol, to extract the causal dependencies between processes and to localise failures. We formalise our recovery algorithm and prove its safety. A recovered communication system is free from deadlocks, orphan messages and reception errors. Our recovery algorithm incurs less communication cost (only affected processes are notified) and overall execution time (only required states are repeated). On top of our analysis, we design and implement a runtime framework in Erlang where failed processes and their dependencies are soundly restarted from a computed safe state. We evaluate our recovery framework on message-passing benchmarks and a use case for crawling webpages. The experimental results indicate our framework outperforms a built-in static recovery strategy in Erlang when a part of the protocol can be safely recovered.
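The key computation, restarting only the processes causally affected by a failure, reduces to reachability over the causal dependency graph extracted from the protocol. The sketch below is a hedged illustration of that idea (the paper's analysis works on multiparty session types, not an explicit graph); the dependency edges are invented.

```python
# Illustrative sketch: given causal dependencies between processes,
# compute the least set that must be restarted after one fails.
# Processes outside this set keep running.

def affected(deps, failed):
    """deps: dict mapping a process to the processes that causally
    depend on it. Returns all processes transitively affected."""
    seen, work = set(), [failed]
    while work:
        p = work.pop()
        if p in seen:
            continue
        seen.add(p)
        work.extend(deps.get(p, []))
    return seen

deps = {"A": ["B"], "B": ["C"], "D": []}   # D is independent of A
print(sorted(affected(deps, "A")))          # ['A', 'B', 'C']
```

Restarting only this set is what gives the claimed savings in communication cost and execution time: `D` is never notified, and no completed independent state is repeated.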
Citations: 65
Data structure-aware heap partitioning
Pub Date : 2017-02-05 DOI: 10.1145/3033019.3033030
Nouraldin Jaber, Milind Kulkarni
There are many applications of program (or heap) partitioning, such as computation offloading, region-based memory management, and OS-driven memory locality optimizations. Although these applications are conceptually different, fundamentally, they must generate code such that objects in the heap (and hence the code that operates on those objects) get partitioned depending on how those objects are used. Regardless of the intended application goal, the granularity at which the heap is partitioned is the key factor in partition quality, and hence it needs to be carefully chosen. Previous work suggested two main granularities: class-based and allocation site-based, where objects from the same class (or those allocated at the same allocation site) are co-located. Both approaches share a critical drawback: data structures that are used in different ways can share the same class, or the same allocation sites for internal objects, and hence are forced to be co-located despite their different usage patterns. We introduce the notion of data structure-aware partitioning to allow different data structures to be placed in different partitions, even by existing tools and analyses that inherently operate in a class-based or allocation site-based manner. Our strategy consists of an analysis that infers ownership properties between objects to identify data structures, and a code generation phase that encodes this ownership information into objects' data types and allocation sites without changing the semantics of the code. We evaluate the quality of data structure-aware partitions by comparing it to the state-of-the-art allocation site-based partitioning on a subset of the DaCapo Benchmarks. Across a set of randomized trials, we had a median range of 5% to 25% reduction of cross-partition accesses, and, depending on partitioning decisions, up to a 95% reduction.
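The grouping step can be pictured with a union-find over ownership edges (a hedged sketch, not the paper's analysis): objects connected by ownership merge into one data structure, so two container instances of the same class still land in different partitions when their ownership edges are disjoint. All object names below are hypothetical.

```python
# Illustrative sketch: merge objects into data-structure partitions
# along inferred ownership edges using union-find.

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

# Ownership edges: each list header owns its own internal nodes.
edges = [("list1", "node1"), ("list1", "node2"), ("list2", "node3")]
uf = UnionFind()
for owner, owned in edges:
    uf.union(owner, owned)

partitions = {}
for obj in {o for e in edges for o in e}:
    partitions.setdefault(uf.find(obj), set()).add(obj)
print(sorted(map(sorted, partitions.values())))
```

Note that `node1` and `node3` could come from the same class and allocation site; the ownership edges, not the types, are what keep the two lists in separate partitions.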
Citations: 3
Granullar: gradual nullable types for Java
Pub Date : 2017-02-05 DOI: 10.1145/3033019.3033032
D. Brotherston, Werner Dietl, O. Lhoták
Object-oriented languages like Java and C# allow the null value for all references. This supports many flexible patterns, but has led to many errors, security vulnerabilities, and system crashes. Static type systems can prevent null-pointer exceptions at compile time, but require annotations, in particular for used libraries. Conservative defaults choose the most restrictive typing, preventing many errors, but requiring a large annotation effort. Liberal defaults choose the most flexible typing, requiring fewer annotations, but giving weaker guarantees. Trusted annotations can be provided, but are not checked and require a large manual effort. None of these approaches provide a strong guarantee that the checked part of the program is isolated from the unchecked part: even with conservative defaults, null-pointer exceptions can occur in the checked part. This paper presents Granullar, a gradual type system for null-safety. Developers start out verifying null-safety for the most important components of their applications. At the boundary to unchecked components, runtime checks are inserted by Granullar to guard the verified system from being polluted by unexpected null values. This ensures that null-pointer exceptions can only occur within the unchecked code or at the boundary to checked code; the checked code is free of null-pointer exceptions. We present Granullar for Java, define the checked-unchecked boundary, and how runtime checks are generated. We evaluate our approach on real world software annotated for null-safety. We demonstrate the runtime checks, and acceptable compile-time and run-time performance impacts. Granullar enables combining a checked core with untrusted libraries in a safe manner, improving on the practicality of such a system.
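The boundary-check idea can be sketched as a wrapper that guards every value crossing from unchecked into checked code (a hedged illustration in Python, not Granullar's generated Java checks): any null is rejected at the boundary, so checked code never observes one.

```python
# Illustrative sketch of a checked/unchecked boundary guard: values
# entering checked code are tested for null at the crossing point,
# so a null-pointer error surfaces at the boundary, not inside
# verified code. The decorator is invented for illustration.

def checked_boundary(fn):
    def wrapper(*args):
        for a in args:
            if a is None:
                raise TypeError(f"null crossed into checked {fn.__name__}")
        return fn(*args)
    return wrapper

@checked_boundary
def length(s):        # checked code: may assume s is non-null
    return len(s)

print(length("abc"))   # 3
try:
    length(None)       # an unchecked caller passes null
except TypeError as e:
    print("caught:", e)
```

The error is raised before `len` ever runs, which mirrors the paper's guarantee that null-pointer exceptions occur only in unchecked code or at the boundary.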
Citations: 8
Optimization space pruning without regrets
Pub Date : 2017-02-05 DOI: 10.1145/3033019.3033023
Ulysse Beaugnon, A. Pouille, Marc Pouzet, J. Pienaar, Albert Cohen
Many computationally-intensive algorithms benefit from the wide parallelism offered by Graphics Processing Units (GPUs). However, the search for a close-to-optimal implementation remains extremely tedious due to the specialization and complexity of GPU architectures. We present a novel approach to automatically discover the best performing code from a given set of possible implementations. It involves a branch and bound algorithm with two distinctive features: (1) an analytic performance model of a lower bound on the execution time, and (2) the ability to estimate such bounds on a partially-specified implementation. The unique features of this performance model allow aggressive pruning of the optimization space without eliminating the best performing implementation. While the space considered in this paper focuses on GPUs, the approach is generic enough to be applied to other architectures. We implemented our algorithm in a tool called Telamon and demonstrate its effectiveness on a huge, architecture-specific and input-sensitive optimization space. The information provided by the performance model also helps to identify ways to enrich the search space to consider better candidates, or to highlight architectural bottlenecks.
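The "without regrets" property follows from branch and bound with a true lower bound: a partial implementation is pruned only when its bound already matches or exceeds the best complete time found, so the optimum is never eliminated. The sketch below illustrates the mechanism on a toy additive cost model (Telamon's performance model and search space are not reproduced).

```python
# Branch-and-bound sketch: explore per-decision options, pruning any
# partial assignment whose lower bound cannot beat the best complete
# implementation found so far.

def branch_and_bound(choices, cost, lower_bound):
    """choices: list of option lists, one per decision.
    cost/lower_bound score a (possibly partial) assignment."""
    best = (float("inf"), None)

    def explore(partial):
        nonlocal best
        if len(partial) == len(choices):
            c = cost(partial)
            if c < best[0]:
                best = (c, partial)
            return
        for opt in choices[len(partial)]:
            cand = partial + [opt]
            if lower_bound(cand) < best[0]:   # prune without regret
                explore(cand)

    explore([])
    return best

# Toy model: total time is the sum of per-decision times, so the sum
# of the already-fixed times is a valid lower bound.
choices = [[3, 1], [4, 2]]
print(branch_and_bound(choices, sum, sum))  # (3, [1, 2])
```

Soundness rests entirely on the bound never overestimating: with an analytic lower bound on execution time, every pruned subtree provably contains no implementation faster than the incumbent.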
Citations: 15
Compile-time function memoization
Pub Date : 2017-02-05 DOI: 10.1145/3033019.3033024
Arjun Suresh, Erven Rohou, André Seznec
Memoization is the technique of saving the results of computations so that future executions can be omitted when the same inputs repeat. Recent work showed that memoization can be applied to dynamically linked pure functions using a load-time technique and results were encouraging for the demonstrated transcendental functions. A restriction of the proposed framework was that memoization was restricted only to dynamically linked functions and the functions must be determined beforehand. In this work, we propose function memoization using a compile-time technique thus extending the scope of memoization to user defined functions as well as making it transparently applicable to any dynamically linked functions. Our compile-time technique allows static linking of memoization code and this increases the benefit due to memoization by leveraging the inlining capability for the memoization wrapper. Our compile-time analysis can also handle functions with pointer parameters, and we handle constants more efficiently. Instruction set support can also be considered, and we propose associated hardware leading to additional performance gain.
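The underlying mechanism is easy to show in miniature (the paper applies it at the compiler level to statically linked functions; this Python wrapper only mirrors the idea): a pure call with repeated arguments becomes a table lookup instead of a recomputation.

```python
# Minimal memoization sketch: cache results of a pure function keyed
# by its arguments, so repeated calls skip the computation.

import math

def memoize(fn):
    cache = {}
    def wrapper(*args):
        if args not in cache:
            cache[args] = fn(*args)   # compute once per distinct input
        return cache[args]
    wrapper.cache = cache
    return wrapper

memo_sin = memoize(math.sin)
memo_sin(1.0)
memo_sin(1.0)                 # second call is a pure table lookup
print(len(memo_sin.cache))    # 1
```

The compile-time version generates this wrapper statically, which is what lets the compiler inline it and extend memoization to user-defined functions with pointer parameters.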
Citations: 31
Dynamic symbolic execution for polymorphism
Pub Date : 2017-02-05 DOI: 10.1145/3033019.3033029
Lian Li, Yi Lu, Jingling Xue
Symbolic execution is an important program analysis technique that provides auxiliary execution semantics to execute programs with symbolic rather than concrete values. There has been much recent interest in symbolic execution for automatic test case generation and security vulnerability detection, resulting in various tools being deployed in academia and industry. Nevertheless, (subtype or dynamic) polymorphism of object-oriented programs has been neglected: existing symbolic execution techniques can explore different targets of conditional branches but not different targets of method invocations. We address the problem of how this polymorphism can be expressed in a symbolic execution framework. We propose the notion of symbolic types, which make object types symbolic. With symbolic types, various targets of a method invocation can be explored systematically by mutating the type of the receiver object of the method during automatic test case generation. To the best of our knowledge, this is the first attempt to address polymorphism in symbolic execution. Mutation of method invocation targets is critical for effectively testing object-oriented programs, especially libraries. Our experimental results show that symbolic types are significantly more effective than existing symbolic execution techniques in achieving test coverage and finding bugs and security vulnerabilities in OpenJDK.
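The receiver-mutation idea can be pictured concretely (a hedged sketch, not the paper's Java framework): instead of exploring only branch targets, test generation also enumerates the known subtypes of a receiver so every override of a virtual call is exercised. The class hierarchy below is invented for illustration.

```python
# Illustrative sketch: drive the same virtual call with each known
# subtype as the receiver's concrete type, covering every dynamic
# dispatch target of shape.area().

class Shape:
    def area(self):
        raise NotImplementedError

class Square(Shape):
    def __init__(self):
        self.side = 2
    def area(self):
        return self.side * self.side

class Circle(Shape):
    def __init__(self):
        self.r = 1
    def area(self):
        return 3.14159 * self.r * self.r

def explore_receiver_types(base):
    """Instantiate each direct subtype of `base` and make the same
    call, analogous to mutating a symbolic receiver type."""
    return {sub.__name__: sub().area() for sub in base.__subclasses__()}

print(explore_receiver_types(Shape))  # {'Square': 4, 'Circle': 3.14159}
```

A symbolic-execution engine does this systematically, treating the receiver's type as a symbolic variable constrained to the subtype lattice rather than enumerating classes by hand.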
引用次数: 8
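The core idea — forking one exploration path per concrete receiver type at a virtual call — can be sketched in a few lines of Python. This is a toy illustration only, not the paper's Java/OpenJDK implementation; the class hierarchy and the `explore_call` helper are hypothetical names.

```python
# Toy sketch of "symbolic types": when the receiver's type is symbolic,
# the explorer forks one path per concrete subtype, so every override
# of a virtual call becomes a separately explored invocation target.

class Shape:
    def describe(self):
        raise NotImplementedError

class Circle(Shape):
    def describe(self):
        return "circle"

class Square(Shape):
    def describe(self):
        return "square"

def explore_call(receiver_cls, method_name):
    """Enumerate every concrete subtype of receiver_cls and invoke the
    method on a receiver of that type, mimicking how mutating the
    receiver's type drives test-case generation."""
    paths = []
    for sub in receiver_cls.__subclasses__():
        receiver = sub()  # mutate the receiver's type for this path
        paths.append((sub.__name__, getattr(receiver, method_name)()))
    return paths

print(explore_call(Shape, "describe"))
# → [('Circle', 'circle'), ('Square', 'square')]
```

Each `(type, result)` pair corresponds to one explored target of the polymorphic call; a conventional symbolic executor with a concrete receiver would only ever take one of these paths.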
Lightweight data race detection for production runs
Pub Date : 2017-02-05 DOI: 10.1145/3033019.3033020
Swarnendu Biswas, Man Cao, Minjia Zhang, Michael D. Bond, Benjamin P. Wood
To detect data races that harm production systems, program analysis must target production runs. However, sound and precise data race detection adds too much run-time overhead for use in production systems. Even existing approaches that provide soundness or precision incur significant limitations. This work addresses the need for soundness (no missed races) and precision (no false races) by introducing novel, efficient production-time analyses that address each need separately. (1) Precise data race detection is useful for developers, who want to fix bugs but loathe false positives. We introduce a precise analysis called RaceChaser that provides low, bounded run-time overhead. (2) Sound race detection benefits analyses and tools whose correctness relies on knowledge of all potential data races. We present a sound, efficient approach called Caper that combines static and dynamic analysis to catch all data races in observed runs. RaceChaser and Caper are useful not only on their own; we introduce a framework that combines these analyses, using Caper as a sound filter for precise data race detection by RaceChaser. Our evaluation shows that RaceChaser and Caper are efficient and effective, and compare favorably with existing state-of-the-art approaches. These results suggest that RaceChaser and Caper enable practical data race detection that is precise and sound, respectively, ultimately leading to more reliable software systems.
Citations: 32
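To make concrete what a dynamic race detector records, here is a minimal lockset-style check in the spirit of the classic Eraser algorithm. It is deliberately not the RaceChaser or Caper algorithm from the paper (those details are not in the abstract); the `LocksetChecker` name and its API are invented for illustration.

```python
# Minimal lockset-style dynamic race check: for each shared variable,
# track the set of locks held on *every* access so far. If that set
# becomes empty, no single lock consistently guards the variable, which
# flags a candidate data race.

class LocksetChecker:
    def __init__(self):
        self.candidate = {}  # variable name -> set of locks held on all accesses

    def access(self, var, held_locks):
        """Record an access to `var` while `held_locks` are held.
        Returns False when the refined lockset is empty (potential race)."""
        if var not in self.candidate:
            self.candidate[var] = set(held_locks)
        else:
            self.candidate[var] &= set(held_locks)  # refine: keep common locks
        return len(self.candidate[var]) > 0

chk = LocksetChecker()
print(chk.access("x", {"m"}))    # → True: x guarded by m so far
print(chk.access("x", {"m"}))    # → True: still consistently guarded
print(chk.access("x", set()))    # → False: unguarded access, candidate race
```

Lockset analysis is unsound against happens-before ordering (it can report false positives), which illustrates exactly the precision/soundness trade-off the paper's two separate analyses are designed to address.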
One compiler: deoptimization to optimized code
Pub Date : 2017-02-05 DOI: 10.1145/3033019.3033025
Christian Wimmer, V. Jovanovic, E. Eckstein, Thomas Würthinger
A multi-tier virtual machine (VM) deoptimizes and transfers last-tier execution to the first-tier execution when a speculative optimization is invalidated. The first-tier target of deoptimization is either an interpreter or code compiled by a baseline compiler. Because such a first-tier execution uses a fixed stack frame layout, this complicates all VM components that need to walk the stack. We propose to use the optimizing compiler also to compile deoptimization target code, i.e., the non-speculative first-tier code where execution continues after a deoptimization. Deoptimization entry points are described with the same scope descriptors used to describe the origin of the deoptimization, i.e., deoptimization is a two-way matching of two scope descriptors describing the same abstract frame at the same virtual program counter. We evaluate this deoptimization approach in a high-performance JavaScript VM. It strictly uses a one-compiler approach, i.e., all frames on the stack originate from the same compiler.
Citations: 13
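The abstract's "two-way matching of two scope descriptors at the same virtual program counter" can be sketched as follows. The descriptor format and `deoptimize` function are illustrative assumptions, not the paper's actual VM data structures.

```python
# Toy sketch of scope-descriptor-based deoptimization: the same
# descriptor format describes both the deopt origin (optimized frame)
# and the deopt entry point (target frame), so transferring execution
# is just re-binding abstract slots at the same virtual pc.

ScopeDescriptor = dict  # {"pc": int, "slots": {abstract name: physical location}}

def deoptimize(origin: ScopeDescriptor, target: ScopeDescriptor, machine_state: dict):
    """Rebuild the target frame from the origin frame's machine state.
    Both descriptors must describe the same abstract frame at the same
    virtual program counter."""
    assert origin["pc"] == target["pc"], "frames must match at the same virtual pc"
    # Read each abstract slot out of the optimized frame's locations...
    abstract_frame = {name: machine_state[loc]
                      for name, loc in origin["slots"].items()}
    # ...and materialize it at the locations the target code expects.
    return {target["slots"][name]: val for name, val in abstract_frame.items()}

origin = {"pc": 7, "slots": {"x": "r1", "y": "r2"}}        # values in registers
target = {"pc": 7, "slots": {"x": "stack0", "y": "stack1"}}  # values on the stack
print(deoptimize(origin, target, {"r1": 41, "r2": 1}))
# → {'stack0': 41, 'stack1': 1}
```

Because the target side is described by an ordinary scope descriptor rather than a fixed interpreter frame layout, the same optimizing compiler can emit both sides — which is the paper's "one compiler" point.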
Journal: Proceedings of the 26th International Conference on Compiler Construction