Retrofitting effect handlers onto OCaml
K. Sivaramakrishnan, Stephen Dolan, Leo White, T. Kelly, S. Jaffer, Anil Madhavapeddy
Effect handlers have been gathering momentum as a mechanism for modular programming with user-defined effects. Effect handlers allow for non-local control flow mechanisms such as generators, async/await, lightweight threads and coroutines to be composably expressed. We present a design and evaluate a full-fledged efficient implementation of effect handlers for OCaml, an industrial-strength multi-paradigm programming language. Our implementation strives to maintain the backwards compatibility and performance profile of existing OCaml code. Retrofitting effect handlers onto OCaml is challenging since OCaml does not currently have any non-local control flow mechanisms other than exceptions. Our implementation of effect handlers for OCaml: (i) imposes a mean 1% overhead on a comprehensive macro benchmark suite that does not use effect handlers; (ii) remains compatible with program analysis tools that inspect the stack; and (iii) is efficient for new code that makes use of effect handlers.
{"title":"Retrofitting effect handlers onto OCaml","authors":"K. Sivaramakrishnan, Stephen Dolan, Leo White, T. Kelly, S. Jaffer, Anil Madhavapeddy","doi":"10.1145/3453483.3454039","DOIUrl":"https://doi.org/10.1145/3453483.3454039","url":null,"abstract":"Effect handlers have been gathering momentum as a mechanism for modular programming with user-defined effects. Effect handlers allow for non-local control flow mechanisms such as generators, async/await, lightweight threads and coroutines to be composably expressed. We present a design and evaluate a full-fledged efficient implementation of effect handlers for OCaml, an industrial-strength multi-paradigm programming language. Our implementation strives to maintain the backwards compatibility and performance profile of existing OCaml code. Retrofitting effect handlers onto OCaml is challenging since OCaml does not currently have any non-local control flow mechanisms other than exceptions. Our implementation of effect handlers for OCaml: (i) imposes a mean 1% overhead on a comprehensive macro benchmark suite that does not use effect handlers; (ii) remains compatible with program analysis tools that inspect the stack; and (iii) is efficient for new code that makes use of effect handlers.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"67 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77291714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unleashing the hidden power of compiler optimization on binary code difference: an empirical study
Xiaolei Ren, Michael Ho, Jiang Ming, Yu Lei, Li Li
Hunting binary code differences without source code (i.e., binary diffing) has compelling applications in software security. Due to the high variability of binary code, existing solutions have been driven towards measuring semantic similarities between syntactically different code. Since compiler optimization is the most common source of syntactic differences in binary code, testing resilience against the changes caused by different compiler optimization settings has become a standard evaluation step for most binary diffing approaches. For example, 47 top-venue papers in the last 12 years compared different program versions compiled at default optimization levels (e.g., -Ox in GCC and LLVM). Although many of them claim to be immune to compiler transformations, their resistance to non-default optimization settings remains unclear. In particular, we have observed adversaries exploring non-default compiler settings to amplify malware differences. This paper takes a first step toward systematically studying the effectiveness of compiler optimization on binary code differences. We tailor search-based iterative compilation to the auto-tuning of binary code differences, and we develop BinTuner to search for near-optimal optimization sequences that maximize the amount of binary code difference. We run BinTuner with GCC 10.2 and LLVM 11.0 on SPEC benchmarks (CPU2006 & CPU2017), Coreutils, and OpenSSL. Our experiments show that, at the cost of 279 to 1,881 compilation iterations, BinTuner can find custom optimization sequences that are substantially better than the general -Ox settings. BinTuner's outputs seriously undermine the comparisons made by prominent binary diffing tools. In addition, the detection rate of IoT malware variants tuned by BinTuner falls by more than 50%. Our findings paint a cautionary tale for security analysts: attackers have a new way to mutate malware code cost-effectively, and the research community needs to step back and reassess optimization-resistance evaluations.
{"title":"Unleashing the hidden power of compiler optimization on binary code difference: an empirical study","authors":"Xiaolei Ren, Michael Ho, Jiang Ming, Yu Lei, Li Li","doi":"10.1145/3453483.3454035","DOIUrl":"https://doi.org/10.1145/3453483.3454035","url":null,"abstract":"Hunting binary code difference without source code (i.e., binary diffing) has compelling applications in software security. Due to the high variability of binary code, existing solutions have been driven towards measuring semantic similarities from syntactically different code. Since compiler optimization is the most common source contributing to binary code differences in syntax, testing the resilience against the changes caused by different compiler optimization settings has become a standard evaluation step for most binary diffing approaches. For example, 47 top-venue papers in the last 12 years compared different program versions compiled by default optimization levels (e.g., -Ox in GCC and LLVM). Although many of them claim they are immune to compiler transformations, it is yet unclear about their resistance to non-default optimization settings. Especially, we have observed that adversaries explored non-default compiler settings to amplify malware differences. This paper takes the first step to systematically studying the effectiveness of compiler optimization on binary code differences. We tailor search-based iterative compilation for the auto-tuning of binary code differences. We develop BinTuner to search near-optimal optimization sequences that can maximize the amount of binary code differences. We run BinTuner with GCC 10.2 and LLVM 11.0 on SPEC benchmarks (CPU2006 & CPU2017), Coreutils, and OpenSSL. Our experiments show that at the cost of 279 to 1,881 compilation iterations, BinTuner can find custom optimization sequences that are substantially better than the general -Ox settings. BinTuner's outputs seriously undermine prominent binary diffing tools' comparisons. In addition, the detection rate of the IoT malware variants tuned by BinTuner falls by more than 50%. Our findings paint a cautionary tale for security analysts that attackers have a new way to mutate malware code cost-effectively, and the research community needs to step back to reassess optimization-resistance evaluations.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"34 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73442256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Repairing serializability bugs in distributed database programs via automated schema refactoring
Kia Rahmani, Kartik Nagar, Benjamin Delaware, S. Jagannathan
Serializability is a well-understood concurrency control mechanism that eases reasoning about highly-concurrent database programs. Unfortunately, enforcing serializability has a high performance cost, especially on geographically distributed database clusters. Consequently, many databases allow programmers to choose when a transaction must be executed under serializability, with the expectation that transactions would only be so marked when necessary to avoid serious concurrency bugs. However, this is a significant burden to impose on developers, requiring them to (a) reason about subtle concurrent interactions among potentially interfering transactions, (b) determine when such interactions would violate desired invariants, and (c) then identify the minimum number of transactions whose executions should be serialized to prevent these violations. To mitigate this burden, this paper presents a sound fully-automated schema refactoring procedure that refactors a program’s data layout – rather than its concurrency control logic – to eliminate statically identified concurrency bugs, allowing more transactions to be safely executed under weaker and more performant database guarantees. Experimental results over a range of realistic database benchmarks indicate that our approach is highly effective in eliminating concurrency bugs, with safe refactored programs showing an average of 120% higher throughput and 45% lower latency compared to a serialized baseline.
{"title":"Repairing serializability bugs in distributed database programs via automated schema refactoring","authors":"Kia Rahmani, Kartik Nagar, Benjamin Delaware, S. Jagannathan","doi":"10.1145/3453483.3454028","DOIUrl":"https://doi.org/10.1145/3453483.3454028","url":null,"abstract":"Serializability is a well-understood concurrency control mechanism that eases reasoning about highly-concurrent database programs. Unfortunately, enforcing serializability has a high performance cost, especially on geographically distributed database clusters. Consequently, many databases allow programmers to choose when a transaction must be executed under serializability, with the expectation that transactions would only be so marked when necessary to avoid serious concurrency bugs. However, this is a significant burden to impose on developers, requiring them to (a) reason about subtle concurrent interactions among potentially interfering transactions, (b) determine when such interactions would violate desired invariants, and (c) then identify the minimum number of transactions whose executions should be serialized to prevent these violations. To mitigate this burden, this paper presents a sound fully-automated schema refactoring procedure that refactors a program’s data layout – rather than its concurrency control logic – to eliminate statically identified concurrency bugs, allowing more transactions to be safely executed under weaker and more performant database guarantees. Experimental results over a range of realistic database benchmarks indicate that our approach is highly effective in eliminating concurrency bugs, with safe refactored programs showing an average of 120% higher throughput and 45% lower latency compared to a serialized baseline.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"61 27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90164979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RbSyn: type- and effect-guided program synthesis
Sankha Narayan Guria, J. Foster, David Van Horn
In recent years, researchers have explored component-based synthesis, which aims to automatically construct programs that operate by composing calls to existing APIs. However, prior work has not considered efficient synthesis of methods with side effects, e.g., web app methods that update a database. In this paper, we introduce RbSyn, a novel type- and effect-guided synthesis tool for Ruby. An RbSyn synthesis goal is specified as the type for the target method and a series of test cases it must pass. RbSyn works by recursively generating well-typed candidate method bodies whose write effects match the read effects of the test case assertions. After finding a set of candidates that separately satisfy each test, RbSyn synthesizes a solution that branches to execute the correct candidate code under the appropriate conditions. We formalize RbSyn on a core, object-oriented language λsyn and describe how the key ideas of the model are scaled up in our implementation for Ruby. We evaluated RbSyn on 19 benchmarks, 12 of which come from popular, open-source Ruby apps. We found that RbSyn synthesizes correct solutions for all benchmarks, with 15 benchmarks synthesizing in under 9 seconds, while the slowest benchmark takes 83 seconds. Using observed reads to guide synthesis is effective: using type guidance alone times out on 10 of 12 app benchmarks. We also found that using less precise effect annotations leads to worse synthesis performance. In summary, we believe type- and effect-guided synthesis is an important step forward in synthesis of effectful methods from test cases.
{"title":"RbSyn: type- and effect-guided program synthesis","authors":"Sankha Narayan Guria, J. Foster, David Van Horn","doi":"10.1145/3453483.3454048","DOIUrl":"https://doi.org/10.1145/3453483.3454048","url":null,"abstract":"In recent years, researchers have explored component-based synthesis, which aims to automatically construct programs that operate by composing calls to existing APIs. However, prior work has not considered efficient synthesis of methods with side effects, e.g., web app methods that update a database. In this paper, we introduce RbSyn, a novel type- and effect-guided synthesis tool for Ruby. An RbSyn synthesis goal is specified as the type for the target method and a series of test cases it must pass. RbSyn works by recursively generating well-typed candidate method bodies whose write effects match the read effects of the test case assertions. After finding a set of candidates that separately satisfy each test, RbSyn synthesizes a solution that branches to execute the correct candidate code under the appropriate conditions. We formalize RbSyn on a core, object-oriented language λsyn and describe how the key ideas of the model are scaled-up in our implementation for Ruby. We evaluated RbSyn on 19 benchmarks, 12 of which come from popular, open-source Ruby apps. We found that RbSyn synthesizes correct solutions for all benchmarks, with 15 benchmarks synthesizing in under 9 seconds, while the slowest benchmark takes 83 seconds. Using observed reads to guide synthesize is effective: using type-guidance alone times out on 10 of 12 app benchmarks. We also found that using less precise effect annotations leads to worse synthesis performance. In summary, we believe type- and effect-guided synthesis is an important step forward in synthesis of effectful methods from test cases.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"48 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87676583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Porcupine: a synthesizing compiler for vectorized homomorphic encryption
M. Cowan, Deeksha Dangwal, Armin Alaghi, Caroline Trippel, Vincent T. Lee, Brandon Reagen
Homomorphic encryption (HE) is a privacy-preserving technique that enables computation directly on encrypted data. Despite its promise, HE has seen limited use due to performance overheads and compilation challenges. Recent work has made significant advances to address the performance overheads but automatic compilation of efficient HE kernels remains relatively unexplored. This paper presents Porcupine, an optimizing compiler that generates vectorized HE code using program synthesis. HE poses three major compilation challenges: it only supports a limited set of SIMD-like operators, it uses long-vector operands, and decryption can fail if ciphertext noise growth is not managed properly. Porcupine captures the underlying HE operator behavior so that it can automatically reason about the complex trade-offs imposed by these challenges to generate optimized, verified HE kernels. To improve synthesis time, we propose a series of optimizations including a sketch design tailored to HE to narrow the program search space. We evaluate Porcupine using a set of kernels and show speedups of up to 52% (25% geometric mean) compared to heuristic-driven hand-optimized kernels. Analysis of Porcupine’s synthesized code reveals that optimal solutions are not always intuitive, underscoring the utility of automated reasoning in this domain.
{"title":"Porcupine: a synthesizing compiler for vectorized homomorphic encryption","authors":"M. Cowan, Deeksha Dangwal, Armin Alaghi, Caroline Trippel, Vincent T. Lee, Brandon Reagen","doi":"10.1145/3453483.3454050","DOIUrl":"https://doi.org/10.1145/3453483.3454050","url":null,"abstract":"Homomorphic encryption (HE) is a privacy-preserving technique that enables computation directly on encrypted data. Despite its promise, HE has seen limited use due to performance overheads and compilation challenges. Recent work has made significant advances to address the performance overheads but automatic compilation of efficient HE kernels remains relatively unexplored. This paper presents Porcupine, an optimizing compiler that generates vectorized HE code using program synthesis. HE poses three major compilation challenges: it only supports a limited set of SIMD-like operators, it uses long-vector operands, and decryption can fail if ciphertext noise growth is not managed properly. Porcupine captures the underlying HE operator behavior so that it can automatically reason about the complex trade-offs imposed by these challenges to generate optimized, verified HE kernels. To improve synthesis time, we propose a series of optimizations including a sketch design tailored to HE to narrow the program search space. We evaluate Porcupine using a set of kernels and show speedups of up to 52% (25% geometric mean) compared to heuristic-driven hand-optimized kernels. Analysis of Porcupine’s synthesized code reveals that optimal solutions are not always intuitive, underscoring the utility of automated reasoning in this domain.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"36 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85850203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Quantitative analysis of assertion violations in probabilistic programs
Jinyi Wang, Yican Sun, Hongfei Fu, A. K. Goharshady, K. Chatterjee
We consider the fundamental problem of deriving quantitative bounds on the probability that a given assertion is violated in a probabilistic program. We provide automated algorithms that obtain both lower and upper bounds on the assertion violation probability. The main novelty of our approach is that we prove new and dedicated fixed-point theorems which serve as the theoretical basis of our algorithms and enable us to reason about assertion violation bounds in terms of pre and post fixed-point functions. To synthesize such fixed points, we devise algorithms that utilize a wide range of mathematical tools, including repulsing ranking supermartingales, Hoeffding's lemma, Minkowski decompositions, Jensen's inequality, and convex optimization. On the theoretical side, we provide (i) the first automated algorithm for lower bounds on assertion violation probabilities, (ii) the first complete algorithm for upper bounds of exponential form in affine programs, and (iii) provably and significantly tighter upper bounds than previous approaches. On the practical side, we show that our algorithms can handle a wide variety of programs from the literature and synthesize bounds that are remarkably tighter than previous results, in some cases by thousands of orders of magnitude.
{"title":"Quantitative analysis of assertion violations in probabilistic programs","authors":"Jinyi Wang, Yican Sun, Hongfei Fu, A. K. Goharshady, K. Chatterjee","doi":"10.1145/3453483.3454102","DOIUrl":"https://doi.org/10.1145/3453483.3454102","url":null,"abstract":"We consider the fundamental problem of deriving quantitative bounds on the probability that a given assertion is violated in a probabilistic program. We provide automated algorithms that obtain both lower and upper bounds on the assertion violation probability. The main novelty of our approach is that we prove new and dedicated fixed-point theorems which serve as the theoretical basis of our algorithms and enable us to reason about assertion violation bounds in terms of pre and post fixed-point functions. To synthesize such fixed-points, we devise algorithms that utilize a wide range of mathematical tools, including repulsing ranking supermartingales, Hoeffding's lemma, Minkowski decompositions, Jensen's inequality, and convex optimization. On the theoretical side, we provide (i) the first automated algorithm for lower-bounds on assertion violation probabilities, (ii) the first complete algorithm for upper-bounds of exponential form in affine programs, and (iii) provably and significantly tighter upper-bounds than the previous approaches. On the practical side, we show our algorithms can handle a wide variety of programs from the literature and synthesize bounds that are remarkably tighter than previous results, in some cases by thousands of orders of magnitude.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"109 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79559989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Perceus: garbage free reference counting with reuse
Alex Reinking, Ningning Xie, L. de Moura, Daan Leijen
We introduce Perceus, an algorithm for precise reference counting with reuse and specialization. Starting from a functional core language with explicit control-flow, Perceus emits precise reference counting instructions such that (cycle-free) programs are _garbage free_, where only live references are retained. This enables further optimizations, like reuse analysis that allows for guaranteed in-place updates at runtime. This in turn enables a novel programming paradigm that we call _functional but in-place_ (FBIP). Much like tail-call optimization enables writing loops with regular function calls, reuse analysis enables writing in-place mutating algorithms in a purely functional way. We give a novel formalization of reference counting in a linear resource calculus, and prove that Perceus is sound and garbage free. We show evidence that Perceus, as implemented in Koka, has good performance and is competitive with other state-of-the-art memory collectors.
{"title":"Perceus: garbage free reference counting with reuse","authors":"Alex Reinking, Ningning Xie, L. de Moura, Daan Leijen","doi":"10.1145/3453483.3454032","DOIUrl":"https://doi.org/10.1145/3453483.3454032","url":null,"abstract":"We introduce Perceus, an algorithm for precise reference counting with reuse and specialization. Starting from a functional core language with explicit control-flow, Perceus emits precise reference counting instructions such that (cycle-free) programs are _garbage free_, where only live references are retained. This enables further optimizations, like reuse analysis that allows for guaranteed in-place updates at runtime. This in turn enables a novel programming paradigm that we call _functional but in-place_ (FBIP). Much like tail-call optimization enables writing loops with regular function calls, reuse analysis enables writing in-place mutating algorithms in a purely functional way. We give a novel formalization of reference counting in a linear resource calculus, and prove that Perceus is sound and garbage free. We show evidence that Perceus, as implemented in Koka, has good performance and is competitive with other state-of-the-art memory collectors.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83836986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SPPL: probabilistic programming with fast exact symbolic inference
Feras A. Saad, M. Rinard, Vikash K. Mansinghka
We present the Sum-Product Probabilistic Language (SPPL), a new probabilistic programming language that automatically delivers exact solutions to a broad range of probabilistic inference queries. SPPL translates probabilistic programs into sum-product expressions, a new symbolic representation and associated semantic domain that extends standard sum-product networks to support mixed-type distributions, numeric transformations, logical formulas, and pointwise and set-valued constraints. We formalize SPPL via a novel translation strategy from probabilistic programs to sum-product expressions and give sound exact algorithms for conditioning on and computing probabilities of events. SPPL imposes a collection of restrictions on probabilistic programs to ensure they can be translated into sum-product expressions, which allow the system to leverage new techniques for improving the scalability of translation and inference by automatically exploiting probabilistic structure. We implement a prototype of SPPL with a modular architecture and evaluate it on benchmarks the system targets, showing that it obtains up to 3500x speedups over state-of-the-art symbolic systems on tasks such as verifying the fairness of decision tree classifiers, smoothing hidden Markov models, conditioning transformed random variables, and computing rare event probabilities.
{"title":"SPPL: probabilistic programming with fast exact symbolic inference","authors":"Feras A. Saad, M. Rinard, Vikash K. Mansinghka","doi":"10.1145/3453483.3454078","DOIUrl":"https://doi.org/10.1145/3453483.3454078","url":null,"abstract":"We present the Sum-Product Probabilistic Language (SPPL), a new probabilistic programming language that automatically delivers exact solutions to a broad range of probabilistic inference queries. SPPL translates probabilistic programs into sum-product expressions, a new symbolic representation and associated semantic domain that extends standard sum-product networks to support mixed-type distributions, numeric transformations, logical formulas, and pointwise and set-valued constraints. We formalize SPPL via a novel translation strategy from probabilistic programs to sum-product expressions and give sound exact algorithms for conditioning on and computing probabilities of events. SPPL imposes a collection of restrictions on probabilistic programs to ensure they can be translated into sum-product expressions, which allow the system to leverage new techniques for improving the scalability of translation and inference by automatically exploiting probabilistic structure. We implement a prototype of SPPL with a modular architecture and evaluate it on benchmarks the system targets, showing that it obtains up to 3500x speedups over state-of-the-art symbolic systems on tasks such as verifying the fairness of decision tree classifiers, smoothing hidden Markov models, conditioning transformed random variables, and computing rare event probabilities.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"234 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84929784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Proof repair across type equivalences
T. Ringer, Randair Porter, N. Yazdani, J. Leo, D. Grossman
We describe a new approach to automatically repairing broken proofs in the Coq proof assistant in response to changes in types. Our approach combines a configurable proof term transformation with a decompiler from proof terms to suggested tactic scripts. The proof term transformation implements transport across equivalences in a way that removes references to the old version of the changed type and does not rely on axioms beyond those Coq assumes. We have implemented this approach in Pumpkin Pi, an extension to the Pumpkin Patch Coq plugin suite for proof repair. We demonstrate Pumpkin Pi's flexibility on eight case studies, including supporting a benchmark from a user study, easing development with dependent types, porting functions and proofs between unary and binary numbers, and supporting an industrial proof engineer in interoperating between Coq and other verification tools more easily.
{"title":"Proof repair across type equivalences","authors":"T. Ringer, Randair Porter, N. Yazdani, J. Leo, D. Grossman","doi":"10.1145/3453483.3454033","DOIUrl":"https://doi.org/10.1145/3453483.3454033","url":null,"abstract":"We describe a new approach to automatically repairing broken proofs in the Coq proof assistant in response to changes in types. Our approach combines a configurable proof term transformation with a decompiler from proof terms to suggested tactic scripts. The proof term transformation implements transport across equivalences in a way that removes references to the old version of the changed type and does not rely on axioms beyond those Coq assumes. We have implemented this approach in Pumpkin Pi, an extension to the Pumpkin Patch Coq plugin suite for proof repair. We demonstrate Pumpkin Pi’s flexibility on eight case studies, including supporting a benchmark from a user study,easing development with dependent types, porting functions and proofs between unary and binary numbers, and supporting an industrial proof engineer to interoperate between Coq and other verification tools more easily.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74048627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DNNFusion: accelerating deep neural networks execution with advanced operator fusion
Wei Niu, Jiexiong Guan, Yanzhi Wang, G. Agrawal, Bin Ren
Deep Neural Networks (DNNs) have emerged as the core enabler of many major applications on mobile devices. To achieve high accuracy, DNN models have become increasingly deep, with hundreds or even thousands of operator layers, leading to high memory and computational requirements for inference. Operator fusion (or kernel/layer fusion) is a key optimization in many state-of-the-art DNN execution frameworks, such as TensorFlow, TVM, and MNN, that aim to improve the efficiency of DNN inference. However, these frameworks usually adopt fusion approaches based on certain patterns that are too restrictive to cover the diversity of operators and layer connections, especially those seen in many extremely deep models. Polyhedral-based loop fusion techniques, on the other hand, work on a low-level view of the computation without operator-level information, and can also miss potential fusion opportunities. To address this challenge, this paper proposes a novel and extensive loop fusion framework called DNNFusion. The basic idea of this work is to operate at an operator view of DNNs, but expand fusion opportunities by developing a classification of both individual operators and their combinations. In addition, DNNFusion includes 1) a novel mathematical-property-based graph rewriting framework to reduce evaluation costs and facilitate subsequent operator fusion, 2) an integrated fusion plan generation that leverages high-level analysis and accurate lightweight profiling, and 3) additional optimizations during fusion code generation. DNNFusion is extensively evaluated on 15 DNN models with varied types of tasks, model sizes, and layer counts. The evaluation results demonstrate that DNNFusion finds up to 8.8× more fusion opportunities and outperforms four state-of-the-art DNN execution frameworks with 9.3× speedup. The memory requirement reduction and speedups can enable the execution of many of the target models on mobile devices and even make them part of a real-time application.
{"title":"DNNFusion: accelerating deep neural networks execution with advanced operator fusion","authors":"Wei Niu, Jiexiong Guan, Yanzhi Wang, G. Agrawal, Bin Ren","doi":"10.1145/3453483.3454083","DOIUrl":"https://doi.org/10.1145/3453483.3454083","url":null,"abstract":"Deep Neural Networks (DNNs) have emerged as the core enabler of many major applications on mobile devices. To achieve high accuracy, DNN models have become increasingly deep with hundreds or even thousands of operator layers, leading to high memory and computational requirements for inference. Operator fusion (or kernel/layer fusion) is key optimization in many state-of-the-art DNN execution frameworks, such as TensorFlow, TVM, and MNN, that aim to improve the efficiency of the DNN inference. However, these frameworks usually adopt fusion approaches based on certain patterns that are too restrictive to cover the diversity of operators and layer connections, especially those seen in many extremely deep models. Polyhedral-based loop fusion techniques, on the other hand, work on a low-level view of the computation without operator-level information, and can also miss potential fusion opportunities. To address this challenge, this paper proposes a novel and extensive loop fusion framework called DNNFusion. The basic idea of this work is to work at an operator view of DNNs, but expand fusion opportunities by developing a classification of both individual operators and their combinations. In addition, DNNFusion includes 1) a novel mathematical-property-based graph rewriting framework to reduce evaluation costs and facilitate subsequent operator fusion, 2) an integrated fusion plan generation that leverages the high-level analysis and accurate light-weight profiling, and 3) additional optimizations during fusion code generation. DNNFusion is extensively evaluated on 15 DNN models with varied types of tasks, model sizes, and layer counts. The evaluation results demonstrate that DNNFusion finds up to 8.8 × higher fusion opportunities, outperforms four state-of-the-art DNN execution frameworks with 9.3× speedup. The memory requirement reduction and speedups can enable the execution of many of the target models on mobile devices and even make them part of a real-time application.","PeriodicalId":20557,"journal":{"name":"Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2020-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85625231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}