Asparagus: Automated Synthesis of Parametric Gas Upper-Bounds for Smart Contracts
Zhuo Cai, Soroush Farokhnia, Amir Kafshdar Goharshady, S. Hitarth

Modern programmable blockchains have built-in support for smart contracts, i.e. programs that are stored on the blockchain and whose state is subject to consensus. After a smart contract is deployed on the blockchain, anyone on the network can interact with it and call its functions by creating transactions. The blockchain protocol is then used to reach a consensus about the order of the transactions and, as a direct corollary, the state of every smart contract. Reaching such consensus necessarily requires every node on the network to execute all function calls. Thus, an attacker can mount denial-of-service (DoS) attacks by creating expensive transactions and function calls that use considerable, or even infinite, time and space. To avoid this, following Ethereum, virtually all programmable blockchains have introduced the concept of “gas”. A fixed hard-coded gas cost is assigned to every atomic operation and the user who calls a function has to pay for its total gas usage. This technique ensures that the protocol is not vulnerable to DoS attacks, but it has also had significant unintended consequences. Out-of-gas errors, i.e. when a user underestimates the gas usage of their function call and does not allocate enough gas, are a major source of security vulnerabilities in Ethereum. We focus on the well-studied problem of automatically finding upper-bounds on the gas usage of a smart contract. This is a classical problem in the blockchain community and has also been extensively studied by researchers in programming languages and verification. In this work, we provide a novel approach using theorems from polyhedral geometry and real algebraic geometry, namely Farkas’ Lemma, Handelman’s Theorem, and Putinar’s Positivstellensatz, to automatically synthesize linear and polynomial parametric bounds for the gas usage of smart contracts. Our approach is the first to provide completeness guarantees for the synthesis of such parametric upper-bounds. Moreover, our theoretical results are independent of the underlying consensus protocol and can be applied to smart contracts written in any language and run on any blockchain. As a proof of concept, we also provide a tool, called “Asparagus”, that implements our algorithms for Ethereum contracts written in Solidity. Finally, we provide extensive experimental results over 24,188 real-world smart contracts that are currently deployed on the Ethereum blockchain. We compare Asparagus against GASTAP, the only previous tool that could provide parametric bounds, and show that our method significantly outperforms it, both in terms of applicability and the tightness of the resulting bounds. More specifically, our approach can handle 80.56% of the functions (126,269 out of 156,735), compared with GASTAP’s 58.62%. Additionally, even on the benchmarks where both approaches successfully synthesize a bound, our bound is tighter in 97.85% of the cases.
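As background for the toolbox named above, the affine form of Farkas’ Lemma characterizes exactly when a linear bound is a consequence of a system of linear constraints; the following is a textbook statement (an illustration of the general template-based recipe, not the paper’s exact formulation):

```latex
% Affine Farkas' Lemma: assuming {x : Ax <= b} is non-empty,
%   forall x. (Ax <= b  =>  c^T x <= d)
% holds exactly when non-negative multipliers y witness the entailment.
\forall x\,\bigl(Ax \le b \;\Rightarrow\; c^{\top}x \le d\bigr)
\;\Longleftrightarrow\;
\exists y \ge 0.\;\; y^{\top}A = c^{\top} \;\wedge\; y^{\top}b \le d
```

Treating the coefficients c and d of a candidate gas bound as unknowns turns the right-hand side into constraints over y, c, and d that a solver can discharge; Handelman’s Theorem and Putinar’s Positivstellensatz play the analogous role when the bound template is polynomial rather than linear.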
Proceedings of the ACM on Programming Languages, 2023-10-16. https://doi.org/10.1145/3622829

Synthesizing Specifications
Kanghee Park, Loris D'Antoni, Thomas Reps

Every program should be accompanied by a specification that describes important aspects of the code's behavior, but writing good specifications is often harder than writing the code itself. This paper addresses the problem of synthesizing specifications automatically, guided by user-supplied inputs of two kinds: i) a query posed about a set of function definitions, and ii) a domain-specific language L in which the extracted properties are to be expressed (we call properties in the language L-properties). Our method synthesizes a set of L-properties, each of which is a best L-property for the query: there is no other L-property that is strictly more precise. Furthermore, the set of synthesized L-properties is exhaustive: no more L-properties can be added to it to make the conjunction more precise. We implemented our method in a tool, Spyro. The ability to modify both the query and L provides a Spyro user with ways to customize the kind of specification to be synthesized. We use this ability to show that Spyro can be used in a variety of applications, such as mining program specifications, performing abstract-domain operations, and synthesizing algebraic properties of program modules.
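To make “sound” and “best L-property” concrete, here is a toy, hypothetical illustration in Rust (it is not Spyro, which synthesizes properties rather than testing them; the DSL, the `append` function, and the sampling are all invented for this sketch): a tiny language of length relations in which all three candidates hold of `append`, but only the equality is a best L-property.

```rust
// Toy illustration only; Spyro synthesizes properties, it does not test them.
// Hypothetical DSL: L = { len(out) OP len(x) + len(y) | OP in {==, >=, <=} }.

fn append(x: &[i32], y: &[i32]) -> Vec<i32> {
    let mut out = x.to_vec();
    out.extend_from_slice(y);
    out
}

fn main() {
    // Candidate L-properties, listed from most precise to least precise.
    let candidates: Vec<(&str, fn(usize, usize, usize) -> bool)> = vec![
        ("len(out) == len(x) + len(y)", |o, x, y| o == x + y),
        ("len(out) >= len(x) + len(y)", |o, x, y| o >= x + y),
        ("len(out) <= len(x) + len(y)", |o, x, y| o <= x + y),
    ];
    // A handful of sample inputs; a candidate that holds on all of them is
    // (empirically) sound in this toy setting.
    let samples: Vec<(Vec<i32>, Vec<i32>)> = vec![
        (vec![], vec![]),
        (vec![1], vec![2, 3]),
        (vec![4, 5, 6], vec![]),
    ];
    let sound: Vec<&str> = candidates
        .iter()
        .filter(|(_, prop)| {
            samples
                .iter()
                .all(|(x, y)| prop(append(x, y).len(), x.len(), y.len()))
        })
        .map(|(name, _)| *name)
        .collect();
    // All three candidates are sound, but only the equality is a *best*
    // L-property: no other property in L is strictly more precise than it.
    println!("sound L-properties: {:?}", sound);
    println!("best  L-property:   {}", sound[0]);
}
```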
Proceedings of the ACM on Programming Languages, 2023-10-16. https://doi.org/10.1145/3622861

Modular Verification of Safe Memory Reclamation in Concurrent Separation Logic
Jaehwang Jung, Janggun Lee, Jaemin Choi, Jaewoo Kim, Sunho Park, Jeehoon Kang

Formal verification is an effective method to address the challenge of designing correct and efficient concurrent data structures. But verification efforts often ignore memory reclamation, which involves nontrivial synchronization between concurrent accesses and reclamation. When incorrectly implemented, it may lead to critical safety errors such as use-after-free and the ABA problem. Semi-automatic safe memory reclamation schemes such as hazard pointers and RCU encapsulate the complexity of manual memory management in modular interfaces. However, this modularity has not been carried over to formal verification. We propose modular specifications of hazard pointers and RCU, and formally verify realistic implementations of them in concurrent separation logic. Specifically, we design abstract predicates for hazard pointers that capture the meaning of validating the protection of nodes, and those for RCU that support optimistic traversal to possibly retired nodes. We demonstrate that the specifications indeed facilitate modular verification along three criteria: compositional verification, general applicability, and easy integration. In doing so, we present the first formal verification of Harris’s list, the Harris-Michael list, the Chase-Lev deque, and RDCSS with reclamation. We report the Coq mechanization of all our results in the Iris separation logic framework.
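As a hedged sketch of the “validate the protection” step that the hazard-pointer predicates are about (the function name `protect` and the single hazard slot are illustrative; this is not the interface verified in the paper):

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

// Minimal sketch of the hazard-pointer "announce, then re-validate" pattern.
// A real SMR scheme also needs per-thread hazard slots, retirement lists, and
// scanning; those parts are elided here.
fn protect<T>(hazard_slot: &AtomicPtr<T>, shared: &AtomicPtr<T>) -> *mut T {
    loop {
        let p = shared.load(Ordering::Acquire);
        // Announce the intent to access `p`: reclaimers must not free a node
        // while some thread's hazard slot points to it.
        hazard_slot.store(p, Ordering::SeqCst);
        // Re-validate: if `shared` still points to `p`, the announcement became
        // visible before `p` could have been retired, so dereferencing `p`
        // cannot be a use-after-free.
        if shared.load(Ordering::Acquire) == p {
            return p;
        }
        // Otherwise `p` may already be retired (and freed); try again.
    }
}

fn main() {
    let node = Box::into_raw(Box::new(42_i32));
    let shared = AtomicPtr::new(node);
    let hazard = AtomicPtr::new(std::ptr::null_mut());
    let p = protect(&hazard, &shared);
    // Safe to read through `p` while the hazard slot protects it.
    unsafe { assert_eq!(*p, 42) };
    // Single-threaded demo cleanup.
    unsafe { drop(Box::from_raw(p)) };
}
```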
Proceedings of the ACM on Programming Languages, 2023-10-16. https://doi.org/10.1145/3622827

When Concurrency Matters: Behaviour-Oriented Concurrency
Luke Cheeseman, Matthew J. Parkinson, Sylvan Clebsch, Marios Kogias, Sophia Drossopoulou, David Chisnall, Tobias Wrigstad, Paul Liétar

Expressing parallelism and coordination is central to modern concurrent programming. Many mechanisms exist for expressing both, but their design decisions are tightly intertwined. We believe that this interdependence should be recognised and that both concerns should be served by a single, powerful primitive. We are not the first to realise this: the prime example is actor model programming, where parallelism arises through fine-grained decomposition of a program’s state into actors that are able to execute independently in parallel. However, actor model programming has a serious pain point: updating multiple actors as a single atomic operation is a challenging task. We address this pain point by introducing a new concurrency paradigm: Behaviour-Oriented Concurrency (BoC). In BoC, we revisit the fundamental concept of a behaviour to provide a more transactional concurrency model. BoC enables asynchronously creating atomic and ordered units of work with exclusive access to a collection of independent resources. In this paper, we describe BoC informally in terms of examples, which demonstrate the advantages of exclusive access to several independent resources, as well as the need for ordering. We define it through a formal model. We demonstrate its practicality by implementing a C++ runtime. We argue its applicability through the Savina benchmark suite: benchmarks in this suite can be represented more compactly using BoC in place of actors, and we observe comparable, if not better, performance.
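To convey the programming model in code, the following is a deliberately simplified, hypothetical Rust sketch (the paper’s runtime is C++, and real BoC behaviours are additionally asynchronous, ordered, and free of the deadlock this toy could exhibit if resources were passed in inconsistent order): a behaviour runs with exclusive access to several resources at once, so a two-account transfer is one atomic unit of work rather than a multi-message actor protocol.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical, simplified sketch of the BoC idea (not the paper's C++ API):
// a "behaviour" runs with exclusive access to both resources. A real BoC
// runtime also guarantees ordering and acquires resources in a globally
// consistent order, which this toy does not attempt.
fn when2<A, B>(
    a: Arc<Mutex<A>>,
    b: Arc<Mutex<B>>,
    body: impl FnOnce(&mut A, &mut B) + Send + 'static,
) -> thread::JoinHandle<()>
where
    A: Send + 'static,
    B: Send + 'static,
{
    thread::spawn(move || {
        // Acquire exclusive access to both resources, then run the behaviour.
        let mut ga = a.lock().unwrap();
        let mut gb = b.lock().unwrap();
        body(&mut *ga, &mut *gb);
    })
}

fn main() {
    let alice = Arc::new(Mutex::new(100_i64));
    let bob = Arc::new(Mutex::new(0_i64));
    // One behaviour transfers 30 from alice to bob as a single atomic unit of
    // work; in a pure actor model this would require a multi-message protocol.
    let h = when2(Arc::clone(&alice), Arc::clone(&bob), |a, b| {
        *a -= 30;
        *b += 30;
    });
    h.join().unwrap();
    println!("alice = {}, bob = {}", alice.lock().unwrap(), bob.lock().unwrap());
}
```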
Proceedings of the ACM on Programming Languages, 2023-10-16. https://doi.org/10.1145/3622852

Reference Capabilities for Flexible Memory Management
Ellen Arvidsson, Elias Castegren, Sylvan Clebsch, Sophia Drossopoulou, James Noble, Matthew J. Parkinson, Tobias Wrigstad

Verona is a concurrent object-oriented programming language that organises all the objects in a program into a forest of isolated regions. Memory is managed locally for each region, so programmers can control a program's memory use by adjusting objects' partition into regions, and by setting each region's memory management strategy. A thread can only mutate (allocate, deallocate) objects within one active region -- its "window of mutability". Memory management costs are localised to the active region, ensuring overheads can be predicted and controlled. Moving the mutability window between regions is explicit, so code can be executed wherever it is required, yet programs remain in control of memory use. An ownership type system based on reference capabilities enforces region isolation, controlling aliasing within and between regions, yet supporting objects moving between regions and threads. Data accesses never need expensive atomic operations, and are always thread-safe.
Proceedings of the ACM on Programming Languages, 2023-10-16. https://doi.org/10.1145/3622846

Graph IRs for Impure Higher-Order Languages: Making Aggressive Optimizations Affordable with Precise Effect Dependencies
Oliver Bračevac, Guannan Wei, Songlin Jia, Supun Abeysinghe, Yuxuan Jiang, Yuyan Bao, Tiark Rompf

Graph-based intermediate representations (IRs) are widely used for powerful compiler optimizations, either interprocedurally in pure functional languages, or intraprocedurally in imperative languages. Yet so far, no suitable graph IR exists for aggressive global optimizations in languages with both effects and higher-order functions: aliasing and indirect control transfers make it difficult to maintain sufficiently granular dependency information for optimizations to be effective. To close this long-standing gap, we propose a novel typed graph IR combining a notion of reachability types with an expressive effect system to compute precise and granular effect dependencies at an affordable cost while supporting local reasoning and separate compilation. Our high-level graph IR imposes lexical structure to represent structured control flow and nesting, enabling aggressive and yet inexpensive code motion and other optimizations for impure higher-order programs. We formalize the new graph IR based on a λ-calculus with a reachability type-and-effect system along with a specification of various optimizations. We present performance case studies for tensor loop fusion, CUDA kernel fusion, symbolic execution of LLVM IR, and SQL query compilation in the Scala LMS compiler framework using the new graph IR. We observe significant speedups of up to 21x.
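The following illustrative Rust fragment (ours, not the paper’s IR) shows the kind of distinction granular effect dependencies make possible: with a single coarse “impure” effect, the two calls in the loop must stay in program order, whereas per-reference reachability information reveals that they touch disjoint state and can therefore be reordered or moved.

```rust
fn main() {
    let mut log_a: Vec<u32> = Vec::new();
    let mut log_b: Vec<u32> = Vec::new();

    // Two higher-order "writers" that capture disjoint mutable state.
    let mut write_a = |x: u32| log_a.push(x);
    let mut write_b = |x: u32| log_b.push(x);

    for i in 0..4 {
        // A coarse effect system sees two opaque effectful calls and must keep
        // their order. Precise, per-reference effect dependencies show that
        // `write_a` only reaches `log_a` and `write_b` only reaches `log_b`,
        // so the two calls are independent and may be reordered or parallelised.
        write_a(i);
        write_b(i * 10);
    }

    println!("{:?} {:?}", log_a, log_b);
}
```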
Proceedings of the ACM on Programming Languages, 2023-10-16. https://doi.org/10.1145/3622813

Verifying Indistinguishability of Privacy-Preserving Protocols
Kirby Linvill, Gowtham Kaki, Eric Wustrow

Internet users rely on the protocols they use to protect their private information, including their identity and the websites they visit. Formal verification of these protocols can detect subtle bugs that compromise these protections at design time, but is a challenging task as it involves probabilistic reasoning about random sampling, cryptographic primitives, and concurrent execution. Existing approaches either reason about symbolic models of the protocols that sacrifice precision for automation, or reason about more precise computational models that are harder to automate and require cryptographic expertise. In this paper we propose a novel approach to verifying privacy-preserving protocols that is more precise than symbolic models yet more accessible than computational models. Our approach permits direct-style proofs of privacy, as opposed to indirect game-based proofs in computational models, by formalizing privacy as indistinguishability of possible network traces induced by a protocol. We ease automation by leveraging insights from the distributed systems verification community to create sound synchronous models of concurrent protocols. Our verification framework is implemented in F* as a library we call Waldo. We describe two large case studies of using Waldo to verify indistinguishability; one on the Encrypted Client Hello (ECH) extension of the TLS protocol and another on a Private Information Retrieval (PIR) protocol. We uncover subtle flaws in the TLS ECH specification that were missed by other models.
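One standard way to phrase the trace-based privacy property described here, stated as an illustrative formulation rather than Waldo’s exact definition: a protocol P keeps a secret input private if any two choices of the secret induce the same distribution over observable network traces.

```latex
% Privacy as trace indistinguishability (illustrative formulation):
\forall s_1, s_2 \in \mathit{Secrets}.\;\forall t.\;\;
\Pr\bigl[\,\mathrm{Trace}(P(s_1)) = t\,\bigr]
\;=\;
\Pr\bigl[\,\mathrm{Trace}(P(s_2)) = t\,\bigr]
```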
Proceedings of the ACM on Programming Languages, 2023-10-16. https://doi.org/10.1145/3622849

Mutually Iso-Recursive Subtyping
Andreas Rossberg

Iso-recursive types are often taken as a type-theoretic model for type recursion as present in many programming languages, e.g., classes in object-oriented languages or algebraic datatypes in functional languages. Their main advantage over an equi-recursive semantics is that they are simpler and algorithmically less expensive, which is an important consideration when the cost of type checking matters, such as for intermediate or low-level code representations, virtual machines, or runtime casts. However, a closer look reveals that iso-recursion cannot, in its standard form, efficiently express essential type system features like mutual recursion or non-uniform recursion. While it has been folklore that mutual recursion and non-uniform type parameterisation can nicely be handled by generalising to higher kinds, this encoding breaks down when combined with subtyping: the classic “Amber” rule for subtyping iso-recursive types is too weak to express mutual recursion without falling back to encodings of quadratic size. We present a foundational core calculus of iso-recursive types with declared subtyping that can express both inter- and intra-recursion subtyping without such blowup, including subtyping between constructors of higher or mixed kind. In a second step, we identify a syntactic fragment of this general calculus that allows for more efficient type checking without “deep” substitutions, by observing that higher-kinded iso-recursive types can be inserted to “guard” against unwanted β-reductions. This fragment closely resembles the structure of typical nominal subtype systems, but without requiring nominal semantics. It has been used as the basis for a proposed extension of WebAssembly with recursive types.
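For reference, the “Amber” rule mentioned above is the classic subtyping rule for iso-recursive types: assume the bound variables are related and compare the bodies. In textbook notation (not the paper’s generalised, higher-kinded calculus):

```latex
% The Amber rule for iso-recursive subtyping (alpha, beta fresh):
\frac{\Gamma,\; \alpha \le \beta \;\vdash\; \tau_1 \le \tau_2}
     {\Gamma \;\vdash\; \mu\alpha.\,\tau_1 \;\le\; \mu\beta.\,\tau_2}
```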
Proceedings of the ACM on Programming Languages, 2023-10-16. https://doi.org/10.1145/3622809

Solving String Constraints with Lengths by Stabilization
Yu-Fang Chen, David Chocholatý, Vojtěch Havlena, Lukáš Holík, Ondřej Lengál, Juraj Síč

We present a new algorithm for solving string constraints. The algorithm builds upon a recent method for solving word equations and regular constraints that interprets string variables as languages rather than strings and, consequently, mitigates the combinatorial explosion that plagues other approaches. We extend the approach to handle linear integer arithmetic length constraints by combination with a known principle of equation alignment and splitting, and by extension to other common types of string constraints, yielding a fully-fledged string solver. The ability of the framework to handle unrestricted disequalities even extends one of the largest decidable classes of string constraints, the chain-free fragment. We integrate our algorithm into a DPLL-based SMT solver. The performance of our implementation is competitive and even significantly better than state-of-the-art string solvers on several established benchmarks obtained from applications in verification of string programs.
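As a small, constructed illustration of the constraint class involved (not taken from the paper’s benchmarks), consider a word equation combined with a regular membership and a length constraint:

```latex
x \cdot y \;=\; y \cdot x
\;\wedge\;
x \in (ab)^{*}
\;\wedge\;
|x| + |y| \;=\; 6
```

The word equation forces x and y to be repetitions of a common word, so one solution is x = ab, y = abab; a solver of the kind described above has to reason about the equation, the regular constraint, and the integer length constraint together, which is where interpreting variables as languages and aligning/splitting equations comes in.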
Proceedings of the ACM on Programming Languages, 2023-10-16. https://doi.org/10.1145/3622872

Bring Your Own Data Structures to Datalog
Arash Sahebolamri, Langston Barrett, Scott Moore, Kristopher Micinski

The restricted logic programming language Datalog has become a popular implementation target for deductive-analytic workloads including social-media analytics and program analysis. Modern Datalog engines compile Datalog rules to joins over explicit representations of relations—often B-trees or hash maps. While these modern engines have enabled high scalability in many application domains, they have a crucial weakness: achieving the desired algorithmic complexity may be impossible due to representation-imposed overhead of the engine’s data structures. In this paper, we present the "Bring Your Own Data Structures" (Byods) approach, in the form of a DSL embedded in Rust. Using Byods, an engineer writes logical rules which are implicitly parametric on the concrete data structure representation; our implementation provides an interface to enable "bringing their own" data structures to represent relations, which harmoniously interact with code generated by our compiler (implemented as Rust procedural macros). We formalize the semantics of Byods as an extension of Datalog’s; our formalization captures the key properties demanded of data structures compatible with Byods, including properties required for incrementalized (semi-naïve) evaluation. We detail many applications of the Byods approach, implementing analyses requiring specialized data structures for transitive and equivalence relations to scale, including an optimized version of the Rust borrow checker Polonius; highly-parallel PageRank made possible by lattices; and a large-scale analysis of LLVM utilizing index-sharing to scale. Our results show that Byods offers both improved algorithmic scalability (reduced time and/or space complexity) and runtimes competitive with state-of-the-art parallelizing Datalog solvers.
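For flavour, the style of embedded Datalog this builds on is shown below as a minimal transitive-closure program written against the authors’ `ascent` crate (a sketch; exact macro details may differ). The Byods extension described above additionally lets a relation such as `path` declare which data structure backs it, for example a union-find-like structure for reachability, without changing the rules themselves.

```rust
use ascent::ascent;

// Transitive closure in embedded Datalog, in the style of the `ascent` crate.
ascent! {
    relation edge(u32, u32);
    relation path(u32, u32);

    path(x, y) <-- edge(x, y);
    path(x, z) <-- edge(x, y), path(y, z);
}

fn main() {
    let mut prog = AscentProgram::default();
    prog.edge = vec![(1, 2), (2, 3), (3, 4)];
    prog.run();
    // `path` now contains the transitive closure: (1,2), (1,3), (1,4), (2,3), ...
    println!("{:?}", prog.path);
}
```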
Proceedings of the ACM on Programming Languages, 2023-10-16. https://doi.org/10.1145/3622840