Thomas Kœhler, Andrés Goens, Siddharth Bhat, Tobias Grosser, Phil Trinder, Michel Steuwer
Rewriting is a principled term transformation technique with uses across theorem proving and compilation. In theorem proving, each rewrite is a proof step; in compilation, rewrites optimize a program term. While developing rewrite sequences manually is possible, this process does not scale to larger rewrite sequences. Automated rewriting techniques, like greedy simplification or equality saturation, work well without requiring human input. Yet, they do not scale to large search spaces, limiting the complexity of tasks where automated rewriting is effective and meaning that even a small increase in term size or rewrite-sequence length may result in failure. This paper proposes a semi-automatic rewriting technique as a means to scale rewriting by allowing human insight at key decision points. Specifically, we propose guided equality saturation, which embraces human guidance when fully automated equality saturation does not scale. The rewriting is split into two simpler automatic equality saturation steps: from the original term to a human-provided intermediate guide, and from the guide to the target. Complex rewriting tasks may require multiple guides, resulting in a sequence of equality saturation steps. A guide can be a complete term, or a sketch containing undefined elements that are instantiated by the equality saturation search. Such sketches may be far more concise than complete terms. We demonstrate the generality and effectiveness of guided equality saturation using two case studies. First, we integrate guided equality saturation in the Lean 4 proof assistant. Proofs are written in the style of textbook proof sketches, as a series of calculations omitting details and skipping steps. These proofs conclude in less than a second, rather than the minutes required by unguided equality saturation, and can find complex proofs that previously had to be done manually. Second, in the compiler of the Rise array language, where unguided equality saturation fails to perform optimizations within an hour while using 60 GB of memory, guided equality saturation performs the same optimizations with at most 3 guides, within seconds and using less than 1 GB of memory.
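To illustrate the overall shape of the technique (not the authors' implementation), the following OCaml sketch assumes a hypothetical eqsat_equal that runs one automatic equality-saturation search between two terms; guided_eqsat then chains such searches through the human-provided guides up to the target.

```ocaml
(* A minimal sketch of guided equality saturation, assuming a hypothetical
   [eqsat_equal] that runs ordinary equality saturation with some rule set
   and resource limit.  Guides split one hard search into several easier ones. *)
type term = Var of string | App of string * term list

(* Placeholder: a real implementation builds and saturates an e-graph. *)
let eqsat_equal (_lhs : term) (_rhs : term) : bool = true

(* Prove [start = target] by passing through each guide in order. *)
let guided_eqsat (start : term) (guides : term list) (target : term) : bool =
  let waypoints = guides @ [ target ] in
  let step (ok, current) waypoint = (ok && eqsat_equal current waypoint, waypoint) in
  fst (List.fold_left step (true, start) waypoints)
```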
{"title":"Guided Equality Saturation","authors":"Thomas Kœhler, Andrés Goens, Siddharth Bhat, Tobias Grosser, Phil Trinder, Michel Steuwer","doi":"10.1145/3632900","DOIUrl":"https://doi.org/10.1145/3632900","url":null,"abstract":"Rewriting is a principled term transformation technique with uses across theorem proving and compilation. In theorem proving, each rewrite is a proof step; in compilation, rewrites optimize a program term. While developing rewrite sequences manually is possible, this process does not scale to larger rewrite sequences. Automated rewriting techniques, like greedy simplification or equality saturation, work well without requiring human input. Yet, they do not scale to large search spaces, limiting the complexity of tasks where automated rewriting is effective, and meaning that just a small increase in term size or rewrite length may result in failure. This paper proposes a semi-automatic rewriting technique as a means to scale rewriting by allowing human insight at key decision points. Specifically, we propose guided equality saturation that embraces human guidance when fully automated equality saturation does not scale. The rewriting is split into two simpler automatic equality saturation steps: from the original term to a human-provided intermediate guide, and from the guide to the target. Complex rewriting tasks may require multiple guides, resulting in a sequence of equality saturation steps. A guide can be a complete term, or a sketch containing undefined elements that are instantiated by the equality saturation search. Such sketches may be far more concise than complete terms. We demonstrate the generality and effectiveness of guided equality saturation using two case studies. First, we integrate guided equality saturation in the Lean 4 proof assistant. Proofs are written in the style of textbook proof sketches, as a series of calculations omitting details and skipping steps. These proofs conclude in less than a second instead of minutes when compared to unguided equality saturation, and can find complex proofs that previously had to be done manually. Second, in the compiler of the Rise array language, where unguided equality saturation fails to perform optimizations within an hour and using 60 GB of memory, guided equality saturation performs the same optimizations with at most 3 guides, within seconds using less than 1 GB memory.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"43 10","pages":"1727 - 1758"},"PeriodicalIF":1.8,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139381775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nicolas Chataing, Stephen Dolan, Gabriel Scherer, J. Yallop
We propose a new language feature for ML-family languages: the ability to selectively unbox certain data constructors, so that their runtime representation gets compiled away to just the identity on their argument. Unboxing must be statically rejected when it could introduce confusion, that is, distinct values with the same representation. We discuss the use-case of big numbers, where unboxing allows writing code that is both efficient and safe, replacing either a safe but slow version or a fast but unsafe version. We explain the static analysis necessary to reject incorrect unboxing requests. We present our prototype implementation of this feature for the OCaml programming language, and discuss several design choices and the interaction with advanced features such as Guarded Algebraic Datatypes. Our static analysis requires expanding type definitions in type expressions, which is not necessarily normalizing in the presence of recursive type definitions. In other words, we must decide normalization of terms in the first-order λ-calculus with recursion. We provide an algorithm to detect non-termination on-the-fly during reduction, with proofs of correctness and completeness. Our algorithm turns out to be closely related to the normalization strategy for macro expansion in the cpp preprocessor.
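To make the big-number use-case concrete, the following OCaml fragment shows the existing whole-type [@@unboxed] attribute; the per-constructor unboxing proposed in the paper generalizes this to multi-constructor types, and is shown only in a comment because its syntax is illustrative rather than stock OCaml.

```ocaml
(* Single-constructor, single-field: stock OCaml erases the wrapper at runtime. *)
type metres = Metres of float [@@unboxed]

let double (Metres m) = Metres (2. *. m)

(* Sketch of the paper's use-case: a big number is either a small immediate
   integer or a heavyweight arbitrary-precision value.  With selective
   constructor unboxing one would like the small case to carry no box, e.g.
     type big = Small of int [@unboxed] | Large of Z.t  (* Z.t from zarith *)
   and such a declaration must be rejected whenever the two constructors'
   runtime representations could be confused. *)
let () = match double (Metres 21.) with Metres m -> Printf.printf "%f\n" m
```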
{"title":"Unboxed Data Constructors: Or, How cpp Decides a Halting Problem","authors":"Nicolas Chataing, Stephen Dolan, Gabriel Scherer, J. Yallop","doi":"10.1145/3632893","DOIUrl":"https://doi.org/10.1145/3632893","url":null,"abstract":"We propose a new language feature for ML-family languages, the ability to selectively unbox certain data constructors, so that their runtime representation gets compiled away to just the identity on their argument. Unboxing must be statically rejected when it could introduce confusion, that is, distinct values with the same representation. We discuss the use-case of big numbers, where unboxing allows to write code that is both efficient and safe, replacing either a safe but slow version or a fast but unsafe version. We explain the static analysis necessary to reject incorrect unboxing requests. We present our prototype implementation of this feature for the OCaml programming language, discuss several design choices and the interaction with advanced features such as Guarded Algebraic Datatypes. Our static analysis requires expanding type definitions in type expressions, which is not necessarily normalizing in presence of recursive type definitions. In other words, we must decide normalization of terms in the first-order λ-calculus with recursion. We provide an algorithm to detect non-termination on-the-fly during reduction, with proofs of correctness and completeness. Our algorithm turns out to be closely related to the normalization strategy for macro expansion in the cpp preprocessor.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"9 5","pages":"1509 - 1539"},"PeriodicalIF":1.8,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139381997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Antoine Van Muylder, Andreas Nuyts, Dominique Devriese
Two approaches exist to incorporate parametricity into proof assistants based on dependent type theory. On the one hand, parametricity translations conveniently compute parametricity statements and their proofs solely based on individual well-typed polymorphic programs. But they do not offer internal parametricity: formal proofs that any polymorphic program of a certain type satisfies its parametricity statement. On the other hand, internally parametric type theories augment plain type theory with additional primitives out of which internal parametricity can be derived. But those type theories lack mature proof assistant implementations and deriving parametricity in them involves low-level intractable proofs. In this paper, we contribute Agda --bridges: the first practical internally parametric proof assistant. We provide the first mechanized proofs of crucial theorems for internal parametricity, like the relativity theorem. We identify a high-level sufficient condition for proving internal parametricity which we call the structure relatedness principle (SRP) by analogy with the structure identity principle (SIP) of HoTT/UF. We state and prove a general parametricity theorem for types that satisfy the SRP. Our parametricity theorem lets us obtain one-liner proofs of standard internal free theorems. We observe that the SRP is harder to prove than the SIP and provide in Agda --bridges a shallowly embedded type theory to compose types that satisfy the SRP. This type theory is an observational type theory of logical relations and our parametricity theorem ought to be one of its inference rules.
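As a reminder of what a parametricity statement delivers, here is a small free-theorem check written in OCaml rather than Agda --bridges; the names poly_list_fn and commutes are ad hoc, and the point is only that any f : 'a. 'a list -> 'a list must commute with List.map.

```ocaml
(* A classic free theorem, phrased with a first-class polymorphic record field:
   for any polymorphic list transformer, mapping before or after is the same. *)
type poly_list_fn = { f : 'a. 'a list -> 'a list }

let rev_fn = { f = (fun xs -> List.rev xs) }

let commutes (p : poly_list_fn) (g : int -> string) (xs : int list) : bool =
  List.map g (p.f xs) = p.f (List.map g xs)

let () = assert (commutes rev_fn string_of_int [ 1; 2; 3 ])
```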
{"title":"Internal and Observational Parametricity for Cubical Agda","authors":"Antoine Van Muylder, Andreas Nuyts, Dominique Devriese","doi":"10.1145/3632850","DOIUrl":"https://doi.org/10.1145/3632850","url":null,"abstract":"Two approaches exist to incorporate parametricity into proof assistants based on dependent type theory. On the one hand, parametricity translations conveniently compute parametricity statements and their proofs solely based on individual well-typed polymorphic programs. But they do not offer internal parametricity: formal proofs that any polymorphic program of a certain type satisfies its parametricity statement. On the other hand, internally parametric type theories augment plain type theory with additional primitives out of which internal parametricity can be derived. But those type theories lack mature proof assistant implementations and deriving parametricity in them involves low-level intractable proofs. In this paper, we contribute Agda --bridges: the first practical internally parametric proof assistant. We provide the first mechanized proofs of crucial theorems for internal parametricity, like the relativity theorem. We identify a high-level sufficient condition for proving internal parametricity which we call the structure relatedness principle (SRP) by analogy with the structure identity principle (SIP) of HoTT/UF. We state and prove a general parametricity theorem for types that satisfy the SRP. Our parametricity theorem lets us obtain one-liner proofs of standard internal free theorems. We observe that the SRP is harder to prove than the SIP and provide in Agda --bridges a shallowly embedded type theory to compose types that satisfy the SRP. This type theory is an observational type theory of logical relations and our parametricity theorem ought to be one of its inference rules.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"37 33","pages":"209 - 240"},"PeriodicalIF":1.8,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139382368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yiyun Liu, Jonathan Chan, Jessica Shi, Stephanie Weirich
In type systems with dependency tracking, programmers can assign an ordered set of levels to computations and prevent information flow from high-level computations to the low-level ones. The key notion in such systems is indistinguishability: a definition of program equivalence that takes into account the parts of the program that an observer may depend on. In this paper, we investigate the use of dependency tracking in the context of dependently-typed languages. We present the Dependent Calculus of Indistinguishability (DCOI), a system that adopts indistinguishability as the definition of equality used by the type checker. DCOI also internalizes that relation as an observer-indexed propositional equality type, so that programmers may reason about indistinguishability within the language. Our design generalizes and extends prior systems that combine dependency tracking with dependent types and is the first to support conversion and propositional equality at arbitrary observer levels. We have proven type soundness and noninterference theorems for DCOI and have developed a prototype implementation of its type checker.
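The following toy OCaml sketch gives a flavour of dependency tracking with two fixed observer levels encoded as phantom types; it is only an analogy for DCOI, whose levels, conversion, and propositional equality are dependently typed, and every name below is hypothetical.

```ocaml
(* Two levels and a phantom-labelled value: low observers may only read
   low-labelled data, so high data cannot flow into their results. *)
type low = Low
type high = High
type ('lvl, 'a) labelled = Label of 'a

let classify (x : 'a) : (high, 'a) labelled = Label x
let public (x : 'a) : (low, 'a) labelled = Label x

(* A low observer. *)
let read_low (Label x : (low, 'a) labelled) : 'a = x
(* A high observer may read data at any level. *)
let read_high (Label x : (_, 'a) labelled) : 'a = x

let () =
  let secret = classify 42 in
  let salary = public 1000 in
  (* [read_low secret] would be a type error: no flow from high to low. *)
  ignore (read_high secret + read_low salary)
```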
{"title":"Internalizing Indistinguishability with Dependent Types","authors":"Yiyun Liu, Jonathan Chan, Jessica Shi, Stephanie Weirich","doi":"10.1145/3632886","DOIUrl":"https://doi.org/10.1145/3632886","url":null,"abstract":"In type systems with dependency tracking, programmers can assign an ordered set of levels to computations and prevent information flow from high-level computations to the low-level ones. The key notion in such systems is indistinguishability: a definition of program equivalence that takes into account the parts of the program that an observer may depend on. In this paper, we investigate the use of dependency tracking in the context of dependently-typed languages. We present the Dependent Calculus of Indistinguishability (DCOI), a system that adopts indistinguishability as the definition of equality used by the type checker. DCOI also internalizes that relation as an observer-indexed propositional equality type, so that programmers may reason about indistinguishability within the language. Our design generalizes and extends prior systems that combine dependency tracking with dependent types and is the first to support conversion and propositional equality at arbitrary observer levels. We have proven type soundness and noninterference theorems for DCOI and have developed a prototype implementation of its type checker.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"45 47","pages":"1298 - 1325"},"PeriodicalIF":1.8,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139382505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eric Zhao, Raef Maroof, Anand Dukkipati, Andrew Blinn, Zhiyi Pan, Cyrus Omar
Type systems typically only define the conditions under which an expression is well-typed, leaving ill-typed expressions formally meaningless. This approach is insufficient as the basis for language servers driving modern programming environments, which are expected to recover from simultaneously localized errors and continue to provide a variety of downstream semantic services. This paper addresses this problem, contributing the first comprehensive formal account of total type error localization and recovery: the marked lambda calculus. In particular, we define a gradual type system for expressions with marked errors, which operate as non-empty holes, together with a total procedure for marking arbitrary unmarked expressions. We mechanize the metatheory of the marked lambda calculus in Agda and implement it, scaled up, as the new basis for Hazel, a full-scale live functional programming environment with, uniquely, no meaningless editor states. The marked lambda calculus is bidirectionally typed, so localization decisions are systematically predictable based on a local flow of typing information. Constraint-based type inference can bring more distant information to bear in discovering inconsistencies but this notoriously complicates error localization. We approach this problem by deploying constraint solving as a type-hole-filling layer atop this gradual bidirectionally typed core. Errors arising from inconsistent unification constraints are localized exclusively to type and expression holes, i.e. the system identifies unfillable holes using a system of traced provenances, rather than localized in an ad hoc manner to particular expressions. The user can then interactively shift these errors to particular downstream expressions by selecting from suggested partially consistent type hole fillings, which returns control back to the bidirectional system. We implement this type hole inference system in Hazel.
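A minimal sketch of the total-marking idea, assuming a toy OCaml language with only integers, booleans and addition: checking never fails, and any inconsistency is recorded as a mark (a non-empty hole) around the offending subterm. This is the flavour of the approach, not Hazel's actual marked lambda calculus.

```ocaml
type typ = TInt | TBool | THole
type exp = Int of int | Bool of bool | Plus of exp * exp
type marked =
  | MInt of int
  | MBool of bool
  | MPlus of marked * marked
  | MMark of marked  (* non-empty hole recording a localized inconsistency *)

(* Hole types are consistent with everything, as in gradual typing. *)
let consistent t1 t2 = t1 = t2 || t1 = THole || t2 = THole

(* Marking is total: every expression gets a marked form and a type. *)
let rec synth (e : exp) : marked * typ =
  match e with
  | Int n -> (MInt n, TInt)
  | Bool b -> (MBool b, TBool)
  | Plus (a, b) ->
      let ma, ta = synth a and mb, tb = synth b in
      let wrap m t = if consistent t TInt then m else MMark m in
      (MPlus (wrap ma ta, wrap mb tb), TInt)

let () =
  (* 1 + true: the boolean operand is marked, yet checking still succeeds. *)
  match synth (Plus (Int 1, Bool true)) with
  | MPlus (_, MMark (MBool true)), TInt -> print_endline "error localized to the operand"
  | _ -> print_endline "unexpected"
```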
{"title":"Total Type Error Localization and Recovery with Holes","authors":"Eric Zhao, Raef Maroof, Anand Dukkipati, Andrew Blinn, Zhiyi Pan, Cyrus Omar","doi":"10.1145/3632910","DOIUrl":"https://doi.org/10.1145/3632910","url":null,"abstract":"Type systems typically only define the conditions under which an expression is well-typed, leaving ill-typed expressions formally meaningless. This approach is insufficient as the basis for language servers driving modern programming environments, which are expected to recover from simultaneously localized errors and continue to provide a variety of downstream semantic services. This paper addresses this problem, contributing the first comprehensive formal account of total type error localization and recovery: the marked lambda calculus. In particular, we define a gradual type system for expressions with marked errors, which operate as non-empty holes, together with a total procedure for marking arbitrary unmarked expressions. We mechanize the metatheory of the marked lambda calculus in Agda and implement it, scaled up, as the new basis for Hazel, a full-scale live functional programming environment with, uniquely, no meaningless editor states. The marked lambda calculus is bidirectionally typed, so localization decisions are systematically predictable based on a local flow of typing information. Constraint-based type inference can bring more distant information to bear in discovering inconsistencies but this notoriously complicates error localization. We approach this problem by deploying constraint solving as a type-hole-filling layer atop this gradual bidirectionally typed core. Errors arising from inconsistent unification constraints are localized exclusively to type and expression holes, i.e. the system identifies unfillable holes using a system of traced provenances, rather than localized in an ad hoc manner to particular expressions. The user can then interactively shift these errors to particular downstream expressions by selecting from suggested partially consistent type hole fillings, which returns control back to the bidirectional system. We implement this type hole inference system in Hazel.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"2 4","pages":"2041 - 2068"},"PeriodicalIF":1.8,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139384255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Angus Hammond, Zong-Chao Liu, Thibaut Pérami, Peter Sewell, Lars Birkedal, Jean Pichon-Pharabod
Very relaxed concurrency memory models, like those of the Arm-A, RISC-V, and IBM Power hardware architectures, underpin much of computing but break a fundamental intuition about programs, namely that syntactic program order and the reads-from relation always both induce order in the execution. Instead, out-of-order execution is allowed except where prevented by certain pairwise dependencies, barriers, or other synchronisation. This means that there is no notion of the 'current' state of the program, making it challenging to design (and prove sound) syntax-directed, modular reasoning methods like Hoare logics, as usable resources cannot implicitly flow from one program point to the next. We present AxSL, a separation logic for the relaxed memory model of Arm-A, that captures the fine-grained reasoning underpinning the low-overhead synchronisation mechanisms used by high-performance systems code. In particular, AxSL allows transferring arbitrary resources using relaxed reads and writes when they induce inter-thread ordering. We mechanise AxSL in the Iris separation logic framework, illustrate it on key examples, and prove it sound with respect to the axiomatic memory model of Arm-A. Our approach is largely generic in the axiomatic model and in the instruction-set semantics, offering a potential way forward for compositional reasoning for other similar models, and for the combination of production concurrency models and full-scale ISAs.
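The message-passing idiom that such program logics target can be sketched in OCaml 5 as follows; note that OCaml's Atomic operations are sequentially consistent, so this shows only the shape of the protocol (write data, then set a flag; wait for the flag, then read the data), not Arm-A's relaxed behaviour, where explicit ordering such as release/acquire is needed for the final assertion to hold.

```ocaml
(* Message passing between two domains: the sender publishes data and then a
   flag; the receiver waits for the flag and then reads the data. *)
let data = Atomic.make 0
let flag = Atomic.make false

let sender () =
  Atomic.set data 42;
  Atomic.set flag true            (* publish: data must be ordered before flag *)

let receiver () =
  while not (Atomic.get flag) do Domain.cpu_relax () done;
  Atomic.get data                 (* with sufficient ordering, this sees 42 *)

let () =
  let d = Domain.spawn sender in
  let v = receiver () in
  Domain.join d;
  assert (v = 42)
```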
{"title":"An Axiomatic Basis for Computer Programming on the Relaxed Arm-A Architecture: The AxSL Logic","authors":"Angus Hammond, Zong-Chao Liu, Thibaut Pérami, Peter Sewell, Lars Birkedal, Jean Pichon-Pharabod","doi":"10.1145/3632863","DOIUrl":"https://doi.org/10.1145/3632863","url":null,"abstract":"Very relaxed concurrency memory models, like those of the Arm-A, RISC-V, and IBM Power hardware architectures, underpin much of computing but break a fundamental intuition about programs, namely that syntactic program order and the reads-from relation always both induce order in the execution. Instead, out-of-order execution is allowed except where prevented by certain pairwise dependencies, barriers, or other synchronisation. This means that there is no notion of the 'current' state of the program, making it challenging to design (and prove sound) syntax-directed, modular reasoning methods like Hoare logics, as usable resources cannot implicitly flow from one program point to the next. We present AxSL, a separation logic for the relaxed memory model of Arm-A, that captures the fine-grained reasoning underpinning the low-overhead synchronisation mechanisms used by high-performance systems code. In particular, AxSL allows transferring arbitrary resources using relaxed reads and writes when they induce inter-thread ordering. We mechanise AxSL in the Iris separation logic framework, illustrate it on key examples, and prove it sound with respect to the axiomatic memory model of Arm-A. Our approach is largely generic in the axiomatic model and in the instruction-set semantics, offering a potential way forward for compositional reasoning for other similar models, and for the combination of production concurrency models and full-scale ISAs.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"5 41","pages":"604 - 637"},"PeriodicalIF":1.8,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139380998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xueying Qin, Liam O’Connor, Rob van Glabbeek, Peter Höfner, Ohad Kammar, Michel Steuwer
Rewriting is a versatile and powerful technique used in many domains. Strategic rewriting allows programmers to control the application of rewrite rules by composing individual rewrite rules into complex rewrite strategies. These strategies are semantically complex, as they may be nondeterministic, they may raise errors that trigger backtracking, and they may not terminate. Given such semantic complexity, it is necessary to establish a formal understanding of rewrite strategies and to enable reasoning about them in order to answer questions like: How do we know that a rewrite strategy terminates? How do we know that a rewrite strategy does not fail because we compose two incompatible rewrites? How do we know that a desired property holds after applying a rewrite strategy? In this paper, we introduce Shoggoth: a formal foundation for understanding, analysing and reasoning about strategic rewriting that is capable of answering these questions. We provide a denotational semantics of System S, a core language for strategic rewriting, and prove its equivalence to our big-step operational semantics, which extends existing work by explicitly accounting for divergence. We further define a location-based weakest precondition calculus to enable formal reasoning about rewriting strategies, and we prove this calculus sound with respect to the denotational semantics. We show how this calculus can be used in practice to reason about properties of rewriting strategies, including termination, that they are well-composed, and that desired postconditions hold. The semantics and calculus are formalised in Isabelle/HOL and all proofs are mechanised.
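To convey the objects Shoggoth reasons about, here is a small OCaml model of System-S-style strategies as partial term transformers, with sequencing, left-biased choice, and repetition; this is an illustrative sketch under simple assumptions, not Shoggoth's denotational or operational semantics.

```ocaml
type term = Num of int | Add of term * term

(* A strategy either rewrites a term or fails. *)
type strategy = term -> term option

let id : strategy = fun t -> Some t
let fail : strategy = fun _ -> None
let seq (s1 : strategy) (s2 : strategy) : strategy =
  fun t -> Option.bind (s1 t) s2
let choice (s1 : strategy) (s2 : strategy) : strategy =
  fun t -> (match s1 t with Some _ as r -> r | None -> s2 t)
let try_ (s : strategy) : strategy = choice s id

(* [repeat s] applies [s] until it fails; it diverges if [s] always succeeds,
   exactly the kind of property a weakest-precondition calculus can rule out. *)
let rec repeat (s : strategy) : strategy =
  fun t -> try_ (seq s (repeat s)) t

(* One rewrite rule: constant folding at the root. *)
let fold_add : strategy = function
  | Add (Num a, Num b) -> Some (Num (a + b))
  | _ -> None

let () =
  match repeat fold_add (Add (Num 1, Num 2)) with
  | Some (Num 3) -> print_endline "rewritten to 3"
  | _ -> print_endline "no rewrite"
```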
{"title":"Shoggoth: A Formal Foundation for Strategic Rewriting","authors":"Xueying Qin, Liam O’CONNOR, Rob van Glabbeek, Peter Höfner, Ohad Kammar, Michel Steuwer","doi":"10.1145/3633211","DOIUrl":"https://doi.org/10.1145/3633211","url":null,"abstract":"Rewriting is a versatile and powerful technique used in many domains. Strategic rewriting allows programmers to control the application of rewrite rules by composing individual rewrite rules into complex rewrite strategies. These strategies are semantically complex, as they may be nondeterministic, they may raise errors that trigger backtracking, and they may not terminate. Given such semantic complexity, it is necessary to establish a formal understanding of rewrite strategies and to enable reasoning about them in order to answer questions like: How do we know that a rewrite strategy terminates? How do we know that a rewrite strategy does not fail because we compose two incompatible rewrites? How do we know that a desired property holds after applying a rewrite strategy? In this paper, we introduce Shoggoth: a formal foundation for understanding, analysing and reasoning about strategic rewriting that is capable of answering these questions. We provide a denotational semantics of System S, a core language for strategic rewriting, and prove its equivalence to our big-step operational semantics, which extends existing work by explicitly accounting for divergence. We further define a location-based weakest precondition calculus to enable formal reasoning about rewriting strategies, and we prove this calculus sound with respect to the denotational semantics. We show how this calculus can be used in practice to reason about properties of rewriting strategies, including termination, that they are well-composed, and that desired postconditions hold. The semantics and calculus are formalised in Isabelle/HOL and all proofs are mechanised.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"50 43","pages":"61 - 89"},"PeriodicalIF":1.8,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139382205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
F. Pottier, Armaël Guéneau, Jacques-Henri Jourdan, Glen Mével
A thunk is a mutable data structure that offers a simple memoization service: it stores either a suspended computation or the result of this computation. Okasaki [1999] presents many data structures that exploit thunks to achieve good amortized time complexity. He analyzes their complexity by associating a debit with every thunk. A debit can be paid off in several increments; a thunk whose debit has been fully paid off can be forced. Quite strikingly, a debit is associated also with future thunks, which do not yet exist in memory. Some of the debit of a faraway future thunk can be transferred to a nearer future thunk. We present a complete machine-checked reconstruction of Okasaki's reasoning rules in Iris$, a rich separation logic with time credits. We demonstrate the applicability of the rules by verifying a few operations on streams as well as several of Okasaki's data structures, namely the physicist's queue, implicit queues, and the banker's queue.
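The data structure itself is easy to state; the following OCaml sketch (essentially what Stdlib.Lazy provides) shows the mutable cell the paper reasons about, while debits exist only in the logic, not in the code.

```ocaml
(* A thunk: a mutable cell holding either a suspended computation or its
   memoized result.  Forcing runs the computation at most once. *)
type 'a thunk_state = Suspended of (unit -> 'a) | Forced of 'a
type 'a thunk = 'a thunk_state ref

let delay (f : unit -> 'a) : 'a thunk = ref (Suspended f)

let force (t : 'a thunk) : 'a =
  match !t with
  | Forced v -> v
  | Suspended f ->
      let v = f () in
      t := Forced v;   (* memoize: later forces are constant-time *)
      v

let () =
  let expensive = delay (fun () -> print_endline "computing"; 6 * 7) in
  assert (force expensive = 42);   (* prints "computing" once *)
  assert (force expensive = 42)    (* already forced: no recomputation *)
```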
{"title":"Thunks and Debits in Separation Logic with Time Credits","authors":"F. Pottier, Armaël Guéneau, Jacques-Henri Jourdan, Glen Mével","doi":"10.1145/3632892","DOIUrl":"https://doi.org/10.1145/3632892","url":null,"abstract":"A thunk is a mutable data structure that offers a simple memoization service: it stores either a suspended computation or the result of this computation. Okasaki [1999] presents many data structures that exploit thunks to achieve good amortized time complexity. He analyzes their complexity by associating a debit with every thunk. A debit can be paid off in several increments; a thunk whose debit has been fully paid off can be forced. Quite strikingly, a debit is associated also with future thunks, which do not yet exist in memory. Some of the debit of a faraway future thunk can be transferred to a nearer future thunk. We present a complete machine-checked reconstruction of Okasaki's reasoning rules in Iris$, a rich separation logic with time credits. We demonstrate the applicability of the rules by verifying a few operations on streams as well as several of Okasaki's data structures, namely the physicist's queue, implicit queues, and the banker's queue.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"31 3","pages":"1482 - 1508"},"PeriodicalIF":1.8,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139382935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yuantian Ding, Xiaokang Qiu
Syntax-guided synthesis has been a prevalent theme in various computer-aided programming systems. However, the domain of bit-vector synthesis poses several unique challenges that have not yet been sufficiently addressed and resolved. In this paper, we propose a novel synthesis approach that incorporates a distinct enumeration strategy based on various factors. Technically, this approach weighs in subexpression recurrence by term-graph-based enumeration, avoids useless candidates by example-guided filtration, and prioritizes valuable components identified by large language models. This approach also incorporates a bottom-up deduction step to enhance the enumeration algorithm by considering subproblems that contribute to the deductive resolution. We implement all the enhanced enumeration techniques in our SyGuS solver DryadSynth, which outperforms state-of-the-art solvers in terms of the number of solved problems, execution time, and solution size. Notably, DryadSynth successfully solved 31 synthesis problems for the first time, including 5 renowned Hacker's Delight problems.
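For a sense of the search targets, here is one classic Hacker's Delight bit manipulation written as plain OCaml; a SyGuS solver such as DryadSynth searches for loop-free compositions of this kind from a grammar and a specification (this particular function is merely illustrative, not one of the paper's benchmarks).

```ocaml
(* Clear the lowest set bit of [x]: a loop-free composition of bitwise
   operations of the kind bit-vector synthesis aims to discover. *)
let clear_lowest_set_bit (x : int) : int = x land (x - 1)

let () =
  assert (clear_lowest_set_bit 0b1011000 = 0b1010000);
  assert (clear_lowest_set_bit 1 = 0)
```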
{"title":"Enhanced Enumeration Techniques for Syntax-Guided Synthesis of Bit-Vector Manipulations","authors":"Yuantian Ding, Xiaokang Qiu","doi":"10.1145/3632913","DOIUrl":"https://doi.org/10.1145/3632913","url":null,"abstract":"Syntax-guided synthesis has been a prevalent theme in various computer-aided programming systems. However, the domain of bit-vector synthesis poses several unique challenges that have not yet been sufficiently addressed and resolved. In this paper, we propose a novel synthesis approach that incorporates a distinct enumeration strategy based on various factors. Technically, this approach weighs in subexpression recurrence by term-graph-based enumeration, avoids useless candidates by example-guided filtration, prioritizes valuable components identified by large language models. This approach also incorporates a bottom-up deduction step to enhance the enumeration algorithm by considering subproblems that contribute to the deductive resolution. We implement all the enhanced enumeration techniques in our SyGuS solver DryadSynth, which outperforms state-of-the-art solvers in terms of the number of solved problems, execution time, and solution size. Notably, DryadSynth successfully solved 31 synthesis problems for the first time, including 5 renowned Hacker's Delight problems.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"46 13","pages":"2129 - 2159"},"PeriodicalIF":1.8,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139382479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
L. Parreaux, Aleksander Boruch-Gruszecki, Andong Fan, Chun Yin Chau
Type inference in the presence of first-class or “impredicative” second-order polymorphism à la System F has been an active research area for several decades, with original works dating back to the end of the 80s. Yet, until now many basic problems remain open, such as how to type check expressions like (λx. (x 123, x True)) id reliably. We show that a type inference approach based on multi-bounded polymorphism, a form of implicit polymorphic subtyping with multiple lower and upper bounds, can help us resolve most of these problems in a uniquely simple and regular way. We define F≤, a declarative type system derived from the existing theory of implicit coercions by Cretin and Rémy (LICS 2014), and we introduce SuperF, a novel algorithm to infer polymorphic multi-bounded F≤ types while checking user type annotations written in the syntax of System F. We use a recursion-avoiding heuristic to guarantee termination of type inference at the cost of rejecting some valid programs, a restriction which thankfully rarely triggers in practice. We show that SuperF is vastly more powerful than all first-class-polymorphic type inference systems proposed so far, significantly advancing the state of the art in type inference for general-purpose programming languages.
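The paper's running example requires its argument to be used at two different types, which Hindley–Milner inference rejects; in OCaml today one must introduce the polymorphism explicitly, for instance through a record with a universally quantified field, as in the sketch below (the names are ad hoc). SuperF aims to infer such types without annotations.

```ocaml
(* (fun x -> (x 123, x true)) id, made typeable by giving x an explicitly
   polymorphic type via a record field. *)
type id_fn = { apply : 'a. 'a -> 'a }

let use (x : id_fn) = (x.apply 123, x.apply true)

let () =
  let p = use { apply = (fun v -> v) } in
  assert (p = (123, true))
```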
{"title":"When Subtyping Constraints Liberate: A Novel Type Inference Approach for First-Class Polymorphism","authors":"L. Parreaux, Aleksander Boruch-Gruszecki, Andong Fan, Chun Yin Chau","doi":"10.1145/3632890","DOIUrl":"https://doi.org/10.1145/3632890","url":null,"abstract":"Type inference in the presence of first-class or “impredicative” second-order polymorphism à la System F has been an active research area for several decades, with original works dating back to the end of the 80s. Yet, until now many basic problems remain open, such as how to type check expressions like (𝜆𝑥. (𝑥 123, 𝑥 True)) id reliably. We show that a type inference approach based on multi-bounded polymorphism, a form of implicit polymorphic subtyping with multiple lower and upper bounds, can help us resolve most of these problems in a uniquely simple and regular way. We define F{≤}, a declarative type system derived from the existing theory of implicit coercions by Cretin and Rémy (LICS 2014), and we introduce SuperF, a novel algorithm to infer polymorphic multi-bounded F{≤} types while checking user type annotations written in the syntax of System F. We use a recursion-avoiding heuristic to guarantee termination of type inference at the cost of rejecting some valid programs, which thankfully rarely triggers in practice. We show that SuperF is vastly more powerful than all first-class-polymorphic type inference systems proposed so far, significantly advancing the state of the art in type inference for general-purpose programming languages.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"12 9","pages":"1418 - 1450"},"PeriodicalIF":1.8,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139383093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}