From Capabilities to Regions: Enabling Efficient Compilation of Lexical Effect Handlers
Marius Müller, Philipp Schuster, Jonathan Lindegaard Starup, Klaus Ostermann, Jonathan Immanuel Brachthäuser
Effect handlers are a high-level abstraction that enables programmers to use effects in a structured way. They have gained a lot of popularity within academia and subsequently also in industry. However, the abstraction often comes with a significant runtime cost and there has been intensive research recently on how to reduce this price. A promising approach in this regard is to implement effect handlers using a CPS translation and to provide sufficient information about the nesting of handlers. With this information the CPS translation can decide how effects have to be lifted through handlers, i.e., which handlers need to be skipped, in order to handle the effect at the correct place. A structured way to make this information available is to use a calculus with a region system and explicit subregion evidence. Such calculi, however, are quite verbose, which makes them impractical to use as a source-level language. We present a method to infer the lifting information for a calculus underlying a source-level language. This calculus uses second-class capabilities for the safe use of effects. To do so, we define a typed translation to a calculus with regions and evidence and we show that this lift-inference translation is typability- and semantics-preserving. On the one hand, this exposes the precise relation between the second-class property and the structure given by regions. On the other hand, it closes a gap in a compiler pipeline enabling efficient compilation of the source-level language. We have implemented lift inference in this compiler pipeline and conducted benchmarks which indicate that the approach is indeed working.
Proceedings of the ACM on Programming Languages, 2023. https://doi.org/10.1145/3622831
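To make the lifting problem above concrete, here is a deliberately small Python toy (our own illustration, not the paper's calculus or its compiler pipeline): handlers live on a stack, and each effect call carries an explicit lift count, the "evidence", saying how many enclosing handlers to skip before handling. Real handlers also capture continuations; this sketch only shows the dispatch-by-evidence aspect.

def handle(handler, body, stack):
    """Install `handler`, run `body` with the extended stack, then uninstall."""
    stack.append(handler)
    try:
        return body(stack)
    finally:
        stack.pop()

def perform(stack, lift, payload):
    """Dispatch to the handler `lift` frames outside the innermost one."""
    return stack[-1 - lift](payload)

outer = lambda p: f"outer handled {p!r}"
inner = lambda p: f"inner handled {p!r}"

result = handle(outer,
                lambda s: handle(inner,
                                 # lift = 1: skip `inner`, reach `outer`
                                 lambda s2: perform(s2, 1, "op"),
                                 s),
                [])
print(result)  # outer handled 'op'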
MemPerf: Profiling Allocator-Induced Performance Slowdowns
Jin Zhou, Sam Silvestro, Steven (Jiaxun) Tang, Hanmei Yang, Hongyu Liu, Guangming Zeng, Bo Wu, Cong Liu, Tongping Liu
The memory allocator plays a key role in the performance of applications, but none of the existing profilers can pinpoint performance slowdowns caused by a memory allocator. Consequently, programmers may spend time improving application code incorrectly or unnecessarily, achieving low or no performance improvement. This paper designs the first profiler—MemPerf—to identify allocator-induced performance slowdowns without comparing against another allocator. Based on the key observation that an allocator may impact the whole life-cycle of heap objects, including the accesses (or uses) of these objects, MemPerf proposes a life-cycle based detection to identify slowdowns caused by slow memory management operations and slow accesses separately. For the former, MemPerf proposes thread-aware and type-aware performance modeling to identify slow management operations. For slow memory accesses, MemPerf utilizes a top-down approach to identify all possible reasons for slow memory accesses introduced by the allocator, mainly due to cache and TLB misses, and further proposes a unified method to identify them correctly and efficiently. Based on our extensive evaluation, MemPerf reports 98% of medium and large allocator-induced slowdowns (larger than 5%) correctly without reporting any false positives. MemPerf also pinpoints multiple known and unknown design issues in widely-used allocators.
Proceedings of the ACM on Programming Languages, 2023. https://doi.org/10.1145/3622848
Inductive Program Synthesis Guided by Observational Program Similarity
Jack Feser, Işıl Dillig, Armando Solar-Lezama
We present a new general-purpose synthesis technique for generating programs from input-output examples. Our method, called metric program synthesis, relaxes the observational equivalence idea (used widely in bottom-up enumerative synthesis) into a weaker notion of observational similarity, with the goal of reducing the search space that the synthesizer needs to explore. Our method clusters programs into equivalence classes based on an expert-provided distance metric and constructs a version space that compactly represents “approximately correct” programs. Then, given a “close enough” program sampled from this version space, our approach uses a distance-guided repair algorithm to find a program that exactly matches the given input-output examples. We have implemented our proposed metric program synthesis technique in a tool called SyMetric and evaluate it in three different domains considered in prior work. Our evaluation shows that SyMetric outperforms other domain-agnostic synthesizers that use observational equivalence and that it achieves results competitive with domain-specific synthesizers that are either designed for or trained on those domains.
Proceedings of the ACM on Programming Languages, 2023. https://doi.org/10.1145/3622830
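A toy Python rendering of the two-phase idea described above (ours, not SyMetric or its benchmark domains): candidate programs are ranked by an output-distance metric on the examples, a "close enough" candidate is selected, and distance-guided repair nudges it into an exact match. Here a program is just a pair (a, b) denoting a*x + b, and the candidate space deliberately excludes the answer so that repair has work to do.

from itertools import product

# Input-output examples; the intended program is 2*x + 3.
EXAMPLES = [(1, 5), (2, 7), (3, 9)]

def run(prog, x):
    a, b = prog                 # a program is a pair (a, b) meaning a*x + b
    return a * x + b

def distance(prog):
    """Expert-provided metric: total absolute error on the examples."""
    return sum(abs(run(prog, x) - y) for x, y in EXAMPLES)

# Bottom-up enumeration of a small space that does NOT contain the answer.
candidates = list(product(range(3), repeat=2))        # a, b in {0, 1, 2}

# Observational similarity: keep the "approximately correct" candidates.
close_enough = [p for p in candidates if distance(p) <= 5]

def repair(prog):
    """Distance-guided repair: greedily nudge constants while the metric improves."""
    while distance(prog) > 0:
        neighbours = [(prog[0] + da, prog[1] + db)
                      for da in (-1, 0, 1) for db in (-1, 0, 1)
                      if (da, db) != (0, 0)]
        best_next = min(neighbours, key=distance)
        if distance(best_next) >= distance(prog):
            break                                     # stuck on a plateau; give up
        prog = best_next
    return prog

best = min(close_enough, key=distance)                # (2, 2), off by one
print(repair(best))                                   # (2, 3): exact match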
AtomiS: Data-Centric Synchronization Made Practical
Hervé Paulino, Ana Almeida Matos, Jan Cederquist, Marco Giunti, João Matos, António Ravara
Data-Centric Synchronization (DCS) shifts the reasoning about concurrency restrictions from control structures to data declaration. It is a high-level declarative approach that abstracts away from the actual concurrency control mechanism(s) in use. Despite its advantages, the practical use of DCS is hindered by the fact that it may require many annotations and/or multiple implementations of the same method to cope with differently qualified parameters. To overcome these limitations, in this paper we present AtomiS, a new DCS approach that requires only qualifying the types of parameters and return values in interface definitions, and of fields in class definitions. The latter may also be abstracted away in type parameters, rendering class implementations virtually annotation-free. From this high-level specification, a static analysis infers the atomicity constraints that are local to each method, considering valid only the method variants that are consistent with the specification, and performs code generation for all valid variants of each method. The generated code is then the target for automatic injection of concurrency control primitives that are responsible for ensuring the absence of data races, atomicity violations and deadlocks. We provide a Java implementation and showcase the applicability of AtomiS in real-life code. For the benchmarks analysed, AtomiS requires fewer annotations than the original number of regions requiring locks, as well as fewer annotations than Atomic Sets (a reference DCS proposal).
Proceedings of the ACM on Programming Languages, 2023. https://doi.org/10.1145/3622801
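AtomiS itself targets Java and infers the constraints statically, but the flavour of data-centric synchronization can be sketched dynamically in a few lines of Python (the decorator and class below are our own illustration, not AtomiS output): the synchronization requirement is stated once, with the data, and the locking code is injected mechanically rather than written at every use site.

import threading
from functools import wraps

def synchronized(cls):
    """Guard every public method of `cls` with a per-instance lock."""
    def locked(method):
        @wraps(method)
        def wrapper(self, *args, **kwargs):
            with self._lock:
                return method(self, *args, **kwargs)
        return wrapper

    for name, member in list(vars(cls).items()):
        if callable(member) and not name.startswith("_"):
            setattr(cls, name, locked(member))

    original_init = cls.__init__
    @wraps(original_init)
    def __init__(self, *args, **kwargs):
        self._lock = threading.RLock()
        original_init(self, *args, **kwargs)
    cls.__init__ = __init__
    return cls

@synchronized
class Account:
    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount   # mutually exclusive with other methods; no explicit locking here

    def withdraw(self, amount):
        self.balance -= amount

acc = Account(10)
acc.deposit(5)
print(acc.balance)               # 15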
Counterexample Driven Quantifier Instantiations with Applications to Distributed Protocols
Orr Tamir, Marcelo Taube, Kenneth L. McMillan, Sharon Shoham, Jon Howell, Guy Gueta, Mooly Sagiv
Formally verifying infinite-state systems can be a daunting task, especially when it comes to reasoning about quantifiers. In particular, quantifier alternations in conjunction with function symbols can create function cycles that result in infinitely many ground terms, making it difficult for solvers to instantiate quantifiers and causing them to diverge. This can leave users with no useful information on how to proceed. To address this issue, we propose an interactive verification methodology that uses a relational abstraction technique to mitigate solver divergence in the presence of quantifiers. This technique abstracts functions in the verification conditions (VCs) as one-to-one relations, which avoids the creation of function cycles and the resulting proliferation of ground terms. Relational abstraction is sound and guarantees correctness if the solver cannot find counter-models. However, it may also lead to false counterexamples, which can be addressed by refining the abstraction and requiring the existence of corresponding elements. In the domain of distributed protocols, we can refine the abstraction by diagnosing counterexamples and manually instantiating elements in the range of the original function. If the verification conditions are correct, there always exist finitely many refinement steps that eliminate all spurious counter-models, making the approach complete. We applied this approach in Ivy to verify the safety properties of consensus protocols and found that: (1) most verification goals can be automatically verified using relational abstraction, while SMT solvers often diverge when given the original VC, (2) only a few manual instantiations were needed, and the counterexamples provided valuable guidance for the user compared to timeouts produced by the traditional approach, and (3) the technique can be used to derive efficient low-level implementations of tricky algorithms.
Proceedings of the ACM on Programming Languages, 2023. https://doi.org/10.1145/3622864
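The basic shape of the relational abstraction can be illustrated on a schematic verification condition (our notation; the paper's abstraction additionally requires the introduced relations to be one-to-one and refines them on demand):

    ∀x. P(f(x))   ⇝   ∀x ∀y. R_f(x, y) ⇒ P(y)

If R_f is interpreted as the graph of f, the right-hand side is equivalent to the left-hand side, so proving the abstracted condition with R_f uninterpreted is sound. Because R_f need not be total or functional, however, the solver may also report counter-models that correspond to no real execution, which is exactly what the refinement step described above repairs.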
Structural Subtyping as Parametric Polymorphism
Wenhao Tang, Daniel Hillerström, James McKinna, Michel Steuwer, Ornela Dardha, Rongxiao Fu, Sam Lindley
Structural subtyping and parametric polymorphism provide similar flexibility and reusability to programmers. For example, both features enable the programmer to provide a wider record as an argument to a function that expects a narrower one. However, the means by which they do so differs substantially, and the precise details of the relationship between them exist, at best, as folklore in the literature. In this paper, we systematically study the relative expressive power of structural subtyping and parametric polymorphism. We focus our investigation on establishing the extent to which parametric polymorphism, in the form of row and presence polymorphism, can encode structural subtyping for variant and record types. We base our study on various Church-style λ-calculi extended with records and variants, different forms of structural subtyping, and row and presence polymorphism. We characterise expressiveness by exhibiting compositional translations between calculi. For each translation we prove a type preservation and operational correspondence result. We also prove a number of non-existence results. By imposing restrictions on both source and target types, we reveal further subtleties in the expressiveness landscape, the restrictions enabling otherwise impossible translations to be defined. More specifically, we prove that full subtyping cannot be encoded via polymorphism, but we show that several restricted forms of subtyping can be encoded via particular forms of polymorphism.
Proceedings of the ACM on Programming Languages, 2023. https://doi.org/10.1145/3622836
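The record example from the abstract, spelled out in standard notation (not necessarily the paper's): with width subtyping, a function f : {x : Int} → Int accepts the wider record {x = 1, y = 2} because {x : Int, y : Int} <: {x : Int}. With row polymorphism, the same call needs no subsumption: f instead gets the type ∀ρ. {x : Int; ρ} → Int, and the row variable ρ is instantiated with the leftover field (y : Int). The paper's translations make this intuition precise and delimit where it fails.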
Concrete Type Inference for Code Optimization using Machine Learning with SMT Solving
Fangke Ye, Jisheng Zhao, Jun Shirako, Vivek Sarkar
Despite the widespread popularity of dynamically typed languages such as Python, it is well known that they pose significant challenges to code optimization due to the lack of concrete type information. To overcome this limitation, many ahead-of-time optimizing compiler approaches for Python rely on programmers to provide optional type information as a prerequisite for extensive code optimization. Since few programmers provide this information, a large majority of Python applications are executed without the benefit of code optimization, thereby contributing collectively to a significant worldwide wastage of compute and energy resources. In this paper, we introduce a new approach to concrete type inference that is shown to be effective in enabling code optimization for dynamically typed languages, without requiring the programmer to provide any type information. We explore three kinds of type inference algorithms in our approach based on: 1) machine learning models including GPT-4, 2) constraint-based inference based on SMT solving, and 3) a combination of 1) and 2). Our approach then uses the output from type inference to generate multi-version code for a bounded number of concrete type options, while also including a catch-all untyped version for the case when no match is found. The typed versions are then amenable to code optimization. Experimental results show that the combined algorithm in 3) delivers far better precision and performance than the separate algorithms for 1) and 2). The performance improvement due to type inference, in terms of geometric mean speedup across all benchmarks compared to standard Python, when using 3) is 26.4× with Numba as an AOT optimizing back-end and 62.2× with the Intrepydd optimizing compiler as a back-end. These vast performance improvements can have a significant impact on programmers’ productivity, while also reducing their applications’ use of compute and energy resources.
Proceedings of the ACM on Programming Languages, 2023. https://doi.org/10.1145/3622825
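A hand-written Python sketch of the multi-version output described above (the function names and the runtime guard are ours; in the paper the variants are generated automatically and the typed ones are handed to Numba or Intrepydd): one specialized variant per inferred concrete typing plus an untyped catch-all, with a dispatcher in front.

def dot_typed(xs, ys):
    # Specialized variant for one inferred concrete typing: two equal-length
    # lists of floats. This is the version an optimizing back-end can compile.
    acc = 0.0
    for i in range(len(xs)):
        acc += xs[i] * ys[i]
    return acc

def dot_untyped(xs, ys):
    # Catch-all variant: the original dynamically typed code, no assumptions.
    return sum(x * y for x, y in zip(xs, ys))

def dot(xs, ys):
    # Dispatcher emitted alongside the variants; a real implementation would
    # use far cheaper runtime checks than scanning every element.
    if (isinstance(xs, list) and isinstance(ys, list) and len(xs) == len(ys)
            and all(isinstance(v, float) for v in xs)
            and all(isinstance(v, float) for v in ys)):
        return dot_typed(xs, ys)
    return dot_untyped(xs, ys)

print(dot([1.0, 2.0], [3.0, 4.0]))   # 11.0, via the typed fast path
print(dot([1, 2], [3, 4]))           # 11, via the untyped catch-all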
Adventure of a Lifetime: Extract Method Refactoring for Rust
Sewen Thy, Andreea Costea, Kiran Gopinathan, Ilya Sergey
We present a design and implementation of the automated "Extract Method" refactoring for Rust programs. Even though Extract Method is one of the most well-studied and most widely used automated refactorings in practice, featured in all major IDEs for all popular programming languages, implementing it soundly for Rust is surprisingly non-trivial due to the restrictions of Rust's ownership and lifetime-based type system. In this work, we provide a systematic decomposition of the Extract Method refactoring for Rust programs into a series of program transformations, each concerned with satisfying a particular aspect of Rust type safety, eventually producing a well-typed Rust program. Our key discovery is the formulation of Extract Method as a composition of naive function hoisting and a series of automated program repair procedures that progressively make the resulting program "more well-typed" by relying on the corresponding repair oracles. Those oracles include a novel static intra-procedural ownership analysis that infers correct sharing annotations for the extracted function's parameters, and the lifetime checker of rustc, Rust's reference compiler. We implemented our approach in a tool called REM---an automated Extract Method refactoring built on top of the IntelliJ IDEA plugin for Rust. Our extensive evaluation on a corpus of changes in five popular Rust projects shows that REM (a) can extract a larger class of feature-rich code fragments into semantically correct functions than other existing refactoring tools, (b) can reproduce method extractions performed manually by human developers in the past, and (c) is efficient enough to be used in interactive development.
Proceedings of the ACM on Programming Languages, 2023. https://doi.org/10.1145/3622821
Gradual Typing for Effect Handlers
Max S. New, Eric Giovannini, Daniel R. Licata
We present a gradually typed language, GrEff, with effects and handlers that supports migration from unchecked to checked effect typing. This serves as a simple model of the integration of an effect typing discipline with an existing effectful typed language that does not track fine-grained effect information. Our language supports a simple module system to model the programming model of gradual migration from unchecked to checked effect typing in the style of Typed Racket. The surface language GrEff is given semantics by elaboration to a core language Core GrEff. We equip Core GrEff with an inequational theory for reasoning about the semantic error ordering and desired program equivalences for programming with effects and handlers. We derive an operational semantics for the language from the equations provable in the theory. We then show that the theory is sound by constructing an operational logical relations model to prove the graduality theorem. This extends prior work on embedding-projection pair models of gradual typing to handle effect typing and subtyping.
Proceedings of the ACM on Programming Languages, 2023. https://doi.org/10.1145/3622860
Historia: Refuting Callback Reachability with Message-History Logics
Shawn Meier, Sergio Mover, Gowtham Kaki, Bor-Yuh Evan Chang
This paper considers the callback reachability problem --- determining if a callback can be called by an event-driven framework in an unexpected state. Event-driven programming frameworks are pervasive for creating user-interactive applications (apps) on just about every modern platform. Control flow between callbacks is determined by the framework and largely opaque to the programmer. This opacity of the callback control flow not only causes difficulty for the programmer but also complicates the development of static analyses. Previous static analysis techniques address this opacity either by assuming an arbitrary framework implementation or by attempting to eagerly specify all possible callback control flow, but this is either too coarse to prove properties requiring callback-ordering constraints or too burdensome and tricky to get right. Instead, we present a middle way where the callback control flow can be gradually refined in a targeted manner to prove assertions of interest. The key insight behind this middle way is to reason about the history of method invocations at the boundary between app and framework code --- enabling a decoupling of the specification of callback control flow from the analysis of app code. We call the sequence of such boundary-method invocations message histories and develop message-history logics to do this reasoning. In particular, we define the notion of an application-only transition system with boundary transitions, a message-history program logic for programs with such transitions, and a temporal specification logic for capturing callback control flow in a targeted and compositional manner. Then, to utilize the logics in a goal-directed verifier, we define a way to combine, after the fact, an assertion about message histories with a specification of callback control flow. We implemented a prototype message history-based verifier called Historia and provide evidence that our approach is uniquely capable of distinguishing between buggy and fixed versions on challenging examples drawn from real-world issues and that our targeted specification approach enables proving the absence of multi-callback bug patterns in real-world open-source Android apps.
Proceedings of the ACM on Programming Languages, 2023. https://doi.org/10.1145/3622865
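As a minimal illustration of what reasoning about message histories means (our toy, not Historia's logics, and the listener method names are hypothetical rather than Android's actual API surface): the boundary calls between app and framework are recorded as a sequence, and a callback-reachability assertion is checked against that sequence.

# A message history is a list of boundary calls from the app into the framework.
# Assertion: the onClick callback is reachable only if a listener was registered
# and not cleared afterwards.

def click_reachable(history):
    armed = False
    for call in history:
        if call == "setOnClickListener":
            armed = True
        elif call == "clearOnClickListener":
            armed = False
    return armed

assert click_reachable(["setOnClickListener"])
assert not click_reachable(["setOnClickListener", "clearOnClickListener"])
print("both checks pass")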