Unboxed Data Constructors: Or, How cpp Decides a Halting Problem

Proceedings of the ACM on Programming Languages (IF 2.8, Q2, Computer Science, Software Engineering). Pub Date: 2024-01-05. DOI: 10.1145/3632893

Nicolas Chataing, Stephen Dolan, Gabriel Scherer, J. Yallop


Abstract

We propose a new language feature for ML-family languages: the ability to selectively unbox certain data constructors, so that their runtime representation is compiled away to just the identity on their argument. Unboxing must be statically rejected when it could introduce confusion, that is, distinct values with the same representation. We discuss the use case of big numbers, where unboxing makes it possible to write code that is both efficient and safe, replacing either a safe but slow version or a fast but unsafe version. We explain the static analysis necessary to reject incorrect unboxing requests. We present our prototype implementation of this feature for the OCaml programming language, and discuss several design choices and the interaction with advanced features such as Guarded Algebraic Datatypes. Our static analysis requires expanding type definitions in type expressions, which is not necessarily normalizing in the presence of recursive type definitions. In other words, we must decide normalization of terms in the first-order λ-calculus with recursion. We provide an algorithm to detect non-termination on-the-fly during reduction, with proofs of correctness and completeness. Our algorithm turns out to be closely related to the normalization strategy for macro expansion in the cpp preprocessor.
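The idea of detecting non-termination while expanding type definitions can be illustrated with a small OCaml sketch. This is not the paper's algorithm, only a simplified illustration in its spirit: every name here (`ty`, `def`, `instantiate`, `head_normalize`, and the toy definitions `id` and `loop`) is hypothetical. Each node records which definitions were expanded to produce it, much as cpp refuses to re-expand a macro inside its own expansion, and definitions are assumed to be applied to the right number of arguments.

```ocaml
module Names = Set.Make (String)

(* Type expressions: variables, or a named constructor applied to arguments.
   Each application carries the set of definitions whose expansion produced it. *)
type ty =
  | Var of string
  | App of Names.t * string * ty list

(* A first-order type abbreviation: type (params) name = body *)
type def = { params : string list; body : ty }

(* Build a source-level node, not yet produced by any expansion. *)
let app name args = App (Names.empty, name, args)

(* Instantiate a definition body. Nodes coming from the body are stamped
   with [trace]; substituted arguments keep their own stamps. *)
let rec instantiate trace env ty =
  match ty with
  | Var x -> (match List.assoc_opt x env with Some arg -> arg | None -> ty)
  | App (_, name, args) ->
    App (trace, name, List.map (instantiate trace env) args)

type outcome =
  | Head of ty       (* head constructor is not an abbreviation *)
  | Loops of string  (* this definition re-appeared in its own expansion *)

(* Expand the head constructor until it is not an abbreviation, or until a
   definition shows up at the head of its own expansion. *)
let rec head_normalize defs ty =
  match ty with
  | Var _ -> Head ty
  | App (trace, name, args) ->
    (match List.assoc_opt name defs with
     | None -> Head ty
     | Some _ when Names.mem name trace -> Loops name
     | Some { params; body } ->
       (* assumes args and params have the same length *)
       let env = List.combine params args in
       head_normalize defs (instantiate (Names.add name trace) env body))

(* Toy definitions: type 'a id = 'a   and   type 'a loop = 'a loop *)
let defs =
  [ ("id", { params = [ "a" ]; body = Var "a" });
    ("loop", { params = [ "a" ]; body = app "loop" [ Var "a" ] }) ]

let int_ty = app "int" []
let ok = head_normalize defs (app "id" [ app "id" [ int_ty ] ])
let bad = head_normalize defs (app "loop" [ int_ty ])
```

In this sketch `ok` reaches the head `int`, because the inner `id` is an argument and keeps its own empty stamp, while `bad` is reported as looping on `loop`. Stamping only the nodes that come from a definition body, and not the substituted arguments, is what keeps nested uses of the same abbreviation from being flagged as cycles.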
