Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation: latest articles


FlexVec: auto-vectorization for irregular loops


Pub Date : 2016-06-02 DOI: 10.1145/2908080.2908111

Sara S. Baghsorkhi, N. Vasudevan, Youfeng Wu

Traditional vectorization techniques build a dependence graph with distance and direction information to determine whether a loop is vectorizable. Since vectorization reorders the execution of instructions across iterations, instructions involved in a strongly connected component (SCC) are in general deemed not vectorizable unless the SCC can be eliminated using techniques such as scalar expansion or privatization. Therefore, traditional vectorization techniques are limited in their ability to efficiently handle loops with dynamic cross-iteration dependencies or complex control flow interwoven within the dependence cycles. When potential dependencies do not occur very often, the end result is underutilization of the SIMD hardware. In this paper, we propose the FlexVec architecture, which combines new vector instructions with novel code generation techniques to dynamically adjust the vector length for loop statements affected by cross-iteration dependencies that occur at runtime. We have designed and implemented FlexVec's new ISA as extensions to the recently released AVX-512 ISA. We have evaluated the performance improvements enabled by FlexVec vectorization for 11 C/C++ SPEC 2006 benchmarks and 7 real applications, with AVX-512 vectorization as the baseline. We show that the FlexVec vectorization technique produces a geometric-mean speedup of 9% for SPEC 2006 and a geometric-mean speedup of 11% for the 7 real applications.
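The idea of shrinking the vector length when a dependence shows up at runtime can be illustrated with a small software simulation. The sketch below is not FlexVec's ISA or code generator; it is a plain Python model, assuming an indirect update loop `out[idx[i]] += val[i]`, and the names `first_conflict` and `partial_vector_update` are hypothetical. Each chunk of lanes is executed as one "vector" step only up to the first lane whose index repeats an earlier lane in the same chunk; the next chunk starts at that lane.

```python
def scalar_update(out, idx, val):
    """Reference loop: out[idx[i]] += val[i], one iteration at a time."""
    for i, v in zip(idx, val):
        out[i] += v
    return out


def first_conflict(lanes):
    """Position of the first lane whose index repeats an earlier lane, or len(lanes)."""
    seen = set()
    for pos, lane in enumerate(lanes):
        if lane in seen:
            return pos
        seen.add(lane)
    return len(lanes)


def partial_vector_update(out, idx, val, vector_length=8):
    """Simulate chunked execution whose effective vector length shrinks at a conflict."""
    start = 0
    chunk_sizes = []
    while start < len(idx):
        lanes = idx[start:start + vector_length]
        safe = first_conflict(lanes)  # lanes [0, safe) touch distinct elements; safe >= 1
        gathered = [out[i] for i in lanes[:safe]]                  # "gather"
        summed = [g + v for g, v in zip(gathered, val[start:start + safe])]  # "vector add"
        for i, s in zip(lanes[:safe], summed):                     # "scatter"
            out[i] = s
        chunk_sizes.append(safe)
        start += safe  # the conflicting lane begins the next chunk
    return out, chunk_sizes


if __name__ == "__main__":
    idx = [0, 1, 2, 1, 3, 4, 5, 6, 7, 0]
    val = [1] * len(idx)
    out, sizes = partial_vector_update([0] * 8, idx, val, vector_length=4)
    print(out, sizes)
```

With few repeated indices most chunks run at the full vector length, which is the situation the abstract describes as underutilized by conservative vectorizers.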


Citations: 25

Input responsiveness: using canary inputs to dynamically steer approximation


Pub Date : 2016-06-02 DOI: 10.1145/2908080.2908087

M. Laurenzano, Parker Hill, M. Samadi, S. Mahlke, Jason Mars, Lingjia Tang

This paper introduces Input Responsive Approximation (IRA), an approach that uses a canary input — a small program input carefully constructed to capture the intrinsic properties of the original input — to automatically control how program approximation is applied on an input-by-input basis. Motivating this approach is the observation that many of the prior techniques focusing on choosing how to approximate arrive at conservative decisions by discounting substantial differences between inputs when applying approximation. The main challenges in overcoming this limitation lie in making the choice of how to approximate both effectively (e.g., the fastest approximation that meets a particular accuracy target) and rapidly for every input. With IRA, each time the approximate program is run, a canary input is constructed and used dynamically to quickly test a spectrum of approximation alternatives. Based on these runtime tests, the approximation that best fits the desired accuracy constraints is selected and applied to the full input to produce an approximate result. We use IRA to select and parameterize mixes of four approximation techniques from the literature for a range of 13 image processing, machine learning, and data mining applications. Our results demonstrate that IRA significantly outperforms prior approaches, delivering an average of 10.2× speedup over exact execution while minimizing accuracy losses in program outputs.
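The selection loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the IRA implementation: the approximation is loop perforation of a mean computation, the canary is built by uniform subsampling (the paper constructs canaries more carefully), and all function names are hypothetical.

```python
def exact_mean(xs):
    """Exact computation used as the accuracy reference."""
    return sum(xs) / len(xs)


def perforated_mean(xs, stride):
    """Loop perforation: visit only every `stride`-th element."""
    sample = xs[::stride]
    return sum(sample) / len(sample)


def make_canary(xs, size):
    """Build a small stand-in input by uniform subsampling (an assumption)."""
    step = max(1, len(xs) // size)
    return xs[::step][:size]


def choose_stride(xs, strides, max_rel_error, canary_size=64):
    """Test each candidate on the canary and keep the largest stride within the error bound."""
    canary = make_canary(xs, canary_size)
    reference = exact_mean(canary)
    best = 1  # stride 1 is the exact computation
    for stride in sorted(strides):
        approx = perforated_mean(canary, stride)
        error = abs(approx - reference) / max(abs(reference), 1e-12)
        if error <= max_rel_error:
            best = stride
    return best


if __name__ == "__main__":
    data = list(range(1000))
    stride = choose_stride(data, [1, 2, 4, 8], max_rel_error=0.05)
    # The chosen approximation is then applied to the full input.
    print(stride, perforated_mean(data, stride))
```

Because the tests run on the canary rather than the full input, the cost of trying several alternatives stays small relative to the exact computation.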


Citations: 67

Automatically learning shape specifications


Pub Date : 2016-06-02 DOI: 10.1145/2908080.2908125

He Zhu, G. Petri, S. Jagannathan

This paper presents a novel automated procedure for discovering expressive shape specifications for sophisticated functional data structures. Our approach extracts potential shape predicates based on the definition of constructors of arbitrary user-defined inductive data types, and combines these predicates within an expressive first-order specification language using a lightweight data-driven learning procedure. Notably, this technique requires no programmer annotations, and is equipped with a type-based decision procedure to verify the correctness of discovered specifications. Experimental results indicate that our implementation is both efficient and effective, capable of automatically synthesizing sophisticated shape specifications over a range of complex data types, going well beyond the scope of existing solutions.


Citations: 27

Transactional data structure libraries


Pub Date : 2016-06-02 DOI: 10.1145/2908080.2908112

A. Spiegelman, Guy Golan-Gueta, I. Keidar

We introduce transactions into libraries of concurrent data structures; such transactions can be used to ensure atomicity of sequences of data structure operations. By focusing on transactional access to a well-defined set of data structure operations, we strike a balance between the ease of programming of transactions and the efficiency of custom-tailored data structures. We exemplify this concept by designing and implementing a library supporting transactions on any number of maps, sets (implemented as skiplists), and queues. Our library offers efficient and scalable transactions, which are an order of magnitude faster than state-of-the-art transactional memory toolkits. Moreover, our approach treats stand-alone data structure operations (like put and enqueue) as first-class citizens, and allows them to execute with virtually no overhead, at the speed of the original data structure library.


Citations: 51

Configuration synthesis for programmable analog devices with Arco


Pub Date : 2016-06-01 DOI: 10.1145/2908080.2908116

Sara Achour, R. Sarpeshkar, M. Rinard

Programmable analog devices have emerged as a powerful computing substrate for performing complex neuromorphic and cytomorphic computations. We present Arco, a new solver that, given a dynamical system specification in the form of a set of differential equations, generates physically realizable configurations for programmable analog devices that are algebraically equivalent to the specified system. On a set of benchmarks from the biological domain, Arco generates configurations with 35 to 534 connections and 28 to 326 components in 1 to 54 minutes.


Citations: 18

Types from data: making structured data first-class citizens in F#


Pub Date : 2016-05-10 DOI: 10.1145/2908080.2908115

T. Petříček, Gustavo Guerra, Don Syme

Most modern applications interact with external services and access data in structured formats such as XML, JSON and CSV. Static type systems do not understand such formats, often making data access more cumbersome. Should we give up and leave the messy world of external data to dynamic typing and runtime checks? Of course not! We present F# Data, a library that integrates external structured data into F#. As most real-world data does not come with an explicit schema, we develop a shape inference algorithm that infers a shape from representative sample documents. We then integrate the inferred shape into the F# type system using type providers. We formalize the process and prove a relative type soundness theorem. Our library significantly reduces the amount of data access code and it provides additional safety guarantees when contrasted with the widely used weakly typed techniques.
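To make the notion of inferring a shape from sample documents concrete, here is a much-simplified sketch in Python. It is not the F# Data algorithm or its API: the shape encoding (strings for primitives, tagged tuples for lists, records and optionals), the `"bottom"` and `"any"` markers, and the functions `infer`, `merge` and `infer_all` are all assumptions made for illustration. The sketch infers one shape per sample and merges them, widening `int` to `float` and marking fields missing from some samples as optional.

```python
from functools import reduce

PRIMITIVES = {bool: "bool", int: "int", float: "float", str: "string"}


def tag(shape):
    """Constructor name of a shape: the string itself or the tuple's first element."""
    return shape[0] if isinstance(shape, tuple) else shape


def optional(shape):
    """Wrap a shape as optional unless it already admits a missing value."""
    return shape if tag(shape) in ("null", "optional") else ("optional", shape)


def merge(a, b):
    """Least common shape of two shapes in this simplified model."""
    if a == b:
        return a
    if a == "bottom":
        return b
    if b == "bottom":
        return a
    if a == "null" or b == "null":
        return optional(b if a == "null" else a)
    if tag(a) == "optional" or tag(b) == "optional":
        strip = lambda s: s[1] if tag(s) == "optional" else s
        return optional(merge(strip(a), strip(b)))
    if {tag(a), tag(b)} == {"int", "float"}:
        return "float"
    if tag(a) == tag(b) == "list":
        return ("list", merge(a[1], b[1]))
    if tag(a) == tag(b) == "record":
        fields = {}
        for key in sorted(a[1].keys() | b[1].keys()):
            if key in a[1] and key in b[1]:
                fields[key] = merge(a[1][key], b[1][key])
            else:  # field present in only one sample
                fields[key] = optional(a[1].get(key, b[1].get(key)))
        return ("record", fields)
    return "any"  # incompatible shapes fall back to a top shape


def infer(value):
    """Shape of a single parsed JSON value."""
    if value is None:
        return "null"
    if type(value) in PRIMITIVES:
        return PRIMITIVES[type(value)]
    if isinstance(value, list):
        return ("list", reduce(merge, map(infer, value), "bottom"))
    if isinstance(value, dict):
        return ("record", {k: infer(v) for k, v in value.items()})
    raise TypeError(f"unsupported value: {value!r}")


def infer_all(samples):
    """Merge the shapes of several representative samples."""
    return reduce(merge, map(infer, samples), "bottom")


if __name__ == "__main__":
    samples = [{"name": "Jan", "age": 25}, {"name": "Tomas"}, {"name": "Alexander", "age": 3.5}]
    print(infer_all(samples))
```

For the three samples in the usage block, the merged shape is a record whose `name` is a string and whose `age` is an optional float, since one sample omits `age` and another supplies a non-integer value.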


Citations: 26

Into the depths of C: elaborating the de facto standards


Pub Date : 2016-04-27 DOI: 10.1145/2908080.2908081

Kayvan Memarian, Justus Matthiesen, J. Lingard, Kyndylan Nienhuis, D. Chisnall, R. Watson, Peter Sewell

C remains central to our computing infrastructure. It is notionally defined by ISO standards, but in reality the properties of C assumed by systems code and those implemented by compilers have diverged, both from the ISO standards and from each other, and none of these are clearly understood. We make two contributions to help improve this error-prone situation. First, we describe an in-depth analysis of the design space for the semantics of pointers and memory in C as it is used in practice. We articulate many specific questions, build a suite of semantic test cases, gather experimental data from multiple implementations, and survey what C experts believe about the de facto standards. We identify questions where there is a consensus (either following ISO or differing) and where there are conflicts. We apply all this to an experimental C implemented above capability hardware. Second, we describe a formal model, Cerberus, for large parts of C. Cerberus is parameterised on its memory model; it is linkable either with a candidate de facto memory object model, under construction, or with an operational C11 concurrency model; it is defined by elaboration to a much simpler Core language for accessibility, and it is executable as a test oracle on small examples. This should provide a solid basis for discussion of what mainstream C is now: what programmers and analysis tools can assume and what compilers aim to implement. Ultimately we hope it will be a step towards clear, consistent, and accepted semantics for the various use-cases of C.


Citations: 92

On the complexity and performance of parsing with derivatives


Pub Date : 2016-04-16 DOI: 10.1145/2908080.2908128

Michael D. Adams, Celeste Hollenbeck, M. Might

Current algorithms for context-free parsing inflict a trade-off between ease of understanding, ease of implementation, theoretical complexity, and practical performance. No algorithm achieves all of these properties simultaneously. Might et al. introduced parsing with derivatives, which handles arbitrary context-free grammars while being both easy to understand and simple to implement. Despite much initial enthusiasm and a multitude of independent implementations, its worst-case complexity has never been proven to be better than exponential. In fact, high-level arguments claiming it is fundamentally exponential have been advanced and even accepted as part of the folklore. Performance ended up being sluggish in practice, and this sluggishness was taken as informal evidence of exponentiality. In this paper, we reexamine the performance of parsing with derivatives. We have discovered that it is not exponential but, in fact, cubic. Moreover, simple (though perhaps not obvious) modifications to the implementation by Might et al. lead to an implementation that is not only easy to understand but also highly performant in practice.
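The derivative idea is easiest to see in the regular-language case. The sketch below is a recognizer based on Brzozowski derivatives of regular expressions, which is a simpler setting than the paper's: parsing with derivatives for context-free grammars additionally needs laziness, memoization and fixed points to cope with recursive grammars, none of which appear here. The class and function names are chosen for illustration only.

```python
from dataclasses import dataclass


class Lang:
    """Base class for regular-language expressions."""


@dataclass(frozen=True)
class Empty(Lang):
    """Matches no string."""


@dataclass(frozen=True)
class Eps(Lang):
    """Matches only the empty string."""


@dataclass(frozen=True)
class Char(Lang):
    c: str


@dataclass(frozen=True)
class Alt(Lang):
    left: Lang
    right: Lang


@dataclass(frozen=True)
class Seq(Lang):
    first: Lang
    second: Lang


@dataclass(frozen=True)
class Star(Lang):
    inner: Lang


def nullable(lang):
    """True if the language contains the empty string."""
    if isinstance(lang, (Eps, Star)):
        return True
    if isinstance(lang, (Empty, Char)):
        return False
    if isinstance(lang, Alt):
        return nullable(lang.left) or nullable(lang.right)
    if isinstance(lang, Seq):
        return nullable(lang.first) and nullable(lang.second)
    raise TypeError(lang)


def derive(lang, c):
    """Language of the suffixes that remain after consuming character c."""
    if isinstance(lang, (Empty, Eps)):
        return Empty()
    if isinstance(lang, Char):
        return Eps() if lang.c == c else Empty()
    if isinstance(lang, Alt):
        return Alt(derive(lang.left, c), derive(lang.right, c))
    if isinstance(lang, Seq):
        head = Seq(derive(lang.first, c), lang.second)
        # If the first part can be empty, c may also be consumed by the second part.
        return Alt(head, derive(lang.second, c)) if nullable(lang.first) else head
    if isinstance(lang, Star):
        return Seq(derive(lang.inner, c), lang)
    raise TypeError(lang)


def matches(lang, text):
    """Derive by each character in turn, then test for the empty string."""
    for c in text:
        lang = derive(lang, c)
    return nullable(lang)


if __name__ == "__main__":
    ab_star = Star(Seq(Char("a"), Char("b")))  # (ab)*
    print(matches(ab_star, "abab"), matches(ab_star, "aba"))
```

Note that this sketch never simplifies the derived expressions, so they can grow with the input; controlling that kind of growth is part of what separates a naive implementation from a performant one.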


Citations: 24

Just-in-time static type checking for dynamic languages


Pub Date : 2016-04-13 DOI: 10.1145/2908080.2908127

Brianna M. Ren, J. Foster

Dynamic languages such as Ruby, Python, and JavaScript have many compelling benefits, but the lack of static types means subtle errors can remain latent in code for a long time. While many researchers have developed various systems to bring some of the benefits of static types to dynamic languages, prior approaches have trouble dealing with metaprogramming, which generates code as the program executes. In this paper, we propose Hummingbird, a new system that uses a novel technique, just-in-time static type checking, to type check Ruby code even in the presence of metaprogramming. In Hummingbird, method type signatures are gathered dynamically at run-time, as those methods are created. When a method is called, Hummingbird statically type checks the method body against current type signatures. Thus, Hummingbird provides thorough static checks on a per-method basis, while also allowing arbitrarily complex metaprogramming. For performance, Hummingbird memoizes the static type checking pass, invalidating cached checks only if necessary. We formalize Hummingbird using a core, Ruby-like language and prove it sound. To evaluate Hummingbird, we applied it to six apps, including three that use Ruby on Rails, a powerful framework that relies heavily on metaprogramming. We found that all apps typecheck successfully using Hummingbird, and that Hummingbird's performance overhead is reasonable. We applied Hummingbird to earlier versions of one Rails app and found several type errors that had been introduced and then fixed. Lastly, we demonstrate using Hummingbird in Rails development mode to typecheck an app as live updates are applied to it.


Citations: 21

Refinement types for TypeScript


Pub Date : 2016-04-08 DOI: 10.1145/2908080.2908110

Panagiotis Vekris, B. Cosman, Ranjit Jhala

We present Refined TypeScript (RSC), a lightweight refinement type system for TypeScript, that enables static verification of higher-order, imperative programs. We develop a formal system for RSC that delineates the interaction between refinement types and mutability, and enables flow-sensitive reasoning by translating input programs to an equivalent intermediate SSA form. By establishing type safety for the intermediate form, we prove safety for the input programs. Next, we extend the core to account for imperative and dynamic features of TypeScript, including overloading, type reflection, ad hoc type hierarchies and object initialization. Finally, we evaluate RSC on a set of real-world benchmarks, including parts of the Octane benchmarks, D3, Transducers, and the TypeScript compiler. We show how RSC successfully establishes a number of value dependent properties, such as the safety of array accesses and downcasts, while incurring a modest overhead in type annotations and code restructuring.


Citations: 47
