Generic programming libraries such as Scrap Your Boilerplate eliminate the need to write repetitive code, but typically introduce significant performance overheads. This leaves programmers with the unfortunate choice of writing succinct but slow programs or writing tedious but efficient programs. We show how to systematically transform an implementation of the Scrap Your Boilerplate library in the multi-stage programming language MetaOCaml to eliminate the overhead, making it possible to combine the benefits of high-level abstract programming with the efficiency of low-level code.
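As a rough illustration of the trade-off described above, the following Python sketch contrasts a generic traversal that inspects structure on every call with a staged version that generates a specialised traversal once per type. It is not the paper's MetaOCaml library: the `Point` and `Segment` types and the names `gsum`, `gen_sum`, and `stage_sum` are hypothetical, and string-based code generation stands in for MetaOCaml's typed code quotations.

```python
from dataclasses import dataclass, fields, is_dataclass

@dataclass
class Point:
    x: int
    y: int

@dataclass
class Segment:
    start: Point
    end: Point
    label: str

def gsum(value):
    """Generic query: sum every int field, inspecting structure on every call."""
    if isinstance(value, bool):
        return 0
    if isinstance(value, int):
        return value
    if is_dataclass(value):
        return sum(gsum(getattr(value, f.name)) for f in fields(value))
    return 0

def gen_sum(tp, expr):
    """Staged query: emit source text that sums the int fields of a `tp` value."""
    if tp is int:
        return expr
    if is_dataclass(tp):
        parts = [gen_sum(f.type, f"{expr}.{f.name}") for f in fields(tp)]
        parts = [p for p in parts if p != "0"]  # drop fields that contribute nothing
        return "(" + " + ".join(parts) + ")" if parts else "0"
    return "0"

def stage_sum(tp):
    """Generate and compile a traversal specialised to `tp`; also return its source."""
    source = f"def ssum(v):\n    return {gen_sum(tp, 'v')}\n"
    namespace = {}
    exec(source, namespace)
    return namespace["ssum"], source

ssum, source = stage_sum(Segment)
segment = Segment(Point(1, 2), Point(3, 4), "diagonal")
```

Both `gsum(segment)` and `ssum(segment)` compute the same sum, but the generated `ssum` contains only field accesses and additions, with no run-time type tests left in it.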
Staging generic programming. J. Yallop. Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, 2016-01-11. https://doi.org/10.1145/2847538.2847546
Powerful optimizations of object queries can lead to reduced asymptotic running times. However, such queries are often used in dynamic languages, and the generality the optimizations need in order to handle a dynamic language can lead to significant runtime overhead as well as significantly increased code size. This paper studies combinations of optimizations for reducing this runtime overhead and code size. We describe two new optimizations, counting elimination and result set elimination; their effectiveness when combined with inlining and when using specialized data structures; and additional optimizations enabled by type analysis and alias analysis. The two new optimizations are enabled by the high-level nature of queries, even though they are difficult and are not supported by general compiler optimizations. We have run a variety of benchmarks, including distributed algorithms and benchmarks from prior best systems, obtaining a speedup of up to 56% and code size reduction of up to 37%.
Removing runtime overhead for optimized object queries. Jon Brandvein, Yanhong A. Liu. Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, 2016-01-11. https://doi.org/10.1145/2847538.2847545
This paper relates the 2-level lambda-calculus and the staged lambda-calculus (restricted to 2 stages) to obtain a monovariant binding-time analysis for the lambda-calculus that produces its output in the form of staging annotations. The relationship between the two lambda-calculi gives precise and simple guidance on how to implement a binding-time analysis for the staged lambda-calculus. It forms a basis for introducing binding-time analysis to full-fledged staged languages such as MetaOCaml.
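To make the notion of a binding-time analysis concrete, here is a minimal Python sketch for a first-order expression language, which is much smaller than the lambda-calculus the paper treats. Each variable is given a single binding time, static `"S"` or dynamic `"D"`, which is what makes the analysis monovariant. The tuple encoding of expressions and the `add_dyn` and `lift` markers are assumptions made for illustration and are not the paper's annotation syntax.

```python
def lift(binding_time, annotated):
    """Wrap a static subterm so it can be used inside dynamic (generated) code."""
    return ("lift", annotated) if binding_time == "S" else annotated

def analyse(expr, env):
    """Return (binding_time, annotated_expr), where binding_time is 'S' or 'D'.

    `env` maps each variable name to its single binding time.
    """
    tag = expr[0]
    if tag == "lit":
        return "S", expr                      # literals are always static
    if tag == "var":
        return env[expr[1]], expr             # variables take their declared time
    if tag == "add":
        bt1, a1 = analyse(expr[1], env)
        bt2, a2 = analyse(expr[2], env)
        if bt1 == "S" and bt2 == "S":
            return "S", ("add", a1, a2)       # computable at generation time
        # At least one operand is dynamic: the addition is residualised,
        # and any static operand is lifted into the generated code.
        return "D", ("add_dyn", lift(bt1, a1), lift(bt2, a2))
    raise ValueError(f"unknown expression form: {tag!r}")

# n + (1 + 2), with n known only at run time.
program = ("add", ("var", "n"), ("add", ("lit", 1), ("lit", 2)))
binding_time, annotated = analyse(program, {"n": "D"})
```

In this sketch the inner `1 + 2` stays static and is lifted, while the outer addition becomes dynamic because it depends on `n`.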
Toward introducing binding-time analysis to MetaOCaml. K. Asai. Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, 2016-01-11. https://doi.org/10.1145/2847538.2847547
Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation. Martin Erwig, Tiark Rompf. 2016-01-11. https://doi.org/10.1145/2847538
Putback-based bidirectional programming allows the programmer to write only one putback transformation, from which the unique corresponding forward transformation is derived for free. The logic of a putback transformation is more sophisticated than that of a forward transformation and does not always give rise to well-behaved bidirectional programs; this calls for more robust language design to support development of well-behaved putback transformations. In this paper, we design and implement a concise core language BiGUL for putback-based bidirectional programming to serve as a foundation for higher-level putback-based languages. BiGUL is completely formally verified in the dependently typed programming language Agda to guarantee that any putback transformation written in BiGUL is well-behaved.
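The claim that a well-behaved putback determines its forward transformation can be illustrated with a small Python sketch. For a well-behaved `put`, `get(s)` is the unique view `v` satisfying `put(s, v) == s`, so over a finite set of candidate views it can be recovered by brute-force search. This is only an illustration of the uniqueness property, not how BiGUL derives `get`; the names `put_first`, `put_ignore`, and `derive_get` are hypothetical.

```python
def put_first(source, view):
    """Putback: replace the first component of a pair and keep the second."""
    _, second = source
    return (view, second)

def put_ignore(source, view):
    """An ill-behaved putback: it discards the view entirely."""
    return source

def derive_get(put, candidate_views):
    """Recover the forward direction from `put` alone by searching the views."""
    views = list(candidate_views)

    def get(source):
        matches = [v for v in views if put(source, v) == source]
        if len(matches) != 1:
            raise ValueError("put does not determine a unique view for this source")
        return matches[0]

    return get

get_first = derive_get(put_first, range(5))
source = (3, "payload")

# The two well-behavedness laws, checked on one example each.
get_put_holds = put_first(source, get_first(source)) == source
put_get_holds = get_first(put_first(source, 1)) == 1
```

Running `derive_get(put_ignore, range(5))((3, "payload"))` raises `ValueError`, because every candidate view satisfies the equation and no unique `get` exists.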
BiGUL: a formally verified core language for putback-based bidirectional programming. Hsiang-Shang Ko, Tao Zan, Zhenjiang Hu. Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, 2016-01-11. https://doi.org/10.1145/2847538.2847544
Language-integrated query is an embedding of database queries into a host language, so that queries are coded at a higher level than the all-too-common concatenation of strings of SQL fragments. The SQL eventually produced is guaranteed to be well-formed and well-typed, and hence free from the embarrassing (security) problems. Language-integrated query takes advantage of the host language's functional and modular abstractions to compose and reuse queries and to build query libraries. Furthermore, language-integrated query systems like T-LINQ generate efficient SQL by applying a number of program transformations to the embedded query. Alas, the set of transformation rules is not designed to be extensible. We demonstrate a new technique for integrating database queries into a typed functional programming language, so as to write well-typed, composable queries and execute them efficiently on any SQL back-end as well as on an in-memory noSQL store. A distinct feature of our framework is that both the query language and the transformation rules needed to generate efficient SQL are safely user-extensible, to account for the many variations in SQL back-ends as well as for domain-specific knowledge. The transformation rules are guaranteed to be type-preserving and hygienic by their very construction. They can be built from separately developed and reusable parts and arbitrarily composed into optimization pipelines. With this technique we have embedded into OCaml a relational query language that supports a very large subset of SQL, including grouping and aggregation. Its types cover the complete set of intricate SQL behaviors.
Finally, safely-extensible and efficient language-integrated query. Kenichi Suzuki, O. Kiselyov, Yukiyoshi Kameyama. Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, 2016-01-11. https://doi.org/10.1145/2847538.2847542
Staging is a program generation paradigm with a clean, well-investigated semantics which statically ensures that the generated code is always well-typed and well-scoped. Staging is often used for specializing programs to the known properties or parts of data to improve efficiency, but so far it has been limited to generating terms. This short paper describes our ongoing work on extending staging, with its strong safety guarantees, to the generation of non-terms, focusing on ML-style modules. The purpose is to map out the promises and challenges, and then to pose a question that solicits the community's expertise in evaluating how essential our extensions are for applying staging beyond the realm of terms. We demonstrate our extensions' use in specializing functor applications to eliminate their (currently large) overhead in OCaml. We explain the challenges that those extensions bring and identify a promising line of attack. Unexpectedly, however, it turns out that we can avoid module generation altogether by representing modules, possibly containing abstract types, as polymorphic records. With the help of first-class modules, module specialization reduces to ordinary term specialization, which can be done with conventional staging. The extent to which this hack generalizes is unclear. Thus we have a question for the community: is there a compelling use case for module generation? With these insights and questions, we offer a starting point for a long-term program in the next stage of staging research.
Staging beyond terms: prospects and challenges. Jun Inoue, O. Kiselyov, Yukiyoshi Kameyama. Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, 2016-01-11. https://doi.org/10.1145/2847538.2847548
Constructing parsers based on declarative specification of operator precedence is a very old research topic, and there are various existing approaches. However, these approaches are either tied to a particular parsing technique, or cannot deal with all corner cases found in programming languages. In this paper we present an implementation of declarative specification of operator precedence for general parsing that (1) is independent of the underlying parsing algorithm, (2) does not require any grammar transformation that increases the size of the grammar, (3) preserves the shape of parse trees of the original, natural grammar, and (4) can deal with intricate cases of operator precedence found in functional programming languages such as OCaml. Our new approach to operator precedence is formulated using data-dependent grammars, which extend context-free grammars with arbitrary computation, variable binding and constraints. We implemented our approach using Iguana, a data-dependent parsing framework, and evaluated it by parsing Java and OCaml source files. The results show that our approach is practical for parsing programming languages with complicated operator precedence rules.
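The idea of driving a parser from a declarative precedence table, with the current precedence level passed along as data, can be sketched in Python as follows. Note that this is ordinary precedence climbing over a token list, swapped in for illustration; it is not Iguana's data-dependent grammar encoding and does not cover the OCaml corner cases the paper addresses. The operator table `OPS` and the function names are hypothetical.

```python
# Declarative specification: operator -> (precedence level, associativity).
OPS = {"+": (1, "left"), "-": (1, "left"), "*": (2, "left"), "^": (3, "right")}

def parse_atom(tokens, pos):
    """Parse an operand or a parenthesised expression starting at `pos`."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        tree, pos = parse_expr(tokens, pos + 1, 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return tree, pos + 1
    if tok in OPS or tok == ")":
        raise SyntaxError(f"unexpected token {tok!r}")
    return tok, pos + 1

def parse_expr(tokens, pos, min_prec):
    """Parse an expression whose operators all have level >= min_prec."""
    left, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in OPS:
        op = tokens[pos]
        prec, assoc = OPS[op]
        if prec < min_prec:
            break  # the constraint on the level fails; leave op to the caller
        # Left-associative operators forbid the same level on the right.
        next_min = prec + 1 if assoc == "left" else prec
        right, pos = parse_expr(tokens, pos + 1, next_min)
        left = (op, left, right)
    return left, pos

def parse(tokens):
    """Parse a complete token list into a nested (op, left, right) tuple."""
    tree, pos = parse_expr(tokens, 0, 1)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

For example, `parse("1 - 2 - 3".split())` groups to the left while `parse("2 ^ 3 ^ 4".split())` groups to the right, purely as a consequence of the table entries.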
Operator precedence for data-dependent grammars. A. Afroozeh, Anastasia Izmaylova. Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, 2016-01-11. https://doi.org/10.1145/2847538.2847540
Parser combinators are a popular approach to parsing where context-free grammars are represented as executable code. However, conventional parser combinators do not support left recursion, and can have worst-case exponential runtime. These limitations hinder the expressivity and performance predictability of parser combinators when constructing parsers for programming languages. In this paper we present general parser combinators that support all context-free grammars and construct a parse forest in cubic time and space in the worst case, while behaving nearly linearly on grammars of real programming languages. Our general parser combinators are based on earlier work on memoized Continuation-Passing Style (CPS) recognizers. First, we extend this work to achieve recognition in cubic time. Second, we extend the resulting cubic CPS recognizers to parsers that construct a binarized Shared Packed Parse Forest (SPPF). Our general parser combinators bring the best of both worlds: the flexibility and extensibility of conventional parser combinators and the expressivity and performance guarantees of general parsing algorithms. We used the approach presented in this paper as the basis for Meerkat, a general parser combinator library for Scala.
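A minimal Python sketch of a memoized continuation-passing recognizer shows why this style copes with left recursion. It is written in the spirit of the memoized CPS recognizers the abstract builds on, but it is not Meerkat's implementation: it only recognizes, builds no parse forest, and makes no claim about the cubic bound. All names (`terminal`, `seq`, `alt`, `memo`, `recognize`) are hypothetical.

```python
def terminal(text):
    """Recognise a literal; call continuation k with the end position."""
    def rec(s, i, k):
        if s.startswith(text, i):
            k(i + len(text))
    return rec

def seq(*parts):
    """Recognise the parts one after another by chaining continuations."""
    def rec(s, i, k):
        def step(index, pos):
            if index == len(parts):
                k(pos)
            else:
                parts[index](s, pos, lambda nxt: step(index + 1, nxt))
        step(0, i)
    return rec

def alt(*choices):
    """Try every alternative; each may report any number of end positions."""
    def rec(s, i, k):
        for choice in choices:
            choice(s, i, k)
    return rec

def memo(thunk):
    """Memoise a recogniser; `thunk` delays construction so rules can recurse."""
    table = {}   # (input, start) -> (continuations, end positions found so far)
    body = []    # the recogniser built from thunk, constructed on first use

    def rec(s, i, k):
        entry = table.get((s, i))
        if entry is not None:
            conts, results = entry
            conts.append(k)          # hear about future results
            for j in list(results):  # and replay the ones already known
                k(j)
            return
        conts, results = table[(s, i)] = ([k], set())

        def on_result(j):
            if j not in results:     # report each end position only once
                results.add(j)
                for cont in list(conts):
                    cont(j)

        if not body:
            body.append(thunk())
        body[0](s, i, on_result)
    return rec

def recognize(rec, s):
    """True if `rec` can consume the whole of `s`."""
    ends = set()
    rec(s, 0, ends.add)
    return len(s) in ends

# Left-recursive grammar: E ::= E '+' 'a' | 'a'
E = memo(lambda: alt(seq(E, terminal("+"), terminal("a")), terminal("a")))
```

The left-recursive call `E` at the same input position does not loop: it finds the existing table entry, registers its continuation, and returns, and that continuation is resumed later whenever a new end position for `E` is discovered.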
Practical, general parser combinators. Anastasia Izmaylova, A. Afroozeh, T. Storm. Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, 2016-01-11. https://doi.org/10.1145/2847538.2847539
H. V. Antwerpen, P. Néron, A. Tolmach, E. Visser, Guido Wachsmuth
In previous work, we introduced scope graphs as a formalism for describing program binding structure and performing name resolution in an AST-independent way. In this paper, we show how to use scope graphs to build static semantic analyzers. We use constraints extracted from the AST to specify facts about binding, typing, and initialization. We treat name and type resolution as separate building blocks, but our approach can handle language constructs, such as record field access, for which binding and typing are mutually dependent. We also refine and extend our previous scope graph theory to address practical concerns including ambiguity checking and support for a wider range of scope relationships. We describe the details of constraint generation for a model language that illustrates many of the interesting static analysis issues associated with modules and records.
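The core of name resolution in a scope graph can be sketched in a few lines of Python. The version below is heavily simplified and is an assumption-laden illustration rather than the paper's formalism: it models only lexical parent edges (no imports, labelled edges, or type-dependent resolution), lets inner declarations shadow outer ones, and reports duplicate declarations in one scope as ambiguous. The `Scope` class and `resolve` function are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Scope:
    name: str
    parent: Optional["Scope"] = None
    # identifier -> list of declaration sites found in this scope
    decls: dict = field(default_factory=dict)

    def declare(self, ident, site):
        self.decls.setdefault(ident, []).append(site)

def resolve(ident, scope):
    """Resolve a reference by walking parent edges from the referencing scope."""
    while scope is not None:
        found = scope.decls.get(ident, [])
        if len(found) > 1:
            raise LookupError(f"ambiguous reference to {ident!r} in scope {scope.name}")
        if found:
            return found[0]       # the nearest declaration shadows outer ones
        scope = scope.parent
    raise LookupError(f"unresolved reference {ident!r}")

# A global scope declaring x and y, and a nested scope redeclaring x.
global_scope = Scope("global")
global_scope.declare("x", "x@line1")
global_scope.declare("y", "y@line2")
inner_scope = Scope("inner", parent=global_scope)
inner_scope.declare("x", "x@line5")
```

Here `resolve("x", inner_scope)` finds the inner declaration, while `resolve("y", inner_scope)` follows the parent edge to the global one; the resolution never needs to look at the AST, only at the graph.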
A constraint language for static semantic analysis based on scope graphs. H. V. Antwerpen, P. Néron, A. Tolmach, E. Visser, Guido Wachsmuth. Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, 2015-12-31. https://doi.org/10.1145/2847538.2847543