
Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation: Latest Publications

Improving compiler scalability: optimizing large programs at small price
Sanyam Mehta, P. Yew
Compiler scalability is a well-known problem: reasoning about the application of useful optimizations over large program scopes consumes too much time and memory during compilation. This problem is exacerbated in polyhedral compilers, which use powerful yet costly integer programming algorithms to compose loop optimizations. As a result, the benefits that a polyhedral compiler has to offer to programs such as real scientific applications containing sequences of loop nests remain impractical for common users. In this work, we address this scalability problem in polyhedral compilers. We identify three causes of unscalability, each of which stems from the large number of statements and dependences in the program scope. We propose a one-shot solution to the problem by reducing the effective number of statements and dependences as seen by the compiler. We achieve this by representing a sequence of statements in a program by a single super-statement. This set of super-statements exposes the minimum sufficient constraints to the Integer Linear Programming (ILP) solver for finding correct optimizations. We implement our approach in the PLuTo polyhedral compiler and find that it condenses the program statements and program dependences by factors of 4.7x and 6.4x, respectively, averaged over 9 hot regions (ranging from 48 to 121 statements) in 5 real applications. As a result, the improvements in compilation time and memory requirements are 268x and 20x, respectively, over the latest version of the PLuTo compiler. The final compile times are comparable to the Intel compiler's, while performance is 1.92x better on average due to the latter's conservative approach to loop optimization.
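The condensation idea is easy to picture in miniature. Below is a schematic sketch, not PLuTo's implementation: the statement names, the grouping into super-statements, and the dependence set are all invented for illustration.

```python
# Schematic sketch of statement condensation (invented example, not PLuTo):
# collapsing a sequence of statements into super-statements shrinks the
# statement and dependence counts that reach the ILP solver.
statements = ["S0", "S1", "S2", "S3", "S4", "S5"]
# Hypothetical grouping: statements S0-S2 and S3-S5 each form one sequence
# that can be represented by a single super-statement.
groups = {"S0": "A", "S1": "A", "S2": "A", "S3": "B", "S4": "B", "S5": "B"}
dependences = {("S0", "S1"), ("S1", "S2"), ("S2", "S3"),
               ("S3", "S4"), ("S4", "S5")}

super_statements = sorted(set(groups.values()))
# Dependences internal to a super-statement are handled within it; only the
# deduplicated cross-group dependences become ILP constraints.
super_dependences = {(groups[s], groups[d])
                     for (s, d) in dependences if groups[s] != groups[d]}

print(len(statements), "statements ->", len(super_statements), "super-statements")
print(len(dependences), "dependences ->", len(super_dependences), "dependence(s)")
```

Here six statements with five dependences shrink to two super-statements with one dependence, a toy-scale analogue of the 4.7x and 6.4x condensation factors the paper reports.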
DOI: 10.1145/2737924.2737954 (published 2015-06-03)
Citations: 16
Synthesizing parallel graph programs via automated planning
Dimitrios Prountzos, R. Manevich, K. Pingali
We describe a system that uses automated planning to synthesize correct and efficient parallel graph programs from high-level algorithmic specifications. Automated planning allows us to use constraints to declaratively encode program transformations such as scheduling, implementation selection, and insertion of synchronization. Each plan emitted by the planner satisfies all constraints simultaneously, and corresponds to a composition of these transformations. In this way, we obtain an integrated compilation approach for a very challenging problem domain. We have used this system to synthesize parallel programs for four graph problems: triangle counting, maximal independent set computation, preflow-push maxflow, and connected components. Experiments on a variety of inputs show that the synthesized implementations perform competitively with hand-written, highly-tuned code.
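The composition-by-constraints idea can be sketched with an off-the-shelf enumerator. The transformation dimensions and constraints below are invented stand-ins (the paper's planner and its actual transformation choices differ); the point is that every emitted plan satisfies all constraints simultaneously.

```python
# Toy sketch of plan enumeration under declarative constraints
# (dimensions and rules invented for illustration).
from itertools import product

schedules = ["topology-driven", "data-driven"]
worklists = ["FIFO", "LIFO", "priority"]
sync      = ["fine-grained-locks", "speculation"]

def satisfies(plan):
    sched, wl, sy = plan
    # Hypothetical constraints, encoded declaratively:
    # a data-driven schedule cannot use a plain FIFO worklist, and
    # speculation is only paired with topology-driven schedules here.
    if sched == "data-driven" and wl == "FIFO":
        return False
    if sy == "speculation" and sched != "topology-driven":
        return False
    return True

# Each surviving tuple is a "plan": a simultaneous solution to all
# constraints, i.e., a legal composition of the transformations.
for plan in filter(satisfies, product(schedules, worklists, sync)):
    print(plan)
```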
DOI: 10.1145/2737924.2737953 (published 2015-06-03)
Citations: 22
Exploring and enforcing security guarantees via program dependence graphs
Andrew Johnson, Lucas Waye, Scott Moore, Stephen Chong
We present PIDGIN, a program analysis and understanding tool that enables the specification and enforcement of precise application-specific information security guarantees. PIDGIN also allows developers to interactively explore the information flows in their applications to develop policies and investigate counter-examples. PIDGIN combines program dependence graphs (PDGs), which precisely capture the information flows in a whole application, with a custom PDG query language. Queries express properties about the paths in the PDG; because paths in the PDG correspond to information flows in the application, queries can be used to specify global security policies. PIDGIN is scalable. Generating a PDG for a 330k line Java application takes 90 seconds, and checking a policy on that PDG takes under 14 seconds. The query language is expressive, supporting a large class of precise, application-specific security guarantees. Policies are separate from the code and do not interfere with testing or development, and can be used for security regression testing. We describe the design and implementation of PIDGIN and report on using it: (1) to explore information security guarantees in legacy programs; (2) to develop and modify security policies concurrently with application development; and (3) to develop policies based on known vulnerabilities.
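Because policies are properties of PDG paths, a simple one reduces to graph reachability. The sketch below is not PIDGIN's query language; the PDG and node names are invented, and the policy ("no path from the password to public output") is deliberately violated so the check surfaces a counter-example flow of the kind a developer would then explore interactively.

```python
# Toy PDG as an adjacency list (invented example, not PIDGIN's PDG or queries).
pdg = {
    "password":     ["hash"],          # edges follow data/control dependences
    "hash":         ["authCheck"],
    "authCheck":    ["logMessage"],
    "username":     ["logMessage"],
    "logMessage":   ["publicOutput"],
    "publicOutput": [],
}

def reachable(graph, src, dst):
    """Depth-first reachability: does any PDG path lead from src to dst?"""
    seen, stack = set(), [src]
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return False

# Policy: information derived from the password must not reach public output.
print("policy holds:", not reachable(pdg, "password", "publicOutput"))  # False
```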
DOI: 10.1145/2737924.2737957 (published 2015-06-03)
Citations: 55
Many-core compiler fuzzing
Christopher Lidbury, Andrei Lascu, Nathan Chong, A. Donaldson
We address the compiler correctness problem for many-core systems through novel applications of fuzz testing to OpenCL compilers. Focusing on two methods from prior work, random differential testing and testing via equivalence modulo inputs (EMI), we present several strategies for random generation of deterministic, communicating OpenCL kernels, and an injection mechanism that allows EMI testing to be applied to kernels that otherwise exhibit little or no dynamically-dead code. We use these methods to conduct a large, controlled testing campaign with respect to 21 OpenCL (device, compiler) configurations, covering a range of CPU, GPU, accelerator, FPGA and emulator implementations. Our study provides independent validation of claims in prior work related to the effectiveness of random differential testing and EMI testing, proposes novel methods for lifting these techniques to the many-core setting and reveals a significant number of OpenCL compiler bugs in commercial implementations.
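The EMI principle itself is compiler-agnostic and fits in a few lines. The sketch below illustrates it in Python rather than the paper's OpenCL setting: two variants that differ only in dynamically-dead code must agree on every profiled input, so in the real campaign, divergence between the compiled variants indicates a compiler bug.

```python
# Toy illustration of equivalence modulo inputs (EMI); the actual work applies
# this to OpenCL kernels compiled by the compiler under test.
import random

def original(x):
    if x >= 0:
        return x * 2
    return x - 1               # dynamically dead for the profiled inputs below

def emi_variant(x):
    if x >= 0:
        return x * 2
    return (x * x) // 0        # dead code freely mutated by EMI: never runs

# Profile with non-negative inputs only, so the else-branch is never executed.
profiled_inputs = [random.randint(0, 100) for _ in range(1000)]
assert all(original(x) == emi_variant(x) for x in profiled_inputs)
print("variants agree on all profiled inputs, as EMI requires")
```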
DOI: 10.1145/2737924.2737986 (published 2015-06-03)
Citations: 24
Interactive parser synthesis by example
Alan Leung, J. Sarracino, Sorin Lerner
Despite decades of research on parsing, the construction of parsers remains a painstaking, manual process prone to subtle bugs and pitfalls. We present a programming-by-example framework called Parsify that is able to synthesize a parser from input/output examples. The user does not write a single line of code. To achieve this, Parsify provides: (a) an iterative algorithm for synthesizing and refining a grammar one example at a time, (b) an interface that provides immediate visual feedback in response to changes in the grammar being refined, and (c) a graphical mechanism for specifying example parse trees using only textual selections. We empirically demonstrate the viability of our approach by using Parsify to construct parsers for source code drawn from Verilog, SQL, Apache, and Tiger.
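A caricature of the one-example-at-a-time loop appears below. It is far simpler than Parsify: there, rules are synthesized from the user's textual selections with immediate visual feedback, whereas here each example simply carries a hypothetical suggested pattern standing in for the synthesis step.

```python
# Toy grammar refinement loop (invented stand-in for Parsify's synthesis).
import re

productions = []   # (nonterminal, regex) pairs accumulated so far

def parses(text):
    return any(re.fullmatch(rx, text) for _, rx in productions)

# Each example: a category, a sample string, and a pattern that a synthesizer
# would normally derive from the user's selection.
examples = [
    ("NUMBER", "42",    r"\d+"),
    ("IDENT",  "clk",   r"[a-z]\w*"),
    ("ASSIGN", "x = 7", r"[a-z]\w*\s*=\s*\d+"),
]

for nonterminal, text, suggested_pattern in examples:
    if not parses(text):                       # grammar too weak: refine it
        productions.append((nonterminal, suggested_pattern))
    assert parses(text)                        # example now accepted

print(productions)
```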
DOI: 10.1145/2737924.2738002 (published 2015-06-03)
Citations: 48
Diagnosing type errors with class
Danfeng Zhang, A. Myers, Dimitrios Vytiniotis, S. P. Jones
Type inference engines often give terrible error messages, and the more sophisticated the type system, the worse the problem. We show that even with the highly expressive type system implemented by the Glasgow Haskell Compiler (GHC)--including type classes, GADTs, and type families--it is possible to identify the most likely source of the type error, rather than the first source that the inference engine trips over. To determine which are the likely error sources, we apply a simple Bayesian model to a graph representation of the typing constraints; the satisfiability or unsatisfiability of paths within the graph provides evidence for or against possible explanations. While we build on prior work on error diagnosis for simpler type systems, inference in the richer type system of Haskell requires extending the graph with new nodes. The augmentation of the graph creates challenges both for Bayesian reasoning and for ensuring termination. Using a large corpus of Haskell programs, we show that this error localization technique is practical and significantly improves accuracy over the state of the art.
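The ranking intuition can be rendered as a tiny scoring function: a candidate source gains posterior weight for each unsatisfiable path it would explain and loses weight for each satisfiable path it lies on. The prior and likelihood factors below are invented for illustration, not the paper's model parameters.

```python
# Toy Bayesian-style ranking of candidate error sources on a constraint graph.
unsat_paths = [["e1", "e3", "e5"], ["e2", "e3", "e6"]]   # evidence against
sat_paths   = [["e1", "e4"], ["e2", "e4", "e7"]]         # evidence in favor

PRIOR = 0.1    # any single constraint is a priori unlikely to be wrong

def score(edge):
    on_unsat = sum(edge in p for p in unsat_paths)
    on_sat   = sum(edge in p for p in sat_paths)
    # Invented likelihood factors: unsatisfiable paths raise the posterior,
    # satisfiable paths lower it.
    return PRIOR * (3.0 ** on_unsat) * (0.5 ** on_sat)

edges = {e for path in unsat_paths + sat_paths for e in path}
for e in sorted(edges, key=lambda e: (-score(e), e)):
    print(e, round(score(e), 3))
# "e3" ranks first: it lies on both unsatisfiable paths and no satisfiable one.
```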
DOI: 10.1145/2737924.2738009 (published 2015-06-03)
Citations: 34
Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation
D. Grove, S. Blackburn
DOI: 10.1145/2737924 (published 2015-06-03)
Citations: 5
Celebrating diversity: a mixture of experts approach for runtime mapping in dynamic environments
M. Emani, M. O’Boyle
Matching program parallelism to platform parallelism using thread selection is difficult when the environment and available resources dynamically change. Existing compiler or runtime approaches are typically based on a one-size-fits-all policy. There is little ability to either evaluate or adapt the policy when encountering new external workloads or hardware resources. This paper focuses on selecting the best number of threads for a parallel application in dynamic environments. It develops a new scheme based on a mixture of experts approach. It learns online which, of a number of existing policies, or experts, is best suited for a particular environment without having to try out each policy. It does this by using a novel environment predictor as a proxy for the quality of an expert thread selection policy. Additional expert policies can easily be added and are selected only when appropriate. We evaluate our scheme in environments with varying external workloads and hardware resources. We then consider the case when workloads use affinity scheduling or are themselves adaptive and show that our approach, in all cases, outperforms existing schemes and surprisingly improves workload performance. On average, we improve 1.66x over the OpenMP default, 1.34x over an online scheme, 1.25x over an offline policy and 1.2x over a state-of-the-art analytic model. Determining the right number and type of experts is an open problem and our initial analysis shows that adding more experts improves accuracy and performance.
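A skeletal rendering of the selection mechanism follows; the three policies are invented examples, and the hard-coded predicate stands in for the learned environment predictor.

```python
# Toy mixture-of-experts thread selection (policies and predictor invented).
import os

def expert_all_cores(env):          # aggressive default-style policy
    return env["cores"]

def expert_share(env):              # leave room for the external workload
    return max(1, env["cores"] - env["external_threads"])

def expert_half(env):               # conservative offline-style policy
    return max(1, env["cores"] // 2)

EXPERTS = [expert_all_cores, expert_share, expert_half]

def predict_best_expert(env):
    # Stand-in for the learned predictor: it judges, per environment, which
    # expert's thread count will perform best without trying each one.
    return expert_share if env["external_threads"] > 0 else expert_all_cores

env = {"cores": os.cpu_count() or 4, "external_threads": 2}
chosen = predict_best_expert(env)
print(chosen.__name__, "->", chosen(env), "threads")
```

New policies slot in by appending to EXPERTS and teaching the predictor about them, mirroring the paper's point that additional experts are easy to add and are consulted only when appropriate.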
DOI: 10.1145/2737924.2737999 (published 2015-06-03)
Citations: 36
Synthesis of machine code from semantics
Venkatesh Srinivasan, T. Reps
In this paper, we present a technique to synthesize machine-code instructions from a semantic specification, given as a Quantifier-Free Bit-Vector (QFBV) logic formula. Our technique uses an instantiation of the Counter-Example Guided Inductive Synthesis (CEGIS) framework, in combination with search-space pruning heuristics to synthesize instruction-sequences. To counter the exponential cost inherent in enumerative synthesis, our technique uses a divide-and-conquer strategy to break the input QFBV formula into independent sub-formulas, and synthesize instructions for the sub-formulas. Synthesizers created by our technique could be used to create semantics-based binary rewriting tools such as optimizers, partial evaluators, program obfuscators/de-obfuscators, etc. Our experiments for Intel's IA-32 instruction set show that, in comparison to our baseline algorithm, our search-space pruning heuristics reduce the synthesis time by a factor of 473, and our divide-and-conquer strategy reduces the synthesis time by a further 3 to 5 orders of magnitude.
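The CEGIS loop at the core of the technique is easy to show in miniature. The sketch below synthesizes a straight-line program over an invented three-operation instruction set against an executable specification, with exhaustive testing standing in for the SMT validity check a real QFBV synthesizer would use.

```python
# Toy CEGIS loop (invented mini instruction set, nothing like real IA-32).
from itertools import product

OPS = {
    "neg":  lambda x: -x,
    "add1": lambda x: x + 1,
    "dbl":  lambda x: 2 * x,
}

def run(prog, x):
    for op in prog:
        x = OPS[op](x)
    return x

spec = lambda x: -(x + 1)     # executable stand-in for the QFBV formula

def synthesize(examples, max_len=3):
    """Return the shortest program agreeing with spec on all examples."""
    for n in range(1, max_len + 1):
        for prog in product(OPS, repeat=n):
            if all(run(prog, x) == spec(x) for x in examples):
                return prog
    return None

def verify(prog):
    """Stand-in for an SMT check: search a finite domain for a counterexample."""
    for x in range(-50, 50):
        if run(prog, x) != spec(x):
            return x
    return None

examples = [0]
while True:
    prog = synthesize(examples)
    assert prog is not None, "no program within the length bound"
    cex = verify(prog)
    if cex is None:
        break
    examples.append(cex)      # counterexample-guided: grow the example set
print("synthesized:", prog)   # ('add1', 'neg'), i.e. -(x + 1)
```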
DOI: 10.1145/2737924.2737960 (published 2015-06-03)
Citations: 29
Verdi: a framework for implementing and formally verifying distributed systems
James R. Wilcox, Doug Woos, P. Panchekha, Zachary Tatlock, Xi Wang, Michael D. Ernst, T. Anderson
Distributed systems are difficult to implement correctly because they must handle both concurrency and failures: machines may crash at arbitrary points and networks may reorder, drop, or duplicate packets. Further, their behavior is often too complex to permit exhaustive testing. Bugs in these systems have led to the loss of critical data and unacceptable service outages. We present Verdi, a framework for implementing and formally verifying distributed systems in Coq. Verdi formalizes various network semantics with different faults, and the developer chooses the most appropriate fault model when verifying their implementation. Furthermore, Verdi eases the verification burden by enabling the developer to first verify their system under an idealized fault model, then transfer the resulting correctness guarantees to a more realistic fault model without any additional proof burden. To demonstrate Verdi's utility, we present the first mechanically checked proof of linearizability of the Raft state machine replication algorithm, as well as verified implementations of a primary-backup replication system and a key-value store. These verified systems provide similar performance to unverified equivalents.
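Verdi itself is a Coq framework, so nothing executable here is Verdi; the Python toy below only mimics its "choose a network semantics" idea. The fault model may drop, duplicate, and reorder packets; the handler uses sequence numbers so duplication and reordering are harmless, while tolerating drops would additionally need retransmission, which is exactly the kind of obligation an idealized fault model lets you defer.

```python
# Toy network semantics with faults (a mimic of the idea, not Verdi itself).
import random

def faulty_network(packets, seed=1):
    """Deliver packets under a fault model that drops, duplicates, reorders."""
    rng = random.Random(seed)
    delivered = []
    for p in packets:
        if rng.random() < 0.1:      # drop
            continue
        delivered.append(p)
        if rng.random() < 0.1:      # duplicate
            delivered.append(p)
    rng.shuffle(delivered)          # reorder
    return delivered

def apply_all(packets):
    """Toy key-value node: replaying writes in sequence-number order makes
    duplication and reordering harmless; drops still need retransmission."""
    store = {}
    for seq, key, value in sorted(set(packets)):
        store[key] = value
    return store

sends = [(0, "x", "1"), (1, "x", "2"), (2, "y", "3")]
print(apply_all(faulty_network(sends)))   # {'x': '2', 'y': '3'} unless dropped
```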
DOI: 10.1145/2737924.2737958 (published 2015-06-03)
Citations: 301