
Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation: Latest Publications

Armada: low-effort verification of high-performance concurrent programs
Jacob R. Lorch, Yixuan Chen, Manos Kapritsos, Bryan Parno, S. Qadeer, Upamanyu Sharma, James R. Wilcox, Xueyuan Zhao
Safely writing high-performance concurrent programs is notoriously difficult. To aid developers, we introduce Armada, a language and tool designed to formally verify such programs with relatively little effort. Via a C-like language and a small-step, state-machine-based semantics, Armada gives developers the flexibility to choose arbitrary memory layout and synchronization primitives so they are never constrained in their pursuit of performance. To reduce developer effort, Armada leverages SMT-powered automation and a library of powerful reasoning techniques, including rely-guarantee, TSO elimination, reduction, and alias analysis. All these techniques are proven sound, and Armada can be soundly extended with additional strategies over time. Using Armada, we verify four concurrent case studies and show that we can achieve performance equivalent to that of unverified code.
DOI: 10.1145/3385412.3385971 (published 2020-06-06)
Citations: 31
Towards a verified range analysis for JavaScript JITs
Fraser Brown, John Renner, Andres Nötzli, Sorin Lerner, H. Shacham, D. Stefan
We present VeRA, a system for verifying the range analysis pass in browser just-in-time (JIT) compilers. Browser developers write range analysis routines in a subset of C++, and verification developers write infrastructure to verify custom analysis properties. Then, VeRA automatically verifies the range analysis routines, which browser developers can integrate directly into the JIT. We use VeRA to translate and verify Firefox range analysis routines, and it detects a new, confirmed bug that has existed in the browser for six years.
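The core of a range-analysis pass like the one VeRA verifies can be sketched as an interval abstract domain. The class below is an illustrative simplification of mine, not VeRA or Firefox code; the names and the int32 widening rule are assumptions:

```python
# Toy interval domain for a JIT range-analysis pass (illustrative sketch).
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

class Range:
    """Over-approximates the set of values a variable may hold."""
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def add(self, other):
        lo, hi = self.lo + other.lo, self.hi + other.hi
        # If the result may overflow int32, conservatively widen to top.
        if lo < INT32_MIN or hi > INT32_MAX:
            return Range(INT32_MIN, INT32_MAX)
        return Range(lo, hi)

    def contains(self, v):
        return self.lo <= v <= self.hi

r = Range(0, 100).add(Range(1, 10))
print(r.lo, r.hi)  # 1 110
```

Verifying such a pass amounts to proving that every concrete execution stays inside the computed interval, e.g. that `a.add(b).contains(x + y)` holds whenever `a.contains(x)` and `b.contains(y)`.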
DOI: 10.1145/3385412.3385968 (published 2020-06-06)
Citations: 10
Reconciling enumerative and deductive program synthesis
Kangjing Huang, Xiaokang Qiu, Peiyuan Shen, Yanjun Wang
Syntax-guided synthesis (SyGuS) aims to find a program satisfying semantic specification as well as user-provided structural hypotheses. There are two main synthesis approaches: enumerative synthesis, which repeatedly enumerates possible candidate programs and checks their correctness, and deductive synthesis, which leverages a symbolic procedure to construct implementations from specifications. Neither approach is strictly better than the other: automated deductive synthesis is usually very efficient but only works for special grammars or applications; enumerative synthesis is very generally applicable but limited in scalability. In this paper, we propose a cooperative synthesis technique for SyGuS problems with the conditional linear integer arithmetic (CLIA) background theory, as a novel integration of the two approaches, combining the best of the two worlds. The technique exploits several novel divide-and-conquer strategies to split a large synthesis problem to smaller subproblems. The subproblems are solved separately and their solutions are combined to form a final solution. The technique integrates two synthesis engines: a pure deductive component that can efficiently solve some problems, and a height-based enumeration algorithm that can handle arbitrary grammar. We implemented the cooperative synthesis technique, and evaluated it on a wide range of benchmarks. Experiments showed that our technique can solve many challenging synthesis problems not possible before, and tends to be more scalable than state-of-the-art synthesis algorithms.
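As a toy illustration of the enumerative half of the approach (my sketch, not the paper's cooperative engine), a height-based enumerator over a tiny CLIA-like grammar can search for a term matching an input/output specification:

```python
# Height-based enumerative synthesis over the grammar t := x | 0 | 1 | t + t.
# Illustrative simplification; the grammar and spec format are assumptions.

def terms(height):
    """All terms of height at most `height`, as nested tuples."""
    if height == 0:
        return [('x',), ('0',), ('1',)]
    smaller = terms(height - 1)
    return smaller + [('+', a, b) for a in smaller for b in smaller]

def evaluate(t, x):
    if t[0] == 'x':
        return x
    if t[0] in ('0', '1'):
        return int(t[0])
    return evaluate(t[1], x) + evaluate(t[2], x)

def synthesize(spec_pairs, max_height=2):
    """Return the first term consistent with all (input, output) pairs."""
    for h in range(max_height + 1):
        for t in terms(h):
            if all(evaluate(t, x) == y for x, y in spec_pairs):
                return t
    return None

# Specification: f(x) = 2x + 1 on three sample points.
print(synthesize([(0, 1), (3, 7), (5, 11)]))
```

A deductive engine would instead construct such a term symbolically from the specification; the paper's divide-and-conquer strategies decide which subproblems go to which engine.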
DOI: 10.1145/3385412.3386027 (published 2020-06-06)
Citations: 30
Static analysis of Java enterprise applications: frameworks and caches, the elephants in the room
A. Antoniadis, Nikos Filippakis, Paddy Krishnan, R. Ramesh, N. Allen, Y. Smaragdakis
Enterprise applications are a major success domain of Java, and Java is the default setting for much modern static analysis research. It would stand to reason that high-quality static analysis of Java enterprise applications would be commonplace, but this is far from true. Major analysis frameworks feature virtually no support for enterprise applications and offer analyses that are woefully incomplete and vastly imprecise, when at all scalable. In this work, we present two techniques for drastically enhancing the completeness and precision of static analysis for Java enterprise applications. The first technique identifies domain-specific concepts underlying all enterprise application frameworks, captures them in an extensible, declarative form, and achieves modeling of components and entry points in a largely framework-independent way. The second technique offers precision and scalability via a sound-modulo-analysis modeling of standard data structures. In realistic enterprise applications (an order of magnitude larger than prior benchmarks in the literature) our techniques achieve high degrees of completeness (on average more than 4x higher than conventional techniques) and speedups of about 6x compared to the most precise conventional analysis, with higher precision on multiple metrics. The result is JackEE, an enterprise analysis framework that can offer precise, high-completeness static modeling of realistic enterprise applications.
DOI: 10.1145/3385412.3386026 (published 2020-06-06)
Citations: 18
Zippy LL(1) parsing with derivatives
Romain Edelmann, Jad Hamza, Viktor Kunčak
In this paper, we present an efficient, functional, and formally verified parsing algorithm for LL(1) context-free expressions based on the concept of derivatives of formal languages. Parsing with derivatives is an elegant parsing technique, which, in the general case, suffers from cubic worst-case time complexity and slow performance in practice. We specialise the parsing with derivatives algorithm to LL(1) context-free expressions, where alternatives can be chosen given a single token of lookahead. We formalise the notion of LL(1) expressions and show how to efficiently check the LL(1) property. Next, we present a novel linear-time parsing with derivatives algorithm for LL(1) expressions operating on a zipper-inspired data structure. We prove the algorithm correct in Coq and present an implementation as a part of Scallion, a parser combinators framework in Scala with enumeration and pretty printing capabilities.
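The underlying derivative idea is easiest to see on plain regular expressions (Brzozowski derivatives); the recognizer below is a simplification of mine and omits the paper's LL(1) specialization and zipper data structure:

```python
# Brzozowski-derivative recognizer: derive(r, c) is the language of suffixes
# of words in r that start with c; a word matches iff the fully derived
# expression is nullable. Regexes are tagged tuples.

def nullable(r):
    tag = r[0]
    if tag == 'eps' or tag == 'star':
        return True
    if tag in ('empty', 'chr'):
        return False
    if tag == 'alt':
        return nullable(r[1]) or nullable(r[2])
    if tag == 'seq':
        return nullable(r[1]) and nullable(r[2])

def derive(r, c):
    tag = r[0]
    if tag in ('empty', 'eps'):
        return ('empty',)
    if tag == 'chr':
        return ('eps',) if r[1] == c else ('empty',)
    if tag == 'alt':
        return ('alt', derive(r[1], c), derive(r[2], c))
    if tag == 'seq':
        d = ('seq', derive(r[1], c), r[2])
        return ('alt', d, derive(r[2], c)) if nullable(r[1]) else d
    if tag == 'star':
        return ('seq', derive(r[1], c), r)

def matches(r, s):
    for c in s:
        r = derive(r, c)
    return nullable(r)

ab_star = ('star', ('seq', ('chr', 'a'), ('chr', 'b')))  # (ab)*
print(matches(ab_star, 'abab'))  # True
print(matches(ab_star, 'aba'))   # False
```

Without simplification the derived expressions grow on every step; restricting to LL(1) expressions, where one lookahead token picks the alternative, is what lets the paper keep this linear-time.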
DOI: 10.1145/3385412.3385992 (published 2020-06-06)
Citations: 6
Faster general parsing through context-free memoization
G. Herman
We present a novel parsing algorithm for all context-free languages. The algorithm features a clean mathematical formulation: parsing is expressed as a series of standard operations on regular languages and relations. Parsing complexity w.r.t. input length matches the state of the art: it is worst-case cubic, quadratic for unambiguous grammars, and linear for LR-regular grammars. What distinguishes our approach is that parsing can be implemented using only immutable, acyclic data structures. We also propose a parsing optimization technique called context-free memoization. It allows handling an overwhelming majority of input symbols using a simple stack and a lookup table, similarly to the operation of a deterministic LR(1) parser. This allows our proof-of-concept implementation to outperform the best current implementations of common generalized parsing algorithms (Earley, GLR, and GLL). Tested on a large Java source corpus, parsing is 3–5 times faster, while recognition—35 times faster.
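For contrast, the classic use of memoization in parsing keys a table by (nonterminal, position), as in packrat parsing; this sketch of mine illustrates that simpler relative of the paper's context-free memoization (the grammar is a hypothetical example, not from the paper):

```python
# Packrat-style memoized recursive descent for: expr := term ('+' term)*
# and term := digit | '(' expr ')'. Each (rule, position) pair is parsed at
# most once thanks to lru_cache, giving linear time on this grammar.
import functools

def make_parser(s):
    @functools.lru_cache(maxsize=None)
    def expr(i):
        j = term(i)
        while j is not None and j < len(s) and s[j] == '+':
            nxt = term(j + 1)
            if nxt is None:
                return None
            j = nxt
        return j

    @functools.lru_cache(maxsize=None)
    def term(i):
        if i < len(s) and s[i].isdigit():
            return i + 1
        if i < len(s) and s[i] == '(':
            j = expr(i + 1)
            if j is not None and j < len(s) and s[j] == ')':
                return j + 1
        return None

    return lambda: expr(0) == len(s)

print(make_parser('(1+2)+3')())  # True
print(make_parser('(1+2')())     # False
```

The paper's contribution goes further: its lookup table is consulted per input symbol, stack-machine style, rather than per (rule, position) query.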
DOI: 10.1145/3385412.3386032 (published 2020-06-06)
Citations: 5
Compiler-directed soft error resilience for lightweight GPU register file protection
Hongjune Kim, Jianping Zeng, Qingrui Liu, Mohammad Abdel-Majeed, Jaejin Lee, Changhee Jung
This paper presents Penny, a compiler-directed resilience scheme for protecting GPU register files (RF) against soft errors. Penny replaces the conventional error correction code (ECC) based RF protection by using less expensive error detection code (EDC) along with idempotence based recovery. Compared to the ECC protection, Penny can achieve either the same level of RF resilience yet with significantly lower hardware costs or stronger resilience using the same ECC due to its ability to detect multi-bit errors when it is used solely for detection. In particular, to address the lack of store buffers in GPUs, which causes both checkpoint storage overwriting and the high cost of checkpointing stores, Penny provides several compiler optimizations such as storage coloring and checkpoint pruning. Across 25 benchmarks, Penny causes only ≈3% run-time overhead on average.
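The EDC-versus-ECC trade-off can be illustrated with a single parity bit: detecting any odd number of bit flips is cheap, but the code carries no information for repair, so a recovery mechanism (idempotent re-execution, in Penny's case) must supply correction. The encoding below is my sketch, not Penny's hardware scheme:

```python
# Parity as a minimal error *detection* code (EDC): one extra bit per word,
# detection only -- repair is someone else's job.

def parity(word: int) -> int:
    p = 0
    while word:
        p ^= word & 1
        word >>= 1
    return p

def protect(word):
    """Store a value together with its parity bit."""
    return (word, parity(word))

def is_corrupted(stored) -> bool:
    """Detect (but not locate or fix) an odd number of flipped bits."""
    word, p = stored
    return parity(word) != p

reg = protect(0b1011)
flipped = (reg[0] ^ 0b0100, reg[1])  # a soft error flips one bit
print(is_corrupted(reg), is_corrupted(flipped))  # False True
```

A correcting code such as SEC-DED needs several check bits and a decoder per word; detection plus replay, as Penny argues, can reach the same resilience for less hardware.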
DOI: 10.1145/3385412.3386033 (published 2020-06-06)
Citations: 14
Scalable validation of binary lifters
Sandeep Dasgupta, S. Dinesh, Deepan Venkatesh, Vikram S. Adve, Christopher W. Fletcher
Validating the correctness of binary lifters is pivotal to gain trust in binary analysis, especially when used in scenarios where correctness is important. Existing approaches focus on validating the correctness of lifting instructions or basic blocks in isolation and do not scale to full programs. In this work, we show that formal translation validation of single instructions for a complex ISA like x86-64 is not only practical, but can be used as a building block for scalable full-program validation. Our work is the first to do translation validation of single instructions on an architecture as extensive as x86-64, uses the most precise formal semantics available, and has the widest coverage in terms of the number of instructions tested for correctness. Next, we develop a novel technique that uses validated instructions to enable program-level validation, without resorting to performance-heavy semantic equivalence checking. Specifically, we compose the validated IR sequences using a tool we develop called Compositional Lifter to create a reference standard. The semantic equivalence check between the reference and the lifter output is then reduced to a graph-isomorphism check through the use of semantic preserving transformations. The translation validation of instructions in isolation revealed 29 new bugs in McSema – a mature open-source lifter from x86-64 to LLVM IR. Towards the validation of full programs, our approach was able to prove the translational correctness of 2254/2348 functions taken from LLVM’s single-source benchmark test-suite.
DOI: 10.1145/3385412.3385964 (published 2020-06-06)
Citations: 15
Blended, precise semantic program embeddings
Ke Wang, Z. Su
Learning neural program embeddings is key to utilizing deep neural networks in program languages research --- precise and efficient program representations enable the application of deep models to a wide range of program analysis tasks. Existing approaches predominately learn to embed programs from their source code, and, as a result, they do not capture deep, precise program semantics. On the other hand, models learned from runtime information critically depend on the quality of program executions, thus leading to trained models with highly variant quality. This paper tackles these inherent weaknesses of prior approaches by introducing a new deep neural network, Liger, which learns program representations from a mixture of symbolic and concrete execution traces. We have evaluated Liger on two tasks: method name prediction and semantics classification. Results show that Liger is significantly more accurate than the state-of-the-art static model code2seq in predicting method names, and requires on average around 10x fewer executions covering nearly 4x fewer paths than the state-of-the-art dynamic model DYPRO in both tasks. Liger offers a new, interesting design point in the space of neural program embeddings and opens up this new direction for exploration.
DOI: 10.1145/3385412.3385999 (published 2020-06-06)
Citations: 44
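The Liger abstract above describes learning program representations from a blend of symbolic and concrete execution traces. As a rough, self-contained illustration of what "blended" means, the sketch below tags each trace event as concrete (a value observed at runtime) or symbolic (an expression from symbolic execution), maps every event into a shared vector space with a hashing trick, and mean-pools the trace into one embedding. All names and the hashing scheme are hypothetical simplifications; Liger itself uses learned neural encoders, not hashes.

```python
# Toy sketch of a "blended" trace embedding. Each trace event is either a
# concrete runtime value or a symbolic expression; both are mapped into the
# same vector space and mean-pooled into one program embedding.
import hashlib

DIM = 8  # embedding dimensionality for the sketch

def embed_event(event: str) -> list[float]:
    """Deterministically map one trace event to a DIM-dimensional vector."""
    digest = hashlib.sha256(event.encode()).digest()
    # Scale each byte from [0, 255] into [-1.0, 1.0].
    return [digest[i] / 127.5 - 1.0 for i in range(DIM)]

def embed_trace(trace: list[tuple[str, str]]) -> list[float]:
    """Mean-pool a blended trace of (kind, event) pairs into one vector."""
    vecs = [embed_event(f"{kind}:{event}") for kind, event in trace]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

# A blended trace for abs(x) on input x = -3: concrete values where the
# runtime observed them, a symbolic branch condition where it did not.
trace = [
    ("concrete", "x = -3"),
    ("symbolic", "x < 0"),
    ("concrete", "return 3"),
]
vec = embed_trace(trace)
print(len(vec))  # prints 8: one DIM-dimensional program embedding
```

The design point the paper argues for is visible even in this toy: symbolic events keep the representation precise where executions are scarce, while concrete events ground it in observed behavior.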
Detecting network load violations for distributed control planes
Kausik Subramanian, Anubhavnidhi Abhashkumar, Loris D'antoni, Aditya Akella
One of the major challenges faced by network operators is whether their network can meet input traffic demand, avoid overload, and satisfy service-level agreements. Automatically verifying that no network link is overloaded is complicated: it requires modeling frequent network failures, complex routing and load-balancing technologies, and evolving traffic requirements. We present QARC, a distributed control plane abstraction that can automatically verify whether a control plane may cause link-load violations under failures. QARC is fully automatic and can help operators program networks that are more resilient to failures and upgrade the network to avoid violations. We apply QARC to real datacenter and ISP networks and find interesting cases of load violations. QARC can detect violations in under an hour.
{"title":"Detecting network load violations for distributed control planes","authors":"Kausik Subramanian, Anubhavnidhi Abhashkumar, Loris D'antoni, Aditya Akella","doi":"10.1145/3385412.3385976","DOIUrl":"https://doi.org/10.1145/3385412.3385976","url":null,"abstract":"One of the major challenges faced by network operators pertains to whether their network can meet input traffic demand, avoid overload, and satisfy service-level agreements. Automatically verifying if no network links are overloaded is complicated---requires modeling frequent network failures, complex routing and load-balancing technologies, and evolving traffic requirements. We present QARC, a distributed control plane abstraction that can automatically verify whether a control plane may cause link-load violations under failures. QARC is fully automatic and can help operators program networks that are more resilient to failures and upgrade the network to avoid violations. We apply QARC to real datacenter and ISP networks and find interesting cases of load violations. QARC can detect violations in under an hour.","PeriodicalId":20580,"journal":{"name":"Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2020-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77835577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
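The question QARC answers — can some failure drive a link's load past its capacity? — can be illustrated with a deliberately tiny model. The sketch below enumerates single-link failures on a three-node topology with one demand and checks whether the rerouted traffic overloads any surviving link. The topology, capacities, and rerouting rule are invented for illustration; QARC itself models the actual distributed control plane and uses automated verification rather than enumeration.

```python
# Toy link-load check: under each single-link failure, does any surviving
# link carry more traffic than its capacity? Everything here is illustrative.
CAPACITY = {("A", "B"): 10, ("A", "C"): 10, ("C", "B"): 5}

# Primary and backup paths for one demand of 8 units from A to B.
PRIMARY = [("A", "B")]
BACKUP = [("A", "C"), ("C", "B")]
DEMAND = 8

def load_under_failure(failed_link):
    """Return per-link load after rerouting the demand around failed_link."""
    path = BACKUP if failed_link in PRIMARY else PRIMARY
    load = {}
    for link in path:
        load[link] = load.get(link, 0) + DEMAND
    return load

def violations():
    """Enumerate single-link failures that overload some surviving link."""
    found = []
    for failed in CAPACITY:
        for link, used in load_under_failure(failed).items():
            if link != failed and used > CAPACITY[link]:
                found.append((failed, link, used))
    return found

# Failing (A, B) reroutes 8 units onto (C, B), whose capacity is only 5.
print(violations())
```

Even this toy shows why the problem is hard at scale: the violation only appears under a specific failure, so a checker must reason about the rerouted load for every failure scenario, not just the steady state.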
Journal
Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation