Context-sensitive data-dependence analysis via linear conjunctive language reachability

Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages. Pub Date: 2017-01-01. DOI: 10.1145/3009837.3009848

Qirun Zhang, Z. Su


Citations: 49

Abstract

Many program analysis problems can be formulated as graph reachability problems. In the literature, context-free language (CFL) reachability has been the most popular formulation and can be computed in subcubic time. Context-sensitive data-dependence analysis is a fundamental abstraction that can express a broad range of program analysis problems. It essentially describes an interleaved matched-parenthesis language reachability problem. The language is not context-free, and the problem is well known to be undecidable. In practice, many program analyses adopt CFL-reachability to exactly model the matched parentheses for either context-sensitivity or structure-transmitted data-dependence, but not both. Thus, the CFL-reachability formulation for context-sensitive data-dependence analysis is inherently an approximation. To support more precise and scalable analyses, this paper introduces linear conjunctive language (LCL) reachability, a new, expressive class of graph reachability. LCL not only contains the interleaved matched-parenthesis language, but is also closed under all set-theoretic operations. Given a graph with n nodes and m edges, we propose an O(mn) time approximation algorithm for solving all-pairs LCL-reachability, which is asymptotically better than known CFL-reachability algorithms. Our formulation and algorithm offer a new perspective on attacking the aforementioned undecidable problem: the LCL-reachability formulation is exact, while the LCL-reachability algorithm yields a sound approximation. We have applied the LCL-reachability framework to two existing client analyses. The experimental results show that the LCL-reachability framework is both more precise and more scalable than the traditional CFL-reachability framework. This paper opens up the opportunity to exploit LCL-reachability in program analysis.
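To make the "interleaved matched-parenthesis language" concrete, here is a minimal Python sketch of string membership only. It is not the paper's LCL-reachability algorithm and does not operate on graphs. It assumes a simplified alphabet with a single kind of parenthesis `(` `)` standing in for call/return matching and a single kind of bracket `[` `]` standing in for field matching; the function names are hypothetical. A word is in the interleaved language when each kind is balanced on its own, whereas a single context-free matched-bracket language additionally requires the two kinds to nest properly.

```python
def is_balanced(symbols, open_sym, close_sym):
    """Return True if the sequence is balanced for one open/close pair."""
    depth = 0
    for s in symbols:
        if s == open_sym:
            depth += 1
        elif s == close_sym:
            depth -= 1
            if depth < 0:  # a close appeared before its matching open
                return False
    return depth == 0


def in_interleaved_language(word):
    """Each bracket kind must be balanced independently (illustrative assumption:
    one kind of parenthesis and one kind of bracket)."""
    parens = [c for c in word if c in "()"]
    brackets = [c for c in word if c in "[]"]
    return is_balanced(parens, "(", ")") and is_balanced(brackets, "[", "]")


def in_nested_language(word):
    """Stack-based check: both kinds must also nest properly with each other."""
    closes = {")": "(", "]": "["}
    stack = []
    for c in word:
        if c in "([":
            stack.append(c)
        elif c in closes:
            if not stack or stack.pop() != closes[c]:
                return False
    return not stack


if __name__ == "__main__":
    for w in ["([])", "([)]", "(]"]:
        print(w, in_interleaved_language(w), in_nested_language(w))
```

The word `([)]` separates the two languages: each kind is balanced when viewed alone, so it belongs to the interleaved language, but the kinds cross rather than nest, so the stack-based check rejects it. Tracking both kinds at once needs two independent counters (or stacks, with multiple indexed kinds), which is the intuition behind the language not being context-free.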

