Michael Reif, Michael Eichberg, Ben Hermann, Johannes Lerch, M. Mezini
Call graph construction for Java libraries
Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2016-11-01
DOI: 10.1145/2950290.2950312 (https://doi.org/10.1145/2950290.2950312)
Citations: 38
Abstract
Today, every application uses software libraries. Yet, while a lot of research exists on analyzing applications, research that targets the analysis of libraries independent of any application is scarce. This is unfortunate because, for developers of libraries such as the Java Development Kit (JDK), it is crucial to ensure that the library behaves as intended regardless of how it is used. To fill this gap, we discuss the construction of call graphs for libraries that abstract over all potential library usages. Call graphs are particularly relevant as they are a precursor of many advanced analyses, such as inter-procedural data-flow analyses. We show that the current practice of using call graph algorithms designed for applications to analyze libraries leads to call graphs that, at the same time, lack relevant call edges and contain unnecessary edges. This motivates the need for call graph construction algorithms dedicated to libraries. Unlike algorithms for applications, call graph construction algorithms for libraries must take into consideration the goals of subsequent analyses. Specifically, we show that it is essential to distinguish the scenario of an analysis for potentially exploitable vulnerabilities from the scenario of an analysis for general software quality attributes, e.g., dead methods or unused fields. This distinction affects the decision about what constitutes the library-private implementation, which, therefore, needs special treatment. Thus, building one call graph that satisfies all needs does not make sense. Overall, we observed that the proposed call graph algorithms reduce the number of call edges by up to 30% when compared to existing approaches.
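The dependence of a library call graph on the goal of the subsequent analysis can be illustrated with a minimal sketch. It is not the algorithm from the paper: the toy library, the method names, the `entryPoints` and `reachable` helpers, and the boolean `packagesAreOpen` switch are all assumptions made for illustration. The sketch assumes that a security-oriented analysis treats package-private methods as callable by untrusted code, whereas a quality-oriented analysis treats them as library-private, and it assumes that call targets are already resolved.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class LibraryCallGraphSketch {

    // Hypothetical visibility levels of methods in a toy library.
    enum Visibility { PUBLIC, PACKAGE_PRIVATE, PRIVATE }

    // A method with its visibility and its (already resolved) call targets.
    static final class LibMethod {
        final Visibility visibility;
        final List<String> callees;

        LibMethod(Visibility visibility, List<String> callees) {
            this.visibility = visibility;
            this.callees = callees;
        }
    }

    // Toy library model; every name here is invented for illustration.
    static final Map<String, LibMethod> LIBRARY = Map.of(
        "Api.parse", new LibMethod(Visibility.PUBLIC, List.of("Impl.scan")),
        "Impl.scan", new LibMethod(Visibility.PRIVATE, List.of()),
        "Impl.reset", new LibMethod(Visibility.PACKAGE_PRIVATE, List.of("Impl.clear")),
        "Impl.clear", new LibMethod(Visibility.PRIVATE, List.of()),
        "Impl.unused", new LibMethod(Visibility.PRIVATE, List.of()));

    // Entry points are the methods that code outside the library may call.
    // Assumption: with open packages, package-private methods count as well.
    static Set<String> entryPoints(boolean packagesAreOpen) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, LibMethod> entry : LIBRARY.entrySet()) {
            Visibility v = entry.getValue().visibility;
            if (v == Visibility.PUBLIC
                    || (packagesAreOpen && v == Visibility.PACKAGE_PRIVATE)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    // Worklist traversal: all methods reachable from the given entry points.
    static Set<String> reachable(Set<String> entryPoints) {
        Set<String> seen = new TreeSet<>(entryPoints);
        Deque<String> worklist = new ArrayDeque<>(entryPoints);
        while (!worklist.isEmpty()) {
            String current = worklist.pop();
            for (String callee : LIBRARY.get(current).callees) {
                if (seen.add(callee)) {
                    worklist.push(callee);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // An application-style analysis starts from main methods; the toy
        // library has none, so nothing is reachable.
        System.out.println("application-style: " + reachable(Set.of()));
        System.out.println("quality analysis:  " + reachable(entryPoints(false)));
        System.out.println("security analysis: " + reachable(entryPoints(true)));
    }
}
```

In this toy model the quality-oriented graph reaches only `Api.parse` and `Impl.scan`, so `Impl.reset` and `Impl.clear` would be reported as dead, while the security-oriented graph also contains them because a package-private method is treated as an entry point. Only `Impl.unused` is unreachable in both scenarios.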