
Latest publications from the 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)

Automatically Quantifying the Impact of a Change in Systems (Journal-First Abstract)
Nada Almasri, L. Tahat, B. Korel
Software maintenance is becoming more challenging with the increasing complexity of software and the frequency of changes. Performing impact analysis before actually implementing a change is a crucial task during system maintenance. While many tools and techniques are available to measure the impact of a change at the code level, little work has been done to measure it at earlier stages of the development process. This work introduces an approach to measure the impact of a change at the model level.
DOI: 10.1145/3238147.3241984 · pp. 952-952
Citations: 1
Effective API Recommendation without Historical Software Repositories
Xiaoyu Liu, LiGuo Huang, Vincent Ng
It is time-consuming and labor-intensive to learn and locate the correct API for programming tasks. Thus, it is beneficial to perform API recommendation automatically. The graph-based statistical model has been shown to recommend top-10 API candidates effectively. It falls short, however, in accurately recommending an actual top-1 API. To address this weakness, we propose RecRank, an approach and tool that applies a novel ranking-based discriminative approach leveraging API usage path features to improve top-1 API recommendation. Empirical evaluation on a large corpus of (1385+8) open source projects shows that RecRank significantly improves top-1 API recommendation accuracy and mean reciprocal rank when compared to state-of-the-art API recommendation approaches.
DOI: 10.1145/3238147.3238216 · pp. 282-292
Citations: 37
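The re-ranking step described in the RecRank abstract above can be sketched as follows. This is a hypothetical illustration, not RecRank's actual implementation: it assumes a learned linear scorer over API usage-path features that re-orders the top-10 candidates produced by a graph-based model; all names and feature values are made up.

```python
# Hypothetical sketch of ranking-based API re-ranking: a graph-based model
# proposes candidate APIs; a linear scorer over usage-path features re-orders
# them so the best candidate lands at rank 1.

def rerank(candidates, weights):
    """candidates: list of (api_name, feature_vector). Returns names sorted
    by the discriminative score, highest first."""
    def score(features):
        return sum(w * f for w, f in zip(weights, features))
    return [name for name, feats in
            sorted(candidates, key=lambda c: score(c[1]), reverse=True)]

# Toy example: two illustrative path features per candidate.
candidates = [
    ("List.add",    (0.2, 0.9)),
    ("List.append", (0.8, 0.7)),
    ("Map.put",     (0.1, 0.3)),
]
weights = (1.0, 0.5)
print(rerank(candidates, weights))  # highest-scoring candidate first
```

The point of the discriminative step is that the final ordering is driven by the learned weights rather than by the raw graph-model ranking.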
Descartes: A PITest Engine to Detect Pseudo-Tested Methods: Tool Demonstration
O. Vera-Pérez, M. Monperrus, B. Baudry
Descartes is a tool that implements extreme mutation operators and aims at finding pseudo-tested methods in Java projects. It leverages the efficient transformation and runtime features of PITest. The demonstration compares Descartes with Gregor, the default mutation engine provided by PITest, in a set of real open source projects. It considers the execution time, number of mutants created and the relationship between the mutation scores produced by both engines. It provides some insights on the main features exposed by Descartes.
DOI: 10.1145/3238147.3240474 · pp. 908-911
Citations: 17
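The idea behind the extreme mutation operators mentioned above can be illustrated in a few lines. This is a language-agnostic sketch of the concept, not Descartes' Java/PITest implementation: an extreme mutant replaces a whole method body with a constant return, and a method is pseudo-tested if every such mutant survives the test suite.

```python
# Illustrative sketch of extreme mutation and pseudo-tested-method detection
# (concept only; Descartes operates on Java bytecode via PITest).

def make_extreme_mutant(constant):
    """Extreme mutation: replace the whole body with `return constant`."""
    return lambda *args, **kwargs: constant

def is_pseudo_tested(original, mutants, test_suite):
    """Pseudo-tested: the suite passes on the original AND on every mutant,
    i.e., no extreme mutant is ever killed."""
    assert all(t(original) for t in test_suite), "suite must pass on original"
    return all(all(t(m) for t in test_suite) for m in mutants)

# Toy subject and two suites: one never checks the return value, one does.
def absolute(x):
    return x if x >= 0 else -x

weak_suite = [lambda f: (f(-3), True)[1]]   # calls f but asserts nothing
strong_suite = [lambda f: f(-3) == 3]

mutants = [make_extreme_mutant(0), make_extreme_mutant(1)]
print(is_pseudo_tested(absolute, mutants, weak_suite))    # True
print(is_pseudo_tested(absolute, mutants, strong_suite))  # False
```

A surviving-only-mutants outcome like the `weak_suite` case is exactly the signal Descartes reports: the method is covered but effectively untested.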
Delta Debugging Microservice Systems
Xiaoping Zhou, Xin Peng, Tao Xie, Jun Sun, Wenhai Li, Chao Ji, Dan Ding
Debugging microservice systems involves deploying and manipulating them in a containerized environment and faces unique challenges due to the high complexity and dynamism of microservices. To address these challenges, in this paper we propose a debugging approach for microservice systems based on the delta debugging algorithm, which minimizes failure-inducing deltas of circumstances (e.g., deployment, environmental configurations) for effective debugging. Our approach includes novel techniques for defining, deploying/manipulating, and executing deltas, following the idea of delta debugging. In particular, to construct a (failing) circumstance space for delta debugging to minimize, our approach defines a set of dimensions that can affect the execution of microservice systems. Our experimental study on a medium-size microservice benchmark system shows that our approach can effectively identify failure-inducing deltas that help diagnose root causes.
DOI: 10.1145/3238147.3240730 · pp. 802-807
Citations: 32
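The classic ddmin algorithm that the paper above adapts to microservice circumstances can be sketched generically. This is a simplified, complement-only version of Zeller's delta debugging, minimizing any failure-inducing set of deltas; the toy failure predicate is made up.

```python
# Minimal sketch of the ddmin delta debugging algorithm: repeatedly split the
# delta set and keep any complement that still reproduces the failure, until
# no single delta can be removed.

def ddmin(deltas, fails):
    """Return a 1-minimal subset of `deltas` for which fails(subset) is True."""
    assert fails(deltas), "the full delta set must reproduce the failure"
    n = 2
    while len(deltas) >= 2:
        chunk = max(1, len(deltas) // n)
        subsets = [deltas[i:i + chunk] for i in range(0, len(deltas), chunk)]
        reduced = False
        for subset in subsets:
            complement = [d for d in deltas if d not in subset]
            if fails(complement):               # failure persists without subset
                deltas, n = complement, max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(deltas):                # already at single-delta granularity
                break
            n = min(n * 2, len(deltas))         # refine the split
    return deltas

# Toy circumstance space: the system fails iff deltas 3 and 7 are both applied
# (think: a specific deployment option combined with one environment setting).
fails = lambda ds: 3 in ds and 7 in ds
print(ddmin(list(range(10)), fails))  # -> [3, 7]
```

In the paper's setting, each delta would be a deployment or configuration change and `fails` would rerun the containerized system.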
SRCIROR: A Toolset for Mutation Testing of C Source Code and LLVM Intermediate Representation
Farah Hariri, A. Shi
We present SRCIROR (pronounced “sorcerer”), a toolset for performing mutation testing at the levels of C/C++ source code (SRC) and the LLVM compiler intermediate representation (IR). At the SRC level, SRCIROR identifies program constructs for mutation by pattern-matching on the Clang AST. At the IR level, SRCIROR directly mutates the LLVM IR instructions through LLVM passes. Our implementation enables SRCIROR to (1) handle any program that Clang can handle, extending to large programs with a minimal overhead, and (2) have a small percentage of invalid mutants that do not compile. SRCIROR enables performing mutation testing using the same classes of mutation operators at both the SRC and IR levels, and it is easily extensible to support more operators. In addition, SRCIROR can collect coverage to generate mutants only for covered code elements. Our tool is publicly available on GitHub (https://github.com/TestingResearchIllinois/srciror). We evaluate SRCIROR on Coreutils subjects. Our evaluation shows interesting differences between SRC and IR, demonstrating the value of SR-CIROR in enabling mutation testing research across different levels of code representation.
DOI: 10.1145/3238147.3240482 · pp. 860-863
Citations: 17
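SRCIROR's SRC-level approach pattern-matches on the Clang AST; as an illustrative analogy (not SRCIROR itself, which targets C/C++), the same idea in Python's standard `ast` module: match AST nodes and apply a mutation operator, here arithmetic operator replacement (`+` to `-`).

```python
# AST pattern-matching mutation, analogous to SRC-level mutation on the
# Clang AST: find BinOp nodes with Add and replace the operator with Sub.
import ast

class AddToSub(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)          # mutate nested expressions too
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

def mutate(source):
    """Return the source of the mutant program."""
    tree = AddToSub().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)              # requires Python 3.9+

print(mutate("def f(a, b):\n    return a + b"))
# the mutant computes a - b instead of a + b
```

Working on the AST (rather than raw text) is what keeps the fraction of non-compiling mutants small, which is one of the properties the abstract highlights.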
Mining File Histories: Should We Consider Branches?
V. Kovalenko, Fabio Palomba, Alberto Bacchelli
Modern distributed version control systems, such as Git, support branching: the possibility to develop parts of the software outside the master trunk. Accounting for repository structure in Mining Software Repositories (MSR) studies requires a thorough approach to mining, yet there is no well-documented, widespread methodology for handling merge commits and branches. Moreover, little is known about the extent to which considering branches during MSR studies affects their results. In this study, we set out to evaluate the importance of properly handling branches when computing file modification histories. We analyze over 1,400 Git repositories from four open source ecosystems and compute modification histories for over two million files, using two different algorithms: one follows only the first parent of each commit when traversing the repository; the other returns the full modification history of a file across all branches. We show that the two algorithms consistently deliver different results, but the scale of the difference varies across projects and ecosystems. Further, we evaluate the importance of accurate mining of file histories by comparing, for the two retrieval algorithms, the performance of common techniques that rely on file modification history: reviewer recommendation, change recommendation, and defect prediction. We find that considering full file histories yields only a modest increase in the techniques' performance.
DOI: 10.1145/3238147.3238169 · pp. 202-213
Citations: 31
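The two history-retrieval algorithms compared in the abstract above can be sketched over a toy commit DAG. This is an illustrative model (not the paper's implementation): `parents` maps each commit to its parent list, with the first entry being the first parent as in git, and `touched` maps each commit to the files it modified.

```python
# First-parent traversal vs. full-DAG traversal of a file's modification
# history, over an in-memory commit graph.

def first_parent_history(head, parents, touched, path):
    """Follow only the first parent of each commit (ignores side branches)."""
    history, commit = [], head
    while commit is not None:
        if path in touched[commit]:
            history.append(commit)
        commit = parents[commit][0] if parents[commit] else None
    return history

def full_history(head, parents, touched, path):
    """Walk all parents of every commit (includes branch-side modifications)."""
    history, seen, stack = [], set(), [head]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        if path in touched[commit]:
            history.append(commit)
        stack.extend(parents[commit])
    return sorted(history)

# Mainline: A -> C -> M (head); side branch: A -> B, merged at M (2nd parent).
parents = {"A": [], "B": ["A"], "C": ["A"], "M": ["C", "B"]}
touched = {"A": {"f.py"}, "B": {"f.py"}, "C": set(), "M": set()}
print(first_parent_history("M", parents, touched, "f.py"))  # ['A']
print(full_history("M", parents, touched, "f.py"))          # ['A', 'B']
```

The divergence shown here (`B`'s modification is invisible to the first-parent walk) is exactly the kind of difference the paper measures at scale.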
Reducing Interactive Refactoring Effort via Clustering-Based Multi-objective Search
Vahid Alizadeh, M. Kessentini
Refactoring is nowadays widely adopted in industry because bad design decisions can be very costly and extremely risky. On the one hand, automated refactoring does not always lead to the desired design. On the other hand, manual refactoring is error-prone, time-consuming, and impractical for radical changes. Thus, recent research in the field has focused on integrating developers' feedback into automated refactoring recommendations, because developers understand the problem domain intuitively and may have a clear target design in mind. However, this interactive process can be repetitive, expensive, and tedious, since developers must evaluate recommended refactorings and adapt them to the target design, especially in large systems where the number of possible strategies can grow exponentially. In this paper, we propose an interactive approach that combines multi-objective search and unsupervised learning to reduce the developer's interaction effort when refactoring systems. We first generate different possible refactoring strategies using multi-objective search, finding trade-offs between several conflicting quality attributes. Then, an unsupervised learning algorithm clusters the trade-off solutions, called the Pareto front, to guide developers in selecting their regions of interest and to reduce the number of refactoring options to explore. Feedback from the developer, at both the cluster and solution levels, is used to automatically generate constraints that reduce the search space in subsequent iterations and focus on the region of developer preferences. We selected 14 active developers to manually evaluate the effectiveness of our tool on 5 open source projects and one industrial system. The results show that participants found their desired refactorings faster and more accurately than with the current state of the art.
DOI: 10.1145/3238147.3238217 · pp. 464-474
Citations: 35
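The clustering step described above (grouping Pareto-front solutions so the developer inspects one region at a time) can be sketched with a tiny k-means over two-objective fitness vectors. This is an illustration of the idea, not the paper's exact clustering algorithm; the objective values are made up.

```python
# Cluster Pareto-front refactoring solutions, represented as
# (quality-improvement, refactoring-effort) pairs, into regions of interest.

def kmeans(points, centroids, iterations=10):
    """Plain k-means with fixed iteration count; returns the final clusters."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        centroids = [
            tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return clusters

# Four trade-off solutions forming two clearly separated regions.
front = [(0.9, 0.8), (0.85, 0.75), (0.3, 0.2), (0.25, 0.15)]
clusters = kmeans(front, centroids=[(1.0, 1.0), (0.0, 0.0)])
print(clusters)
```

A developer would then pick a representative from one cluster, and that feedback constrains the next search iteration instead of forcing a review of every solution on the front.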
A Unified Lattice Model and Framework for Purity Analyses
D. Helm, Florian Kübler, Michael Eichberg, Michael Reif, M. Mezini
Analyzing whether methods in object-oriented programs are side-effect free and also deterministic, i.e., mathematically pure, has been the target of extensive research. Identifying such methods helps to find code smells and security-related issues, and also helps analyses that detect concurrency bugs. Pure methods are also used by formal verification approaches as the foundation for specifications, and proving purity is necessary to ensure correct specifications. However, so far no common terminology exists to describe the purity of methods; some terms (e.g., pure or side-effect free) are even used inconsistently. Further, all current approaches report only selected purity information, making them suitable for only a subset of the potential use cases. In this paper, we present a fine-grained, unified lattice model that relates the purity levels found in the literature and adds a new level that generalizes existing definitions. We have also implemented a scalable, modularized purity analysis that produces significantly more precise results for real-world programs than the best-performing related work. The analysis shows that all defined levels occur in real-world projects.
DOI: 10.1145/3238147.3238226 · pp. 340-350
Citations: 8
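The lattice idea above can be illustrated with a deliberately simplified purity order. The levels and their ordering here are a made-up three-level total order, far coarser than the paper's actual lattice; the point is only the meet operation: a method is at most as pure as the least pure of its own effects and its callees.

```python
# Toy purity lattice: a method's purity is the meet (greatest lower bound)
# of its own effects and the purity of everything it calls.

LEVELS = ["impure", "side_effect_free", "pure"]   # least to most pure
RANK = {level: i for i, level in enumerate(LEVELS)}

def meet(*levels):
    """Greatest lower bound: the impurest contributing level wins."""
    return min(levels, key=RANK.get)

# A side-effect-free method calling a pure one stays side-effect free;
# any impure callee drags the caller down to impure.
print(meet("side_effect_free", "pure"))   # side_effect_free
print(meet("pure", "pure", "impure"))     # impure
```

A modular analysis like the paper's can propagate such levels bottom-up over the call graph, iterating to a fixed point for recursive methods.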
αDiff: Cross-Version Binary Code Similarity Detection with DNN
Bingchang Liu, Wei Huo, Chao Zhang, Wenchao Li, Feng Li, Aihua Piao, Wei Zou
Binary code similarity detection (BCSD) has many applications, including patch analysis, plagiarism detection, malware detection, and vulnerability search. Existing solutions usually compare specific syntactic features extracted from binary code based on expert knowledge; they suffer from either high performance overhead or low detection accuracy. Moreover, few solutions are suitable for detecting similarities between cross-version binaries, which may diverge not only in syntactic structure but also slightly in semantics. In this paper, we propose a solution, αDiff, that employs three semantic features to address the cross-version BCSD challenge. It first extracts the intra-function feature of each binary function using a deep neural network (DNN). The DNN works directly on the raw bytes of each function, rather than on features (e.g., syntactic structures) provided by experts. αDiff further analyzes the function call graph of each binary, which is relatively stable across versions, and extracts inter-function and inter-module features. A distance is then computed from these three features and used for BCSD. We have implemented a prototype of αDiff and evaluated it on a dataset of about 2.5 million samples. The results show that αDiff outperforms state-of-the-art static solutions by over 10 percentage points on average in different BCSD settings.
DOI: 10.1145/3238147.3238199 · pp. 667-678
Citations: 148
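The αDiff abstract describes computing a single distance from three feature vectors (intra-function, inter-function, inter-module). As a rough illustration of that final combination step only, here is a hedged sketch; the feature names, the weights, and the use of cosine distance are assumptions for illustration, not details taken from the paper (which learns the intra-function embedding with a DNN):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0  # treat an empty feature as maximally dissimilar
    return 1.0 - dot / (na * nb)

def cross_version_distance(f1, f2, weights=(0.6, 0.2, 0.2)):
    """Weighted sum of the three per-feature distances; lower means the
    two binary functions are more similar. f1/f2 are dicts holding the
    'intra', 'inter_fn', and 'inter_mod' vectors. The weights here are
    hypothetical, not values reported by the authors."""
    keys = ("intra", "inter_fn", "inter_mod")
    return sum(w * cosine_distance(f1[k], f2[k])
               for w, k in zip(weights, keys))
```

With identical feature dicts the distance is 0; functions whose intra-function embeddings are orthogonal but whose call-graph features match score only the intra-function penalty.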
Code2graph: Automatic Generation of Static Call Graphs for Python Source Code 自动生成Python源代码的静态调用图
Gharib Gharibi, Rashmi Tripathi, Yugyung Lee
A static call graph is an imperative prerequisite used in most interprocedural analyses and software comprehension tools. However, there is a lack of software tools that can automatically analyze Python source code and construct its static call graph. In this paper, we introduce a prototype Python tool, named code2graph, which automates the tasks of (1) analyzing the Python source code and extracting its structure, (2) constructing static call graphs from the source code, and (3) generating a similarity matrix of all possible execution paths in the system. Our goal is twofold: First, assist developers in understanding the overall structure of the system. Second, provide a stepping stone for further research that can utilize the tool in software searching and similarity detection applications. For example, clustering the execution paths into a logical workflow of the system could be used to automate specific software tasks. Code2graph has been successfully used to generate static call graphs and similarity matrices of the paths for three popular open-source Deep Learning projects (TensorFlow, Keras, PyTorch). A tool demo is available at https://youtu.be/ecctePpcAKU.
{"title":"Code2graph: Automatic Generation of Static Call Graphs for Python Source Code","authors":"Gharib Gharibi, Rashmi Tripathi, Yugyung Lee","doi":"10.1145/3238147.3240484","DOIUrl":"https://doi.org/10.1145/3238147.3240484","url":null,"abstract":"A static call graph is an imperative prerequisite used in most interprocedural analyses and software comprehension tools. However, there is a lack of software tools that can automatically analyze the Python source-code and construct its static call graph. In this paper, we introduce a prototype Python tool, named code2graph, which automates the tasks of (1) analyzing the Python source-code and extracting its structure, (2) constructing static call graphs from the source code, and (3) generating a similarity matrix of all possible execution paths in the system. Our goal is twofold: First, assist the developers in understanding the overall structure of the system. Second, provide a stepping stone for further research that can utilize the tool in software searching and similarity detection applications. For example, clustering the execution paths into a logical workflow of the system would be applied to automate specific software tasks. Code2graph has been successfully used to generate static call graphs and similarity matrices of the paths for three popular open-source Deep Learning projects (TensorFlow, Keras, PyTorch). A tool demo is available at https://youtu.be/ecctePpcAKU.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"63 1","pages":"880-883"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82542976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 25
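Code2graph's first task, extracting a static call graph from Python source, can be approximated with the standard `ast` module. This is a minimal sketch under simplifying assumptions, not the tool's actual implementation: it records only direct calls by simple name inside function definitions of a single module, and ignores methods, imports, and aliasing:

```python
import ast
from collections import defaultdict

def build_call_graph(source: str) -> dict:
    """Very rough static call graph for one Python module: maps each
    function name to the set of simple names it calls. Nested
    functions, attribute calls (obj.m()), and aliases are ignored."""
    tree = ast.parse(source)
    graph = defaultdict(set)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    graph[node.name].add(sub.func.id)
    return dict(graph)
```

For example, for a module where `a()` calls `b()` and `c()`, and `b()` calls `c()`, the result maps `a` to `{"b", "c"}` and `b` to `{"c"}`. The tool itself goes further, working across whole projects and deriving a similarity matrix over execution paths, which this single-module sketch does not cover.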