ALBFL: A Novel Neural Ranking Model for Software Fault Localization via Combining Static and Dynamic Features

2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) Pub Date : 2020-12-01 DOI:10.1109/TrustCom50675.2020.00107

Yuqing Pan, Xi Xiao, Guangwu Hu, Bin Zhang, Qing Li, Haitao Zheng

{"title":"ALBFL: A Novel Neural Ranking Model for Software Fault Localization via Combining Static and Dynamic Features","authors":"Yuqing Pan, Xi Xiao, Guangwu Hu, Bin Zhang, Qing Li, Haitao Zheng","doi":"10.1109/TrustCom50675.2020.00107","DOIUrl":null,"url":null,"abstract":"Automatic fault localization plays a significant role in assisting developers to fix software bugs efficiently. Although existing approaches, e.g., static methods and dynamic ones, have greatly alleviated this problem by analyzing static features in source code and diagnosing dynamic behaviors in software running state respectively, the fault localization accuracy still does not meet user requirements. To improve the fault locating ability with statement granularity, this paper proposes ALBFL, a novel neural ranking model that involves the attention mechanism and the LambdaRank model, which can integrate the static and dynamic features and achieve very high accuracy for identifying software faults. ALBFL first introduces a transformer encoder to learn the semantic features from software source code. Also, it leverages other static statistical features and dynamic features, i.e., eleven Spectrum-Based Fault Localization (SBFL) features, three mutation features, to evaluate software together. Specially, the two types of features are integrated through a self-attention layer, and fed into the LambdaRank model so as to rank a list of possible fault statements. Finally, thorough experiments are conducted on 5 open-source projects with 357 faulty programs in Defects4J. The results show that ALBFL outperforms 11 traditional SBFL methods (by three times) and 2 state-of-the-art approaches (by 13%) on ranking faulty statements in the first position.","PeriodicalId":221956,"journal":{"name":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TrustCom50675.2020.00107","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 8

Abstract

Automatic fault localization plays a significant role in assisting developers to fix software bugs efficiently. Although existing approaches, e.g., static methods and dynamic ones, have greatly alleviated this problem by analyzing static features in source code and diagnosing dynamic behaviors in software running state respectively, the fault localization accuracy still does not meet user requirements. To improve the fault locating ability with statement granularity, this paper proposes ALBFL, a novel neural ranking model that involves the attention mechanism and the LambdaRank model, which can integrate the static and dynamic features and achieve very high accuracy for identifying software faults. ALBFL first introduces a transformer encoder to learn the semantic features from software source code. Also, it leverages other static statistical features and dynamic features, i.e., eleven Spectrum-Based Fault Localization (SBFL) features, three mutation features, to evaluate software together. Specially, the two types of features are integrated through a self-attention layer, and fed into the LambdaRank model so as to rank a list of possible fault statements. Finally, thorough experiments are conducted on 5 open-source projects with 357 faulty programs in Defects4J. The results show that ALBFL outperforms 11 traditional SBFL methods (by three times) and 2 state-of-the-art approaches (by 13%) on ranking faulty statements in the first position.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

ALBFL:一种结合静态和动态特征的软件故障定位神经网络排序模型

自动故障定位在帮助开发人员有效地修复软件错误方面起着重要的作用。虽然现有的静态方法和动态方法分别通过分析源代码中的静态特征和诊断软件运行状态中的动态行为，大大缓解了这一问题，但故障定位的精度仍然不能满足用户的要求。为了提高基于语句粒度的故障定位能力，本文提出了一种新的神经排序模型ALBFL，该模型结合了注意机制和LambdaRank模型，可以将静态和动态特征结合起来，对软件故障进行识别，具有很高的准确率。ALBFL首先引入了一个转换器编码器，从软件源代码中学习语义特征。此外，它还利用其他静态统计特征和动态特征，即11个基于谱的故障定位(SBFL)特征和3个突变特征，共同对软件进行评估。特别地，这两种类型的特征通过一个自关注层集成，并输入到LambdaRank模型中，从而对可能的故障陈述列表进行排序。最后，在缺陷4j中对包含357个错误程序的5个开源项目进行了彻底的实验。结果表明，ALBFL在将错误语句排在第一位置上优于11种传统的SBFL方法(高出3倍)和2种最先进的方法(高出13%)。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)

自引率

0.00%

发文量

期刊最新文献

Research on Stitching and Alignment of Mouse Carcass EM Images One Covert Channel to Rule Them All: A Practical Approach to Data Exfiltration in the Cloud MAUSPAD: Mouse-based Authentication Using Segmentation-based, Progress-Adjusted DTW Finding Geometric Medians with Location Privacy Multi-Input Functional Encryption: Efficient Applications from Symmetric Primitives