Context-based Cluster Fault Localization

2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC). Pub Date: 2022-05-01. DOI: 10.1145/3524610.3527891

Ju-Yeol Yu, Yan Lei, Huan Xie, Lingfeng Fu, Chunyan Liu


Citations: 4

Abstract

Automated fault localization techniques collect runtime information as input data to identify suspicious statements potentially responsible for program failures. To discover the statistical coincidences between test results (i.e., failing or passing) and the execution of individual program statements (i.e., executed or not executed), researchers have developed suspiciousness evaluation methods (e.g., spectrum-based formulas and deep neural network models). However, coincidental correctness (CC), in which a faulty statement is executed but the program output is still correct, reduces the effectiveness of fault localization. Many researchers seek to identify CC tests using cluster analysis, but high-dimensional data containing too much noise reduces the effectiveness of that analysis. To overcome this obstacle, we propose CBCFL, a context-based cluster fault localization approach that incorporates into cluster analysis a failure context showing how a failure is produced. Specifically, CBCFL uses the failure context, which contains the statements whose execution affects the output of a failing test, as the input data for cluster analysis, improving the identification of CC tests. Since CC tests execute the faulty statement, we relabel CC tests as failing tests. We then take the context and the corresponding changed labels as the input data for fault localization techniques. To evaluate CBCFL, we conduct large-scale experiments on six large-sized programs using five state-of-the-art fault localization approaches. The experimental results show that CBCFL is more effective than the baselines; e.g., our approach can improve the MLP-FL method using cluster analysis by up to 200%, 250%, and 320% in Top-1, Top-5, and Top-10 accuracy, respectively.
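The pipeline described in the abstract (restrict coverage to a failure context, identify CC tests, relabel them as failing, then score statements) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy coverage matrix, the choice of Ochiai as the spectrum-based formula, the hand-picked context indices, and the Hamming-distance rule that stands in for real cluster analysis.

```python
import math

def ochiai(coverage, labels):
    """Ochiai suspiciousness per statement.
    coverage[t][s] is 1 if test t executed statement s; labels[t] is 1 if test t fails."""
    total_fail = sum(labels)
    scores = []
    for s in range(len(coverage[0])):
        ef = sum(1 for row, y in zip(coverage, labels) if row[s] and y)      # failing tests covering s
        ep = sum(1 for row, y in zip(coverage, labels) if row[s] and not y)  # passing tests covering s
        denom = math.sqrt(total_fail * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores

def restrict_to_context(coverage, context):
    """Keep only the columns (statements) that belong to the failure context."""
    return [[row[s] for s in context] for row in coverage]

def relabel_cc(coverage, labels, max_dist=0):
    """Hypothetical stand-in for cluster analysis: a passing test whose coverage
    vector is within max_dist (Hamming) of some failing test is treated as CC
    and relabeled as failing."""
    failing = [row for row, y in zip(coverage, labels) if y]
    new_labels = []
    for row, y in zip(coverage, labels):
        if not y and any(sum(a != b for a, b in zip(row, f)) <= max_dist for f in failing):
            new_labels.append(1)  # suspected CC test, now counted as failing
        else:
            new_labels.append(y)
    return new_labels

# Toy data (assumed): 5 tests x 5 statements, fault in statement index 2.
coverage = [
    [1, 1, 1, 1, 0],  # failing
    [1, 0, 1, 1, 1],  # failing
    [1, 1, 1, 1, 1],  # passing, but executes the faulty statement (CC)
    [1, 1, 0, 0, 0],  # passing
    [1, 0, 0, 1, 1],  # passing
]
labels = [1, 1, 0, 0, 0]
context = [2, 3]  # assumed failure context: statements affecting the failing output

ctx_cov = restrict_to_context(coverage, context)
new_labels = relabel_cc(ctx_cov, labels)
before = ochiai(ctx_cov, labels)
after = ochiai(ctx_cov, new_labels)
print("relabeled:", new_labels)
print("scores before:", before)
print("scores after: ", after)
```

In this toy setup, restricting to the context columns makes the CC test indistinguishable from the failing tests, so the stand-in rule relabels it, and the faulty statement's score rises relative to the other context statement. A real implementation would obtain the context from program analysis and use an actual clustering algorithm.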
