Context-based Cluster Fault Localization

Ju-Yeol Yu, Yan Lei, Huan Xie, Lingfeng Fu, Chunyan Liu
Published in: 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)
Publication date: 2022-05-01
DOI: https://doi.org/10.1145/3524610.3527891
Citations: 4

Abstract

Automated fault localization techniques collect runtime information as input data to identify suspicious statements potentially responsible for program failures. To discover the statistical coincidences between test results (i.e., failing or passing) and the execution of the individual statements of a program (i.e., executed or not executed), researchers have developed suspiciousness methodologies (e.g., spectrum-based formulas and deep neural network models). However, coincidental correctness (CC), in which a faulty statement is executed but the program output is still correct, reduces the effectiveness of fault localization. Many researchers seek to identify CC tests using cluster analysis. However, high-dimensional data containing too much noise reduces the effectiveness of cluster analysis. To overcome this obstacle, we propose CBCFL, a context-based cluster fault localization approach that incorporates a failure context, which shows how a failure is produced, into cluster analysis. Specifically, CBCFL uses the failure context, which contains the statements whose execution affects the output of a failing test, as input data for cluster analysis to improve the effectiveness of identifying CC tests. Since CC tests execute the faulty statement, we change the labels of CC tests to failing. We then take the context and the corresponding changed labels as the input data for fault localization techniques. To evaluate the effectiveness of CBCFL, we conduct large-scale experiments on six large-sized programs using five state-of-the-art fault localization approaches. The experimental results show that CBCFL is more effective than the baselines; e.g., our approach improves the MLP-FL method using cluster analysis by up to 200%, 250%, and 320% in Top-1, Top-5, and Top-10 accuracy, respectively.
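
The pipeline described in the abstract (restrict coverage to the failure context, identify CC tests, relabel them as failing, then score statements) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes Ochiai as the spectrum-based formula, replaces the cluster analysis with a simple Jaccard-similarity threshold against failing tests, and uses a made-up coverage matrix and failure context; all function names and the threshold are hypothetical.

```python
import math

def ochiai(coverage, labels):
    """Ochiai suspiciousness per statement (labels: 1 = failing, 0 = passing)."""
    total_fail = sum(labels)
    scores = []
    for s in range(len(coverage[0])):
        ef = sum(1 for row, y in zip(coverage, labels) if row[s] and y == 1)
        ep = sum(1 for row, y in zip(coverage, labels) if row[s] and y == 0)
        denom = math.sqrt(total_fail * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores

def jaccard(a, b):
    """Jaccard similarity of two binary coverage vectors."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0

def project(coverage, context):
    """Keep only the columns (statements) that belong to the failure context."""
    return [[row[s] for s in context] for row in coverage]

def relabel_cc(ctx_coverage, labels, threshold=0.8):
    """Assumed stand-in for cluster analysis: a passing test whose
    context coverage is close to some failing test is treated as CC
    and relabeled as failing."""
    failing = [row for row, y in zip(ctx_coverage, labels) if y == 1]
    new_labels = []
    for row, y in zip(ctx_coverage, labels):
        is_cc = y == 0 and any(jaccard(row, f) >= threshold for f in failing)
        new_labels.append(1 if is_cc else y)
    return new_labels

# Hypothetical data: 4 tests x 4 statements; statement 2 is the faulty one.
coverage = [
    [1, 1, 1, 0],  # t0: fails
    [1, 0, 1, 1],  # t1: passes although it executes statement 2 (CC)
    [1, 1, 0, 0],  # t2: passes
    [1, 0, 0, 1],  # t3: passes
]
labels = [1, 0, 0, 0]
context = [0, 2]  # assumed failure context: statements affecting t0's output

baseline = ochiai(coverage, labels)          # scores on raw coverage
ctx_coverage = project(coverage, context)
new_labels = relabel_cc(ctx_coverage, labels)
scores = ochiai(ctx_coverage, new_labels)    # scores on context + new labels

for stmt, score in zip(context, scores):
    print(f"statement {stmt}: {score:.4f}")
```

In this toy input, the baseline scores statements 1 and 2 equally, because the CC test t1 counts as a passing execution of the faulty statement. After projecting onto the assumed context and relabeling t1, statement 2 receives the highest score. A real setup would obtain the context from program analysis of the failing run and use an actual clustering algorithm rather than a fixed threshold.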