Title: Evaluating the Adaptive Selection of Classifiers for Cross-Project Bug Prediction
Authors: D. D. Nucci, Fabio Palomba, A. D. Lucia
Venue: 2018 IEEE/ACM 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE)
Published: 2018-05-28
DOI: 10.1145/3194104.3194112 (https://doi.org/10.1145/3194104.3194112)
Citations: 8
Abstract
Bug prediction models are used to locate the source code elements that are more likely to be defective. One of the key factors influencing their performance is the choice of the machine learning method (i.e., the classifier) used to discriminate between buggy and non-buggy classes. Given the high complementarity of stand-alone classifiers, a recent trend is the definition of ensemble techniques, which try to effectively combine the predictions of different stand-alone machine learners. In recent work we proposed ASCI, a technique that dynamically selects the classifier to use based on the characteristics of the class on which the prediction has to be made. We tested it in a within-project scenario, showing its higher accuracy with respect to the Validation and Voting strategy. In this paper, we continue this line of research by (i) evaluating ASCI in global and local cross-project settings and (ii) comparing its performance with that achieved by a stand-alone baseline and an ensemble baseline, namely Naive Bayes and Validation and Voting, respectively. A key finding of our study is that ASCI performs better than the other techniques in the context of cross-project bug prediction. Moreover, although local learning does not improve the performance of the corresponding models in most cases, it does improve the robustness of the models relying on ASCI.
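The abstract does not spell out how ASCI decides which classifier to use, so the following is only a minimal sketch of the general idea of per-instance classifier selection, not the authors' implementation. It assumes two toy threshold rules as base classifiers, two hypothetical normalized metrics (size and coupling) as class characteristics, and a nearest-neighbour lookup as the selector; all names and data are invented for illustration.

```python
from collections import Counter

# Hypothetical class characteristics: (size, coupling), both scaled to [0, 1].
# Labels: 1 = buggy, 0 = non-buggy. All data here is made up for illustration.


def size_rule(x):
    """Toy base classifier: flags large classes as buggy."""
    return 1 if x[0] > 0.5 else 0


def coupling_rule(x):
    """Toy base classifier: flags highly coupled classes as buggy."""
    return 1 if x[1] > 0.5 else 0


def fit_selector(train, base):
    """For each training instance, record which base classifiers got it right."""
    table = []
    for x, y in train:
        correct = [i for i, clf in enumerate(base) if clf(x) == y]
        table.append((x, correct))
    return table


def select(table, x, n_base, k=3):
    """Return the index of the base classifier that was most often correct
    on the k training instances closest to x (ties go to the lowest index)."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], x))

    nearest = sorted(table, key=sq_dist)[:k]
    votes = Counter(i for _, correct in nearest for i in correct)
    return max(range(n_base), key=lambda i: (votes[i], -i))


def predict(table, base, x, k=3):
    """Classify x with the base classifier selected for its neighbourhood."""
    return base[select(table, x, len(base), k)](x)


base = [size_rule, coupling_rule]
train = [
    ((0.90, 0.10), 1), ((0.80, 0.20), 1), ((0.85, 0.15), 1),  # large, buggy
    ((0.10, 0.90), 1), ((0.20, 0.80), 1), ((0.15, 0.85), 1),  # coupled, buggy
    ((0.10, 0.10), 0), ((0.20, 0.20), 0),                     # small, clean
]
table = fit_selector(train, base)

for x in [(0.88, 0.12), (0.12, 0.88), (0.15, 0.15)]:
    chosen = base[select(table, x, len(base))].__name__
    print(x, "->", chosen, "predicts", predict(table, base, x))
```

The point of the sketch is the contrast with an ensemble such as voting: instead of combining every classifier's output for every class, a selector picks one classifier per class based on where that class sits in the feature space.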