H. Jahanshahi, Dhanya Jothimani, Ayse Basar, Mucahit Cevik
Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering, 2019-09-18
DOI: 10.1145/3345629.3351449 (https://doi.org/10.1145/3345629.3351449)
Citations: 7
Does chronology matter in JIT defect prediction?: A Partial Replication Study
BACKGROUND: Just-In-Time (JIT) models, unlike traditional defect prediction models, detect fix-inducing (or defect-inducing) changes. These models are designed on the assumption that the properties of past code changes are similar to those of future ones. However, as a system evolves, the expertise of its developers and/or the complexity of the system also change.

AIM: In this work, we investigate the effect of code change properties on JIT models over time. We also study how using only recent data, as opposed to all available data, affects the performance of JIT models. Further, we analyze the effect of weighted sampling on the performance of fix-inducing properties of JIT models. For this purpose, we used datasets from four open-source projects: Eclipse JDT, Mozilla, Eclipse Platform, and PostgreSQL.

METHOD: We used five families of code change properties: size, diffusion, history, experience, and purpose. We trained and tested the JIT models with Random Forest and measured performance with the Brier Score (BS) and the Area Under the Curve (AUC). We applied the Wilcoxon Signed Rank Test to the output to statistically validate whether the performance of JIT models improves when using all the available data or only the recent data.

RESULTS: Our results suggest that the predictive power of JIT models does not change over time. Furthermore, we observed that the chronology of data in JIT defect prediction models can be discarded by considering all the available data. On the other hand, the importance scores of the families of code change properties are found to oscillate over time.

CONCLUSION: To mitigate the impact of the evolution of code change properties, we recommend a weighted sampling approach in which more emphasis is placed on changes occurring closer to the current time. Moreover, since properties such as "Expertise of the Developer" and "Size" evolve over time, models obtained from old data may exhibit different characteristics from those trained on newer data. Hence, practitioners should regularly retrain JIT models to include fresh data.
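
As a rough illustration of the measures and the sampling idea named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. The Brier Score and AUC follow their standard definitions. The abstract does not state how the sampling weights were computed, so the exponential decay, the `half_life_days` parameter, and all function names here are assumptions made for illustration only.

```python
import random


def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and the 0/1 label (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Chance that a random fix-inducing change is scored above a random clean one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def recency_weights(ages_in_days, half_life_days=180.0):
    """Assumed weighting scheme: a change half_life_days old counts half as much as one made today."""
    return [0.5 ** (age / half_life_days) for age in ages_in_days]


def weighted_sample(changes, weights, k, seed=0):
    """Draw k training changes with replacement, favouring those with larger weights."""
    rng = random.Random(seed)
    return rng.choices(changes, weights=weights, k=k)


# Toy data: labels (1 = fix-inducing), predicted probabilities, and change ages.
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.4]
ages = [0, 180, 360]

score = brier_score(labels, probs)
area = auc(labels, probs)
weights = recency_weights(ages)
sample = weighted_sample(["c_new", "c_mid", "c_old"], weights, k=5)
```

In practice, weights like these could instead be passed as `sample_weight` when fitting a scikit-learn `RandomForestClassifier`, and paired per-period scores from two training strategies could be compared with `scipy.stats.wilcoxon`; both are suggestions, since the abstract does not name the tooling used.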