An assessment of heterogenous ensemble classifiers for analyzing change-proneness in open-source software systems

IF 1.7 · Zone 4 (Computer Science) · Q3 (COMPUTER SCIENCE, SOFTWARE ENGINEERING) · Journal of Software-Evolution and Process · Pub Date: 2024-02-24 · DOI: 10.1002/smr.2660
Megha Khanna, Ankita Bansal
Citations: 0

Abstract

Software managers constantly look for methods that ensure cost-effective development of good-quality software products. One important means of accomplishing this is to allocate more resources to the weak classes of a software product, that is, those prone to change. Correct prediction of these change-prone classes is therefore critical. Although various researchers have investigated the performance of several algorithms for identifying them, the search for an optimum classifier continues. To this end, this study critically investigates the use of six Heterogenous Ensemble Classifiers (HEC) for Software Change Prediction (SCP) by empirically validating them on datasets obtained from 12 open-source software systems. The results are statistically assessed using three robust performance indicators (AUC, F-measure and Matthews Correlation Coefficient) in two validation scenarios (within-project and cross-project). They indicate the superiority of the Average Probability Voting Ensemble, a heterogenous classifier, for determining change-proneness in the investigated systems. The average AUC values of software change prediction models developed using this ensemble classifier improved by 3%-9% and 3%-11% when compared with its base learners and homogeneous counterparts, respectively. Similar observations were made using the other investigated performance measures. Furthermore, the evidence suggests that changing the number of base learners or the type of meta-learner does not significantly change the performance of the corresponding heterogenous ensemble classifiers.
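
To make the terminology concrete, the following is a minimal sketch of average probability voting and of the three indicators named in the abstract (AUC, F-measure, Matthews Correlation Coefficient). It is an illustration only, not the authors' implementation: the abstract does not name the base learners or any tooling, so the three probability lists, the labels, and the 0.5 decision threshold are all hypothetical.

```python
from math import sqrt


def average_probability_vote(prob_lists):
    """Average, per sample, the change-prone probabilities of all base learners.

    prob_lists[k][i] is base learner k's predicted probability that
    class i is change-prone.
    """
    n_learners = len(prob_lists)
    n_samples = len(prob_lists[0])
    return [sum(p[i] for p in prob_lists) / n_learners for i in range(n_samples)]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, 1 = change-prone."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient; 0.0 when the denominator vanishes."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: six classes, 1 = change-prone, and the probabilities
# produced by three unnamed base learners (not taken from the study).
y_true = [1, 0, 1, 0, 1, 0]
learner_a = [0.9, 0.4, 0.4, 0.2, 0.7, 0.6]
learner_b = [0.6, 0.1, 0.8, 0.5, 0.4, 0.8]
learner_c = [0.9, 0.4, 0.6, 0.2, 0.7, 0.7]

avg_scores = average_probability_vote([learner_a, learner_b, learner_c])
y_pred = [1 if s >= 0.5 else 0 for s in avg_scores]  # assumed 0.5 threshold

print("predictions:", y_pred)
print("AUC:", round(auc(y_true, avg_scores), 3))
print("F-measure:", round(f_measure(y_true, y_pred), 3))
print("MCC:", round(mcc(y_true, y_pred), 3))
```

AUC is computed on the averaged probabilities, while F-measure and MCC need hard labels and therefore depend on the chosen threshold. In a within-project scenario the probabilities would come from models trained and tested on the same system; in a cross-project scenario they would come from models trained on a different system.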
