Feature transformation for improved software bug detection and commit classification

Journal of Systems and Software · IF 3.7 · Zone 2 (Computer Science) · JCR Q1 (Computer Science, Software Engineering) · Pub Date: 2024-09-06 · DOI: 10.1016/j.jss.2024.112205
Citations: 0

Abstract


Testing and debugging software to fix bugs is considered one of the most important stages of the software life cycle. Many studies have investigated ways to predict bugs in software artifacts using machine learning techniques. For such predictions to be reliable, the explanatory aspects of these models must also be considered. In this paper, we show how feature transformation can significantly improve prediction accuracy and provide insight into the inner workings of bug prediction models. We propose a new approach for bug prediction that first extracts the features, then uses a genetic algorithm to find the weighted transformation of these features that best separates bugs from non-bugs when plotted in a low-dimensional space, and finally trains predictive models on the transformed dataset. In our experiments with the proposed feature transformation, traditional machine learning and deep learning classifiers achieved average improvements of 4.25% and 9.6%, respectively, in recall for bug classification over 8 software systems, compared to models built on the original data. We also examined how well the concept generalizes to multiclass classification tasks such as commit classification in software systems and found modest improvements in F1-scores: sometimes up to 3% for traditional machine learning models and 4% for deep learning models.
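The three-step pipeline described in the abstract (extract features, search for a weighted transformation with a genetic algorithm, train on the transformed data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic data in `make_data`, the centroid-distance fitness in `separation`, the one-point crossover and single-gene mutation in `evolve_weights`, and the nearest-centroid classifier all stand in for details the abstract does not give.

```python
import random


def transform(row, weights):
    """Scale each feature by its weight (the assumed 'weighted transformation')."""
    return [w * v for w, v in zip(weights, row)]


def fit_centroids(X, y, weights):
    """Per-class mean of the transformed rows."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(transform(row, weights))
    return {lab: [sum(col) / len(rows) for col in zip(*rows)]
            for lab, rows in groups.items()}


def separation(weights, X, y):
    """Hypothetical fitness: squared distance between the two class centroids
    divided by the summed within-class variance of the transformed data."""
    centroids = fit_centroids(X, y, weights)
    within = 0.0
    for label, centre in centroids.items():
        rows = [transform(r, weights) for r, lab in zip(X, y) if lab == label]
        within += sum(sum((v - c) ** 2 for v, c in zip(r, centre))
                      for r in rows) / len(rows)
    between = sum((a - b) ** 2 for a, b in zip(centroids[0], centroids[1]))
    return between / (within + 1e-9)


def evolve_weights(X, y, pop_size=20, generations=30, seed=0):
    """Toy genetic algorithm over weight vectors in [0, 1]."""
    rng = random.Random(seed)
    n = len(X[0])
    # Seed the population with the identity weighting (the original data),
    # so the elitist search can never end up below the untransformed baseline.
    population = [[1.0] * n] + [[rng.random() for _ in range(n)]
                                for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda w: separation(w, X, y), reverse=True)
        parents = population[: pop_size // 2]      # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)              # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n)                   # mutate a single weight
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.2)))
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: separation(w, X, y))


def predict(row, weights, centroids):
    """Nearest-centroid prediction in the transformed space."""
    z = transform(row, weights)
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(z, centroids[lab])))


def make_data(n=100, seed=1):
    """Synthetic stand-in for extracted features: feature 0 is informative,
    features 1-3 are noise. Label 1 plays the role of 'bug'."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.0 * label, 1.0)] + [rng.gauss(0, 1.0) for _ in range(3)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_data()
    weights = evolve_weights(X, y)
    centroids = fit_centroids(X, y, weights)
    hits = sum(predict(r, weights, centroids) == lab for r, lab in zip(X, y))
    print("weights:", [round(w, 2) for w in weights])
    print("training accuracy:", hits / len(X))
```

Because the learned weights are one number per feature, they can be read directly as a rough indication of which features the separation relies on, which is one way a transformation of this kind could support the explanatory goal the abstract mentions. A real experiment would evaluate on held-out data and report recall or F1, as the paper does.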

Source journal
Journal of Systems and Software (Engineering & Technology - Computer Science: Theory & Methods)
CiteScore: 8.60
Self-citation rate: 5.70%
Articles published: 193
Review time: 16 weeks
About the journal: The Journal of Systems and Software publishes papers covering all aspects of software engineering and related hardware-software-systems issues. All articles should include a validation of the idea presented, e.g. through case studies, experiments, or systematic comparisons with other approaches already in practice. Topics of interest include, but are not limited to:
• Methods and tools for, and empirical studies on, software requirements, design, architecture, verification and validation, maintenance and evolution
• Agile, model-driven, service-oriented, open source and global software development
• Approaches for mobile, multiprocessing, real-time, distributed, cloud-based, dependable and virtualized systems
• Human factors and management concerns of software development
• Data management and big data issues of software systems
• Metrics and evaluation, data mining of software development resources
• Business and economic aspects of software development processes
The journal welcomes state-of-the-art surveys and reports of practical experience for all of these topics.
Latest articles from this journal
FSECAM: A contextual thematic approach for linking feature to multi-level software architectural components
Exploring emergent microservice evolution in elastic deployment environments
An empirical study of AI techniques in mobile applications
Information needs in bug reports for web applications
Development and benchmarking of multilingual code clone detector