Comparative analysis of real issues in open-source machine learning projects

Empirical Software Engineering (IF 3.5; Zone 2, Computer Science; JCR Q1, COMPUTER SCIENCE, SOFTWARE ENGINEERING). Pub Date: 2024-05-02. DOI: 10.1007/s10664-024-10467-3

Tuan Dung Lai, Anj Simmons, Scott Barnett, Jean-Guy Schneider, Rajesh Vasa


Citations: 0

Abstract

Context

In the last decade of data-driven decision-making, Machine Learning (ML) systems reign supreme. Because ML systems differ in character from traditional Software Engineering systems, we do not know to what extent their issue-reporting needs differ, or to what extent these differences affect the issue resolution process.

Objective

We aim to compare the differences between ML and non-ML issues in open-source applied AI projects in terms of resolution time and size of fix. This research aims to enhance the predictability of maintenance tasks by providing valuable insights for issue reporting and task scheduling activities.

Method

We collect issue reports from GitHub repositories of open-source ML projects using an automatic approach, filter them using ML keywords and libraries, manually categorize them using an adapted deep learning bug taxonomy, and compare resolution time and fix size for ML and non-ML issues in a controlled sample.
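The filtering and timing steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the keyword list, the function names `is_ml_candidate` and `resolution_days`, and the timestamp format are all assumptions, since the abstract does not list the actual keywords or libraries used.

```python
import re
from datetime import datetime

# Hypothetical keyword list for illustration; the study's real keywords
# and library names are not given in the abstract.
ML_KEYWORDS = {"model", "training", "tensor", "dataset", "gradient"}

def is_ml_candidate(title: str, body: str) -> bool:
    """Return True if any illustrative ML keyword occurs as a whole word."""
    words = set(re.findall(r"[a-z]+", (title + " " + body).lower()))
    return not ML_KEYWORDS.isdisjoint(words)

def resolution_days(created_at: str, closed_at: str) -> float:
    """Days between opening and closing an issue (ISO 8601 timestamps assumed)."""
    # Normalise a trailing "Z" so older Python versions can parse it.
    created = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    closed = datetime.fromisoformat(closed_at.replace("Z", "+00:00"))
    return (closed - created).total_seconds() / 86400
```

A keyword match only flags a candidate; in the study the flagged issues are then categorized manually.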

Result

147 ML issues and 147 non-ML issues are collected for analysis. We found that ML issues take more time to resolve than non-ML issues, with a median difference of 14 days. There is no significant difference in terms of size of fix between ML and non-ML issues. No significant differences are found between different ML issue categories in terms of resolution time and size of fix.
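A comparison of this kind can be sketched with a difference of medians plus a simple permutation test. The abstract does not name the statistical test the authors used, so the permutation test here is a stand-in chosen because it needs only the standard library. The function names and the sample values are made up for illustration and are not the study's data.

```python
import random
from statistics import median

def median_gap(a, b):
    """Difference between the medians of two samples."""
    return median(a) - median(b)

def permutation_p_value(a, b, n_resamples=2000, seed=0):
    """Two-sided permutation test on the absolute difference of medians."""
    rng = random.Random(seed)
    observed = abs(median_gap(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # randomly reassign group labels
        gap = abs(median_gap(pooled[:len(a)], pooled[len(a):]))
        if gap >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)

# Made-up resolution times in days (NOT the study's data).
ml_days = [2, 9, 20, 33, 50]
non_ml_days = [1, 4, 8, 15, 22]
print(median_gap(ml_days, non_ml_days))  # 20 - 8 = 12
print(permutation_p_value(ml_days, non_ml_days))
```

With samples this small the p-value is not meaningful; the sketch only shows the shape of the comparison.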

Conclusion

Our study provided evidence that the life cycle for ML issues is stretched, and thus further work is required to identify the reason. The results also highlighted the need for future work to design custom tooling to support faster resolution of ML issues.




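Source journal

Empirical Software Engineering (Engineering and Technology: Computer Science, Software Engineering)

CiteScore: 8.50

Self-citation rate: 12.20%

Articles published: 169

Review time: >12 weeks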

About the journal: Empirical Software Engineering provides a forum for applied software engineering research with a strong empirical component, and a venue for publishing empirical results relevant to both researchers and practitioners. Empirical studies presented here usually involve the collection and analysis of data and experience that can be used to characterize, evaluate and reveal relationships between software development deliverables, practices, and technologies. Over time, it is expected that such empirical results will form a body of knowledge leading to widely accepted and well-formed theories. The journal also offers industrial experience reports detailing the application of software technologies - processes, methods, or tools - and their effectiveness in industrial settings. Empirical Software Engineering promotes the publication of industry-relevant research, to address the significant gap between research and practice.

Latest articles from the journal

The effect of data complexity on classifier performance
Reinforcement learning for online testing of autonomous driving systems: a replication and extension study
An empirical study on developers’ shared conversations with ChatGPT in GitHub pull requests and issues
Quality issues in machine learning software systems
An empirical study of token-based micro commits