“Project smells” - Experiences in Analysing the Software Quality of ML Projects with mllint

2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP). Pub Date: 2022-01-20. DOI: 10.1145/3510457.3513041

B. V. Oort, L. Cruz, B. Loni, A. Deursen


Citations: 1

Abstract

Machine Learning (ML) projects incur novel challenges in their development and productionisation compared with traditional software applications, though established principles and best practices for ensuring a project's software quality still apply. While using static analysis to catch code smells has been shown to improve software quality attributes, it is only a small piece of the software quality puzzle, especially for ML projects, given their additional challenges and the lower degree of Software Engineering (SE) experience among the data scientists who develop them. We introduce the novel concept of project smells, which consider deficits in project management as a more holistic perspective on software quality in ML projects. We also implemented an open-source static analysis tool, mllint, to help detect and mitigate these smells. Our research evaluates this concept in the industrial context of ING, a global bank and a large software- and data-intensive organisation. We also investigate the perceived importance of these project smells for proof-of-concept versus production-ready ML projects, as well as the perceived obstructions and benefits to using static analysis tools such as mllint. Our findings indicate a need for context-aware static analysis tools that fit the needs of the project at its current stage of development while requiring minimal configuration effort from the user.


