Testing Machine Learning Systems in Industry: An Empirical Study

2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) | Pub Date: 2022-05-01 | DOI: 10.1145/3510457.3513036

Shuyue Li†, Jiaqi Guo, Jian-Guang Lou, Ming Fan, Ting Liu, Dongmei Zhang


Citations: 7

Abstract

Machine learning is becoming increasingly prevalent and is being integrated into a wide range of software systems. These systems, called ML systems, must be adequately tested to gain confidence that they behave correctly. Although much research has been devoted to testing techniques for ML systems, industrial teams face new challenges when testing ML systems in real-world settings. To learn from industry about the problems in ML testing, we conducted an empirical study comprising a survey with 87 responses and interviews with 7 senior ML practitioners from well-known IT companies. Our study uncovers significant industrial concerns about the major testing activities, i.e., test data collection, test execution, and test result analysis, as well as good practices and open challenges from the industry's perspective. (1) Test data collection is conducted in different ways for the ML model, data, and code, and each faces different challenges. (2) Test execution in ML systems suffers from two major problems: entanglement among components and regression in model performance. (3) Test result analysis centers on quantitative methods, e.g., metric-based evaluation, combined with some qualitative methods based on practitioners' experience. Based on our findings, we highlight research opportunities and provide some implications for practitioners.
