Test Case Prioritization using Transfer Learning in Continuous Integration Environments

2023 IEEE/ACM International Conference on Automation of Software Test (AST). Pub Date: 2023-05-01. DOI: 10.1109/AST58925.2023.00023

Rezwana Mamata, Akramul Azim, R. Liscano, Kevin Smith, Yee-Kang Chang, Gkerta Seferi, Qasim Tauseef



Abstract

The continuous integration (CI) process runs a large set of automated test cases to verify software builds. The testing phase in CI systems operates under time constraints: it must ensure software quality without significantly delaying CI builds. CI therefore requires efficient testing techniques such as test case prioritization (TCP), which runs fault-revealing test cases first. Recent studies on TCP use various machine learning (ML) methods to adapt to the dynamic and complex nature of CI. However, the performance of ML for TCP may degrade when the data volume is low and failures are rare, whereas existing data with similar patterns from other domains can be valuable. We formulate this as a transfer learning (TL) problem. TL has proven beneficial in many real-world applications where source domains have plentiful data but target domains have little. This research therefore investigates the benefit of TL for TCP. However, only some industrial CI datasets are publicly available because of data privacy protection regulations. In such cases, model-based transfer learning is a potential way to share knowledge among projects without revealing data to other stakeholders. This paper applies TransBoost, a tree-kernel-based TL algorithm, to evaluate the TL approach on 24 study subjects and to identify potential source datasets.
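As a rough illustration of the idea (not the TransBoost algorithm, whose details the abstract does not give), the sketch below shows one simple way a target project with little history could borrow knowledge from a source project without seeing its raw data. The only thing shared is a single model parameter, a source failure rate, and the target shrinks its own sparse counts toward it before ranking tests. The names `CaseHistory`, `failure_score`, `prioritize`, `source_rate`, and `prior_strength` are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class CaseHistory:
    """Execution history of one test case in the (data-poor) target project."""
    name: str
    runs: int       # executions observed in the target project
    failures: int   # failures observed in the target project


def failure_score(history, source_rate, prior_strength=5.0):
    """Estimate a failure probability for one test case.

    Assumption for this sketch: the source project shares only `source_rate`
    (a failure rate learned on its own data). The target's counts are shrunk
    toward that rate; `prior_strength` is how many pseudo-runs the source
    knowledge is worth.
    """
    return (history.failures + prior_strength * source_rate) / (
        history.runs + prior_strength
    )


def prioritize(histories, source_rate, prior_strength=5.0):
    """Return test names ordered from most to least likely to fail."""
    ranked = sorted(
        histories,
        key=lambda h: failure_score(h, source_rate, prior_strength),
        reverse=True,
    )
    return [h.name for h in ranked]


if __name__ == "__main__":
    target = [
        CaseHistory("test_login", runs=2, failures=0),
        CaseHistory("test_checkout", runs=2, failures=1),
        CaseHistory("test_search", runs=0, failures=0),
    ]
    # A test with no target history falls back to the source rate.
    print(prioritize(target, source_rate=0.2))
```

With so few target runs, the source rate dominates the scores; as target history accumulates, the local counts take over. A real TL approach for TCP would transfer a far richer model than one scalar, so this only conveys the direction of the technique.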


