Rezwana Mamata, Akramul Azim, R. Liscano, Kevin Smith, Yee-Kang Chang, Gkerta Seferi, Qasim Tauseef
{"title":"Test Case Prioritization using Transfer Learning in Continuous Integration Environments","authors":"Rezwana Mamata, Akramul Azim, R. Liscano, Kevin Smith, Yee-Kang Chang, Gkerta Seferi, Qasim Tauseef","doi":"10.1109/AST58925.2023.00023","DOIUrl":null,"url":null,"abstract":"The continuous Integration (CI) process runs a large set of automated test cases to verify software builds. The testing phase in the CI systems has timing constraints to ensure software quality without significantly delaying the CI builds. Therefore, CI requires efficient testing techniques such as Test Case Prioritization (TCP) to run faulty test cases with priority. Recent research studies on TCP utilize different Machine Learning (ML) methods to adopt the dynamic and complex nature of CI. However, the performance of ML for TCP may decrease for a low volume of data and less failure rate, whereas using existing data with similar patterns from other domains can be valuable. We formulate this as a transfer learning (TL) problem. TL has proven to be beneficial for many real-world applications where source domains have plenty of data, but the target domains have a scarcity of it. Therefore, this research investigates leveraging the benefit of transfer learning for test case prioritization (TCP). However, only some industrial CI datasets are publicly available due to data privacy protection regulations. In such cases, model-based transfer learning is a potential solution to share knowledge among different projects without revealing data to other stakeholders. This paper applies TransBoost, a tree-kernel-based TL algorithm, to evaluate the TL approach for 24 study subjects and identify potential source datasets.","PeriodicalId":252417,"journal":{"name":"2023 IEEE/ACM International Conference on Automation of Software Test (AST)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE/ACM International Conference on Automation of Software Test (AST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AST58925.2023.00023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
The continuous Integration (CI) process runs a large set of automated test cases to verify software builds. The testing phase in the CI systems has timing constraints to ensure software quality without significantly delaying the CI builds. Therefore, CI requires efficient testing techniques such as Test Case Prioritization (TCP) to run faulty test cases with priority. Recent research studies on TCP utilize different Machine Learning (ML) methods to adopt the dynamic and complex nature of CI. However, the performance of ML for TCP may decrease for a low volume of data and less failure rate, whereas using existing data with similar patterns from other domains can be valuable. We formulate this as a transfer learning (TL) problem. TL has proven to be beneficial for many real-world applications where source domains have plenty of data, but the target domains have a scarcity of it. Therefore, this research investigates leveraging the benefit of transfer learning for test case prioritization (TCP). However, only some industrial CI datasets are publicly available due to data privacy protection regulations. In such cases, model-based transfer learning is a potential solution to share knowledge among different projects without revealing data to other stakeholders. This paper applies TransBoost, a tree-kernel-based TL algorithm, to evaluate the TL approach for 24 study subjects and identify potential source datasets.