An Empirical Study on the Cross-Project Predictability of Continuous Integration Outcomes

2017 14th Web Information Systems and Applications Conference (WISA) Pub Date : 2017-11-01 DOI:10.1109/WISA.2017.53

Jing Xia, Yanhui Li, Chuanqi Wang


Citations: 12

Abstract

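Build prediction can reduce the latency between continuous integration outcomes and the corresponding decisions, improving the efficiency of development teams. Current build prediction is generally within-project, which makes it unavailable for projects without enough build data. Cross-project prediction is the state-of-the-art technique for addressing this lack of training data by importing data from other projects. However, no previous study has focused on cross-project build prediction or checked its performance on real-world projects. This paper presents an empirical study of cross-project build prediction performance on 126 open-source projects under 6 common classifiers. To select training sets for cross-project prediction, we introduce two widely used data selection methods: the Burak filter, which works at the build level, and the Bellwether strategy, which works at the project level. Our experiments lead to the following observations. First, comparing the two methods, project-level selection (Bellwether strategy) performs better than build-level selection (Burak filter). Second, prediction results can be improved by clustering the 126 studied projects into several smaller communities of about 20-40 projects each. Third, among the 6 classifiers used, the decision tree classifier performs best. Finally, by computing the optimal prediction results, we conclude that current selection methods still need to be improved to get close to the optimal prediction in cross-project build prediction.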
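The two selection methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors, the 1-nearest-neighbour scorer `one_nn_accuracy`, the toy projects `A`, `B`, `C`, and the default `k = 10` are all assumptions made for illustration. The Burak filter keeps, for every build of the target project, its k nearest builds from the pool of other projects. The Bellwether strategy picks the single project whose data predicts all the other projects best.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# One build: (feature vector, label), where 1 = failed build and 0 = passed build.
Row = Tuple[Sequence[float], int]


def burak_filter(pool: List[Row], target: List[Sequence[float]], k: int = 10) -> List[Row]:
    """Build-level selection: keep the k nearest pool builds of each target build."""
    chosen = set()
    for t in target:
        # Rank pool builds by Euclidean distance to this (unlabeled) target build.
        ranked = sorted(range(len(pool)), key=lambda i: math.dist(pool[i][0], t))
        chosen.update(ranked[:k])
    # Each pool build appears at most once, in its original order.
    return [pool[i] for i in sorted(chosen)]


def one_nn_accuracy(train: List[Row], test: List[Row]) -> float:
    """Hypothetical scorer: accuracy of a 1-nearest-neighbour classifier."""
    hits = 0
    for x, y in test:
        nearest = min(train, key=lambda r: math.dist(r[0], x))
        hits += nearest[1] == y
    return hits / len(test)


def bellwether(projects: Dict[str, List[Row]],
               score: Callable[[List[Row], List[Row]], float]) -> str:
    """Project-level selection: the project whose data best predicts all others."""
    def mean_score(src: str) -> float:
        others = [name for name in projects if name != src]
        return sum(score(projects[src], projects[o]) for o in others) / len(others)
    return max(projects, key=mean_score)


# Toy data (invented): a pool of builds from other projects and two target builds.
pool = [((0.0, 0.0), 0), ((0.1, 0.0), 0), ((5.0, 5.0), 1), ((9.0, 9.0), 1)]
target = [(0.0, 0.1), (5.0, 5.1)]
selected = burak_filter(pool, target, k=1)

# Toy community of three projects; only A contains both passed and failed builds.
projects = {
    "A": [((0.0,), 0), ((10.0,), 1)],
    "B": [((1.0,), 0), ((2.0,), 0)],
    "C": [((9.0,), 1), ((8.0,), 1)],
}
best = bellwether(projects, one_nn_accuracy)
```

In this sketch the Burak filter discards pool builds that resemble no target build, whereas the Bellwether strategy keeps one whole project as the training source. Any classifier and performance measure could stand in for `one_nn_accuracy`.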


