I. Wiese, R. Ré, Igor Steinmacher, R. T. Kuroda, G. Oliva, M. Gerosa
Published in: 2015 29th Brazilian Symposium on Software Engineering, 2015-09-21. DOI: 10.1109/SBES.2015.21 (https://doi.org/10.1109/SBES.2015.21)
Predicting Change Propagation from Repository Information
Change propagation occurs when a change in an artifact leads to changes in other artifacts. Previous research has used the frequency of past changes between artifacts and different types of artifact coupling to build prediction models of change propagation. To improve prediction accuracy, we explored the combination of different data from software development repositories, such as change requests, communication data, and artifact modifications. This information can capture different dimensions of software development, which can lead to improvements in the accuracy of the models. We conducted an empirical study on four open source projects, namely Cassandra, Camel, Hadoop, and Lucene. Classifiers were constructed for each pair of artifacts that change together, to predict whether change propagation between the two files occurs in a given change request. The models obtained an area under the curve (AUC) of 0.849 on average. Furthermore, the sensitivity (recall) obtained is almost 4 times higher (57.06% vs. 15.70%) when our models are compared with a baseline model built using association rules. With a reduced number of false positives, the models could be used in practice to help developers during software evolution.
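To make the baseline comparison concrete, the following is a minimal sketch of how association rules over co-changed files might be mined and how recall is computed. It is not the authors' implementation: the abstract does not specify the rule-mining settings, so the function names (`mine_cochange_rules`, `recall`), the thresholds, and the toy change history are all assumptions for illustration.

```python
from collections import Counter
from itertools import permutations


def mine_cochange_rules(change_requests, min_support=2, min_confidence=0.5):
    """Return {(a, b): confidence} for rules 'a changed => b also changes'.

    change_requests: iterable of file lists, one list per change request.
    The thresholds are illustrative assumptions, not values from the paper.
    """
    file_counts = Counter()   # number of change requests touching each file
    pair_counts = Counter()   # number of change requests touching both files
    for files in change_requests:
        unique = set(files)
        file_counts.update(unique)
        pair_counts.update(permutations(sorted(unique), 2))

    rules = {}
    for (a, b), support in pair_counts.items():
        confidence = support / file_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def recall(predicted, actual):
    """Sensitivity: share of actual propagations that were predicted."""
    actual = set(actual)
    if not actual:
        return 0.0
    return len(set(predicted) & actual) / len(actual)


# Hypothetical change history: each inner list is one change request.
history = [
    ["A.java", "B.java"],
    ["A.java", "B.java", "C.java"],
    ["A.java", "C.java"],
    ["B.java"],
]
rules = mine_cochange_rules(history)

# Ratio behind the "almost 4 times higher" recall claim in the abstract.
recall_ratio = 57.06 / 15.70
```

With the toy history above, the rule `C.java => A.java` has confidence 1.0 because every change request touching `C.java` also touched `A.java`, while the pair `B.java`/`C.java` is dropped for appearing together only once. The reported recall figures give a ratio of roughly 3.6, which the abstract rounds to "almost 4 times".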