Predicting Change Propagation from Repository Information

I. Wiese, R. Ré, Igor Steinmacher, R. T. Kuroda, G. Oliva, M. Gerosa
Published in: 2015 29th Brazilian Symposium on Software Engineering, 2015-09-21
DOI: 10.1109/SBES.2015.21 (https://doi.org/10.1109/SBES.2015.21)
Citations: 1

Abstract

Change propagation occurs when a change in one artifact leads to changes in other artifacts. Previous research has used the frequency of past co-changes between artifacts and different types of artifact coupling to build prediction models of change propagation. To improve prediction accuracy, we explored combining different kinds of data from software development repositories, such as change requests, communication data, and artifact modifications. This information captures different dimensions of software development, which can improve the accuracy of the models. We conducted an empirical study on four open source projects: Cassandra, Camel, Hadoop, and Lucene. A classifier was built for each pair of artifacts that change together, to predict whether a change propagates between the two files in a given change request. The models obtained an area under the curve (AUC) of 0.849 on average. Furthermore, the sensitivity (recall) of our models is almost 4 times higher (57.06% vs. 15.70%) than that of a baseline model built using association rules. With a reduced number of false positives, the models could be used in practice to help developers during software evolution.
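The abstract does not describe how the association-rule baseline was built, so the following is only a minimal sketch of the general technique, assuming each change request is reduced to the set of files it touched. The function names, thresholds, and the toy history are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import permutations


def mine_cochange_rules(change_sets, min_support=2, min_confidence=0.5):
    """Return {(a, b): confidence} for rules 'a changed => b changes too'.

    support    = number of change requests touching both a and b
    confidence = support / number of change requests touching a
    """
    file_count = Counter()
    pair_count = Counter()
    for files in change_sets:
        unique = set(files)
        file_count.update(unique)
        # Ordered pairs, because the rule a => b differs from b => a.
        pair_count.update(permutations(sorted(unique), 2))

    rules = {}
    for (a, b), support in pair_count.items():
        confidence = support / file_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def recall(predicted, actual):
    """Sensitivity: fraction of actual propagations that were predicted."""
    return len(predicted & actual) / len(actual) if actual else 0.0


# Hypothetical history: one inner list per change request (assumed data).
history = [
    ["A.java", "B.java"],
    ["A.java", "B.java", "C.java"],
    ["A.java"],
    ["B.java", "C.java"],
]
rules = mine_cochange_rules(history)
```

With this toy history, a rule such as C.java => B.java is kept because both change requests touching C.java also touch B.java, while A.java => C.java is dropped for insufficient support. A per-pair classifier of the kind the abstract describes would replace the fixed thresholds with a model trained on additional features from change requests and communication data.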