Learning to rank relevant files for bug reports using domain knowledge

Xin Ye, Razvan C. Bunescu, Chang Liu
{"title":"Learning to rank relevant files for bug reports using domain knowledge","authors":"Xin Ye, Razvan C. Bunescu, Chang Liu","doi":"10.1145/2635868.2635874","DOIUrl":null,"url":null,"abstract":"When a new bug report is received, developers usually need to reproduce the bug and perform code reviews to find the cause, a process that can be tedious and time consuming. A tool for ranking all the source files of a project with respect to how likely they are to contain the cause of the bug would enable developers to narrow down their search and potentially could lead to a substantial increase in productivity. This paper introduces an adaptive ranking approach that leverages domain knowledge through functional decompositions of source code files into methods, API descriptions of library components used in the code, the bug-fixing history, and the code change history. Given a bug report, the ranking score of each source file is computed as a weighted combination of an array of features encoding domain knowledge, where the weights are trained automatically on previously solved bug reports using a learning-to-rank technique. We evaluated our system on six large scale open source Java projects, using the before-fix version of the project for every bug report. The experimental results show that the newly introduced learning-to-rank approach significantly outperforms two recent state-of-the-art methods in recommending relevant files for bug reports. In particular, our method makes correct recommendations within the top 10 ranked source files for over 70% of the bug reports in the Eclipse Platform and Tomcat projects.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"257","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2635868.2635874","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 257

Abstract

When a new bug report is received, developers usually need to reproduce the bug and perform code reviews to find the cause, a process that can be tedious and time consuming. A tool for ranking all the source files of a project with respect to how likely they are to contain the cause of the bug would enable developers to narrow down their search and potentially could lead to a substantial increase in productivity. This paper introduces an adaptive ranking approach that leverages domain knowledge through functional decompositions of source code files into methods, API descriptions of library components used in the code, the bug-fixing history, and the code change history. Given a bug report, the ranking score of each source file is computed as a weighted combination of an array of features encoding domain knowledge, where the weights are trained automatically on previously solved bug reports using a learning-to-rank technique. We evaluated our system on six large scale open source Java projects, using the before-fix version of the project for every bug report. The experimental results show that the newly introduced learning-to-rank approach significantly outperforms two recent state-of-the-art methods in recommending relevant files for bug reports. In particular, our method makes correct recommendations within the top 10 ranked source files for over 70% of the bug reports in the Eclipse Platform and Tomcat projects.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
学习使用领域知识对bug报告的相关文件进行排序
当收到新的错误报告时,开发人员通常需要重新生成错误并执行代码审查以找到原因,这是一个乏味且耗时的过程。如果有一个工具可以根据包含错误原因的可能性对项目的所有源文件进行排序,这将使开发人员能够缩小搜索范围,并可能大大提高生产力。本文介绍了一种自适应排序方法,该方法通过将源代码文件分解为方法、代码中使用的库组件的API描述、错误修复历史和代码更改历史来利用领域知识。给定一个错误报告,每个源文件的排名分数被计算为编码领域知识的特征数组的加权组合,其中权重是使用学习排序技术在先前解决的错误报告上自动训练的。我们在六个大型开源Java项目上评估了我们的系统,对每个bug报告使用修复前的项目版本。实验结果表明,新引入的学习排序方法在为bug报告推荐相关文件方面明显优于最近两种最先进的方法。特别是,我们的方法对Eclipse平台和Tomcat项目中超过70%的bug报告中排名前10位的源文件给出了正确的建议。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Omen+: a precise dynamic deadlock detector for multithreaded Java libraries Improving the software testing skills of novices during onboarding through social transparency Counterexample guided abstraction refinement of product-line behavioural models A tool suite for the model-driven software engineering of cyber-physical systems Statistical symbolic execution with informed sampling
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1