An incremental update framework for efficient retrieval from software libraries for bug localization

Shivani Rao, Henry Medeiros, A. Kak
{"title":"An incremental update framework for efficient retrieval from software libraries for bug localization","authors":"Shivani Rao, Henry Medeiros, A. Kak","doi":"10.1109/WCRE.2013.6671281","DOIUrl":null,"url":null,"abstract":"Information Retrieval (IR) based bug localization techniques use a bug reports to query a software repository to retrieve relevant source files. These techniques index the source files in the software repository and train a model which is then queried for retrieval purposes. Much of the current research is focused on improving the retrieval effectiveness of these methods. However, little consideration has been given to the efficiency of such approaches for software repositories that are constantly evolving. As the software repository evolves, the index creation and model learning have to be repeated to ensure accuracy of retrieval for each new bug. In doing so, the query latency may be unreasonably high, and also, re-computing the index and the model for files that did not change is computationally redundant. We propose an incremental update framework to continuously update the index and the model using the changes made at each commit. We demonstrate that the same retrieval accuracy can be achieved but with a fraction of the time needed by current approaches. Our results are based on two basic IR modeling techniques - Vector Space Model (VSM) and Smoothed Unigram Model (SUM). The dataset we used in our validation experiments was created by tracking commit history of AspectJ and JodaTime software libraries over a span of 10 years.","PeriodicalId":275092,"journal":{"name":"2013 20th Working Conference on Reverse Engineering (WCRE)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 20th Working Conference on Reverse Engineering (WCRE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WCRE.2013.6671281","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

Information Retrieval (IR) based bug localization techniques use a bug reports to query a software repository to retrieve relevant source files. These techniques index the source files in the software repository and train a model which is then queried for retrieval purposes. Much of the current research is focused on improving the retrieval effectiveness of these methods. However, little consideration has been given to the efficiency of such approaches for software repositories that are constantly evolving. As the software repository evolves, the index creation and model learning have to be repeated to ensure accuracy of retrieval for each new bug. In doing so, the query latency may be unreasonably high, and also, re-computing the index and the model for files that did not change is computationally redundant. We propose an incremental update framework to continuously update the index and the model using the changes made at each commit. We demonstrate that the same retrieval accuracy can be achieved but with a fraction of the time needed by current approaches. Our results are based on two basic IR modeling techniques - Vector Space Model (VSM) and Smoothed Unigram Model (SUM). The dataset we used in our validation experiments was created by tracking commit history of AspectJ and JodaTime software libraries over a span of 10 years.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
一个增量更新框架,用于从软件库中有效检索bug定位
基于信息检索(IR)的缺陷定位技术使用缺陷报告来查询软件存储库以检索相关的源文件。这些技术索引软件存储库中的源文件,并训练一个模型,然后为检索目的查询该模型。目前的研究主要集中在提高这些方法的检索效率上。然而,对于不断发展的软件存储库,很少考虑这种方法的效率。随着软件存储库的发展,必须重复索引创建和模型学习,以确保检索每个新错误的准确性。这样做,查询延迟可能会高得不合理,而且,为未更改的文件重新计算索引和模型在计算上是冗余的。我们提出了一个增量更新框架,使用每次提交时所做的更改来持续更新索引和模型。我们证明了相同的检索精度可以实现,但与目前的方法所需的时间的一小部分。我们的研究结果是基于两种基本的红外建模技术——向量空间模型(VSM)和平滑单格图模型(SUM)。我们在验证实验中使用的数据集是通过跟踪AspectJ和JodaTime软件库在10年内的提交历史创建的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
An IDE-based context-aware meta search engine Do developers care about code smells? An exploratory survey Automated library recommendation Circe: A grammar-based oracle for testing Cross-site scripting in web applications Extracting business rules from COBOL: A model-based framework
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1