使用概率模型来预测bug修复

Mauricio Soto, Claire Le Goues
{"title":"使用概率模型来预测bug修复","authors":"Mauricio Soto, Claire Le Goues","doi":"10.1109/SANER.2018.8330211","DOIUrl":null,"url":null,"abstract":"Automatic Software Repair (APR) has significant potential to reduce software maintenance costs by reducing the human effort required to localize and fix bugs. State-of-the-art generate-and-validate APR techniques select between and instantiate various mutation operators to construct candidate patches, informed largely by heuristic probability distributions. This may reduce effectiveness in terms of both efficiency and output quality. In practice, human developers have many options in terms of how to edit code to fix bugs, some of which are far more common than others (e.g., deleting a line of code is more common than adding a new class). We mined the most recent 100 bug-fixing commits from each of the 500 most popular Java projects in GitHub (the largest dataset to date) to create a probabilistic model describing edit distributions. We categorize, compare and evaluate the different mutation operators used in state-of-the-art approaches. We find that a probabilistic modelbased APR approach patches bugs more quickly in the majority of bugs studied, and that the resulting patches are of higher quality than those produced by previous approaches. Finally, we mine association rules for multi-edit source code changes, an understudied but important problem. We validate the association rules by analyzing how much of our corpus can be built from them. Our evaluation indicates that 84.6% of the multi-edit patches from the corpus can be built from the association rules, while maintaining 90% confidence.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"40 1","pages":"221-231"},"PeriodicalIF":0.0000,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"36","resultStr":"{\"title\":\"Using a probabilistic model to predict bug fixes\",\"authors\":\"Mauricio Soto, Claire Le Goues\",\"doi\":\"10.1109/SANER.2018.8330211\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Automatic Software Repair (APR) has significant potential to reduce software maintenance costs by reducing the human effort required to localize and fix bugs. State-of-the-art generate-and-validate APR techniques select between and instantiate various mutation operators to construct candidate patches, informed largely by heuristic probability distributions. This may reduce effectiveness in terms of both efficiency and output quality. In practice, human developers have many options in terms of how to edit code to fix bugs, some of which are far more common than others (e.g., deleting a line of code is more common than adding a new class). We mined the most recent 100 bug-fixing commits from each of the 500 most popular Java projects in GitHub (the largest dataset to date) to create a probabilistic model describing edit distributions. We categorize, compare and evaluate the different mutation operators used in state-of-the-art approaches. We find that a probabilistic modelbased APR approach patches bugs more quickly in the majority of bugs studied, and that the resulting patches are of higher quality than those produced by previous approaches. Finally, we mine association rules for multi-edit source code changes, an understudied but important problem. We validate the association rules by analyzing how much of our corpus can be built from them. Our evaluation indicates that 84.6% of the multi-edit patches from the corpus can be built from the association rules, while maintaining 90% confidence.\",\"PeriodicalId\":6602,\"journal\":{\"name\":\"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)\",\"volume\":\"40 1\",\"pages\":\"221-231\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-03-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"36\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SANER.2018.8330211\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SANER.2018.8330211","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 36

摘要

自动软件修复(Automatic Software Repair, APR)具有显著的潜力,可以通过减少本地化和修复错误所需的人力来降低软件维护成本。最先进的生成和验证APR技术选择并实例化各种突变操作符来构建候选补丁,主要由启发式概率分布提供信息。这可能会降低效率和产出质量方面的有效性。在实践中,人类开发人员在如何编辑代码以修复错误方面有许多选择,其中一些比其他更常见(例如,删除一行代码比添加一个新类更常见)。我们从GitHub(迄今为止最大的数据集)中500个最受欢迎的Java项目中挖掘了最近的100个bug修复提交,以创建描述编辑发行版的概率模型。我们对最先进的方法中使用的不同突变算子进行分类,比较和评估。我们发现基于概率模型的APR方法在研究的大多数错误中更快地修补错误,并且所得到的补丁质量比以前的方法更高。最后,我们挖掘了多编辑源代码更改的关联规则,这是一个尚未得到充分研究但很重要的问题。我们通过分析有多少语料库可以由它们构建来验证关联规则。我们的评估表明,84.6%的多编辑补丁可以从关联规则中构建,同时保持90%的置信度。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Using a probabilistic model to predict bug fixes
Automatic Software Repair (APR) has significant potential to reduce software maintenance costs by reducing the human effort required to localize and fix bugs. State-of-the-art generate-and-validate APR techniques select between and instantiate various mutation operators to construct candidate patches, informed largely by heuristic probability distributions. This may reduce effectiveness in terms of both efficiency and output quality. In practice, human developers have many options in terms of how to edit code to fix bugs, some of which are far more common than others (e.g., deleting a line of code is more common than adding a new class). We mined the most recent 100 bug-fixing commits from each of the 500 most popular Java projects in GitHub (the largest dataset to date) to create a probabilistic model describing edit distributions. We categorize, compare and evaluate the different mutation operators used in state-of-the-art approaches. We find that a probabilistic modelbased APR approach patches bugs more quickly in the majority of bugs studied, and that the resulting patches are of higher quality than those produced by previous approaches. Finally, we mine association rules for multi-edit source code changes, an understudied but important problem. We validate the association rules by analyzing how much of our corpus can be built from them. Our evaluation indicates that 84.6% of the multi-edit patches from the corpus can be built from the association rules, while maintaining 90% confidence.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Exploring the integration of user feedback in automated testing of Android applications The Statechart Workbench: Enabling scalable software event log analysis using process mining Detecting code smells using machine learning techniques: Are we there yet? Classifying stack overflow posts on API issues Re-evaluating method-level bug prediction
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1