Hunting bugs: Towards an automated approach to identifying which change caused a bug through regression testing

IF 3.5 2区计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Empirical Software Engineering Pub Date : 2024-05-04 DOI:10.1007/s10664-024-10479-z

Michel Maes-Bermejo, Alexander Serebrenik, Micael Gallego, Francisco Gortázar, Gregorio Robles, Jesús María González Barahona

{"title":"Hunting bugs: Towards an automated approach to identifying which change caused a bug through regression testing","authors":"Michel Maes-Bermejo, Alexander Serebrenik, Micael Gallego, Francisco Gortázar, Gregorio Robles, Jesús María González Barahona","doi":"10.1007/s10664-024-10479-z","DOIUrl":null,"url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Context</h3><p>Finding code changes that introduced bugs is important both for practitioners and researchers, but doing it precisely is a manual, effort-intensive process. The <i>perfect test</i> method is a theoretical construct aimed at detecting Bug-Introducing Changes (BIC) through a theoretical <i>perfect test</i>. This <i>perfect test</i> always fails if the bug is present, and passes otherwise.</p><h3 data-test=\"abstract-sub-heading\">Objective</h3><p>To explore a possible automatic operationalization of the <i>perfect test</i> method.</p><h3 data-test=\"abstract-sub-heading\">Method</h3><p>To use regression tests as substitutes for the <i>perfect test</i>. For this, we transplant the regression tests to past snapshots of the code, and use them to identify the BIC, on a well-known collection of bugs from the Defects4J dataset.</p><h3 data-test=\"abstract-sub-heading\">Results</h3><p>From 809 bugs in the dataset, when running our operationalization of the perfect test method, for 95 of them the BIC was identified precisely and in the remaining 4 cases, a list of candidates including the BIC was provided.</p><h3 data-test=\"abstract-sub-heading\">Conclusions</h3><p>We demonstrate that the operationalization of the <i>perfect test</i> method through regression tests is feasible and can be completely automated in practice when tests can be transplanted and run in past snapshots of the code. Given that implementing regression tests when a bug is fixed is considered a good practice, when developers follow it, they can detect effortlessly bug-introducing changes by using our operationalization of the <i>perfect test</i> method.</p>","PeriodicalId":11525,"journal":{"name":"Empirical Software Engineering","volume":"1 1","pages":""},"PeriodicalIF":3.5000,"publicationDate":"2024-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Empirical Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10664-024-10479-z","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}

引用次数: 0

Abstract

Context

Finding code changes that introduced bugs is important both for practitioners and researchers, but doing it precisely is a manual, effort-intensive process. The perfect test method is a theoretical construct aimed at detecting Bug-Introducing Changes (BIC) through a theoretical perfect test. This perfect test always fails if the bug is present, and passes otherwise.

Objective

To explore a possible automatic operationalization of the perfect test method.

Method

To use regression tests as substitutes for the perfect test. For this, we transplant the regression tests to past snapshots of the code, and use them to identify the BIC, on a well-known collection of bugs from the Defects4J dataset.

Results

From 809 bugs in the dataset, when running our operationalization of the perfect test method, for 95 of them the BIC was identified precisely and in the remaining 4 cases, a list of candidates including the BIC was provided.

Conclusions

We demonstrate that the operationalization of the perfect test method through regression tests is feasible and can be completely automated in practice when tests can be transplanted and run in past snapshots of the code. Given that implementing regression tests when a bug is fixed is considered a good practice, when developers follow it, they can detect effortlessly bug-introducing changes by using our operationalization of the perfect test method.

Abstract Image

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

寻找错误：通过回归测试自动识别导致错误的变更

背景寻找引入错误的代码变更对于从业人员和研究人员来说都很重要，但要精确地完成这项工作却是一个需要大量人力和精力的过程。完美测试法是一种理论构造，旨在通过理论上的完美测试来检测引入错误的变更（BIC）。目标探索完美测试法的自动操作方法。方法使用回归测试来替代完美测试。为此，我们将回归测试移植到过去的代码快照中，并在 Defects4J 数据集中的著名错误集合上使用它们来识别 BIC。结果从数据集中的 809 个错误中，当运行我们的完美测试方法操作化时，其中 95 个错误的 BIC 被精确识别，在其余 4 个案例中，提供了包括 BIC 在内的候选列表。结论我们证明，通过回归测试实现完美测试方法的可操作性是可行的，而且在实践中，当测试可以移植到过去的代码快照中运行时，完全可以实现自动化。鉴于在修复错误时实施回归测试被认为是一种良好的做法，因此当开发人员遵循这种做法时，他们可以通过使用我们的完美测试方法的操作化，毫不费力地检测到引入错误的更改。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Empirical Software Engineering 工程技术-计算机：软件工程

CiteScore

8.50

自引率

12.20%

发文量

169

审稿时长

>12 weeks

期刊介绍： Empirical Software Engineering provides a forum for applied software engineering research with a strong empirical component, and a venue for publishing empirical results relevant to both researchers and practitioners. Empirical studies presented here usually involve the collection and analysis of data and experience that can be used to characterize, evaluate and reveal relationships between software development deliverables, practices, and technologies. Over time, it is expected that such empirical results will form a body of knowledge leading to widely accepted and well-formed theories. The journal also offers industrial experience reports detailing the application of software technologies - processes, methods, or tools - and their effectiveness in industrial settings. Empirical Software Engineering promotes the publication of industry-relevant research, to address the significant gap between research and practice.

期刊最新文献

The effect of data complexity on classifier performance. Reinforcement learning for online testing of autonomous driving systems: a replication and extension study. An empirical study on developers’ shared conversations with ChatGPT in GitHub pull requests and issues Quality issues in machine learning software systems An empirical study of token-based micro commits