(Un-)Covering Equivalent Mutants

2010 Third International Conference on Software Testing, Verification and Validation. Pub Date: 2010-04-06. DOI: 10.1109/ICST.2010.30

David Schuler, A. Zeller


Citations: 127

Abstract

Mutation testing measures the adequacy of a test suite by seeding artificial defects (mutations) into a program. If a test suite fails to detect a mutation, it may also fail to detect real defects, and hence should be improved. However, there are also mutations that keep the program semantics unchanged and thus cannot be detected by any test suite. Such equivalent mutants must be weeded out manually, which is a tedious task. In this paper, we examine whether changes in coverage can be used to detect non-equivalent mutants: if a mutant changes the coverage of a run, it is more likely to be non-equivalent. In a sample of 140 manually classified mutations of seven Java programs with 5,000 to 100,000 lines of code, we found that: (a) the problem is serious and widespread: about 45% of all undetected mutants turned out to be equivalent; (b) manual classification takes time: about 15 minutes per mutation; (c) coverage is a simple, efficient, and effective means to identify equivalent mutants, with a classification precision of 75% and a recall of 56%; and (d) coverage as an equivalence detector is superior to the state of the art, in particular violations of dynamic invariants. Our detectors have been released as part of the open source Javalanche framework; the data set is publicly available for replication and extension of experiments.
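The coverage heuristic described in the abstract can be illustrated with a minimal sketch. This is not the Javalanche implementation or its API; the data layout (per-test statement execution counts), the function names, and the choice to treat "likely non-equivalent" as the positive class when computing precision and recall are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# Assumed data layout: test name -> {statement id -> execution count}.
Coverage = Dict[str, Dict[str, int]]


def coverage_impact(original: Coverage, mutant: Coverage) -> Set[str]:
    """Return the statements whose execution count differs in any test."""
    changed: Set[str] = set()
    for test in original.keys() | mutant.keys():
        before = original.get(test, {})
        after = mutant.get(test, {})
        for stmt in before.keys() | after.keys():
            if before.get(stmt, 0) != after.get(stmt, 0):
                changed.add(stmt)
    return changed


def likely_non_equivalent(original: Coverage, mutant: Coverage) -> bool:
    """Heuristic: any coverage change suggests the mutant is non-equivalent."""
    return bool(coverage_impact(original, mutant))


def precision_recall(predicted: Dict[str, bool],
                     actual: Dict[str, bool]) -> Tuple[float, float]:
    """Compare heuristic verdicts with manual labels (True = non-equivalent)."""
    tp = sum(1 for m in actual if predicted[m] and actual[m])
    fp = sum(1 for m in actual if predicted[m] and not actual[m])
    fn = sum(1 for m in actual if not predicted[m] and actual[m])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical coverage data for one test and two mutants.
    original = {"t1": {"s1": 1, "s2": 2}}
    same = {"t1": {"s1": 1, "s2": 2}}
    shifted = {"t1": {"s1": 1, "s2": 0, "s3": 1}}
    print(likely_non_equivalent(original, same))
    print(sorted(coverage_impact(original, shifted)))
```

A mutant with no coverage change is not thereby shown to be equivalent; the heuristic only ranks mutants with an observable impact as more likely non-equivalent, which is why the reported precision and recall are below 100%.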


