T. Chekam, Mike Papadakis, Yves Le Traon, M. Harman
{"title":"避免不可靠干净程序假设的突变、声明和分支覆盖故障揭示的实证研究","authors":"T. Chekam, Mike Papadakis, Yves Le Traon, M. Harman","doi":"10.1109/ICSE.2017.61","DOIUrl":null,"url":null,"abstract":"Many studies suggest using coverage concepts, such as branch coverage, as the starting point of testing, while others as the most prominent test quality indicator. Yet the relationship between coverage and fault-revelation remains unknown, yielding uncertainty and controversy. Most previous studies rely on the Clean Program Assumption, that a test suite will obtain similar coverage for both faulty and fixed ('clean') program versions. This assumption may appear intuitive, especially for bugs that denote small semantic deviations. However, we present evidence that the Clean Program Assumption does not always hold, thereby raising a critical threat to the validity of previous results. We then conducted a study using a robust experimental methodology that avoids this threat to validity, from which our primary finding is that strong mutation testing has the highest fault revelation of four widely-used criteria. Our findings also revealed that fault revelation starts to increase significantly only once relatively high levels of coverage are attained.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"20 1","pages":"597-608"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"128","resultStr":"{\"title\":\"An Empirical Study on Mutation, Statement and Branch Coverage Fault Revelation That Avoids the Unreliable Clean Program Assumption\",\"authors\":\"T. Chekam, Mike Papadakis, Yves Le Traon, M. Harman\",\"doi\":\"10.1109/ICSE.2017.61\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Many studies suggest using coverage concepts, such as branch coverage, as the starting point of testing, while others as the most prominent test quality indicator. Yet the relationship between coverage and fault-revelation remains unknown, yielding uncertainty and controversy. Most previous studies rely on the Clean Program Assumption, that a test suite will obtain similar coverage for both faulty and fixed ('clean') program versions. This assumption may appear intuitive, especially for bugs that denote small semantic deviations. However, we present evidence that the Clean Program Assumption does not always hold, thereby raising a critical threat to the validity of previous results. We then conducted a study using a robust experimental methodology that avoids this threat to validity, from which our primary finding is that strong mutation testing has the highest fault revelation of four widely-used criteria. Our findings also revealed that fault revelation starts to increase significantly only once relatively high levels of coverage are attained.\",\"PeriodicalId\":6505,\"journal\":{\"name\":\"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)\",\"volume\":\"20 1\",\"pages\":\"597-608\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-05-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"128\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSE.2017.61\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE.2017.61","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 128
摘要
许多研究建议使用覆盖概念,例如分支覆盖,作为测试的起点,而其他研究建议使用最突出的测试质量指标。然而,报道和断层揭露之间的关系仍然未知,产生了不确定性和争议。大多数先前的研究都依赖于Clean Program Assumption,即测试套件对于有缺陷的和固定的(“干净的”)程序版本将获得相似的覆盖率。这个假设可能看起来很直观,特别是对于表示小语义偏差的bug。然而,我们提出的证据表明,清洁计划假设并不总是成立,从而对先前结果的有效性提出了重大威胁。然后,我们使用一种强大的实验方法进行了一项研究,避免了这种对有效性的威胁,从中我们的主要发现是强突变测试在四种广泛使用的标准中具有最高的错误揭示。我们的发现还表明,只有在达到相对较高的覆盖水平时,断层显示才开始显著增加。
An Empirical Study on Mutation, Statement and Branch Coverage Fault Revelation That Avoids the Unreliable Clean Program Assumption
Many studies suggest using coverage concepts, such as branch coverage, as the starting point of testing, while others as the most prominent test quality indicator. Yet the relationship between coverage and fault-revelation remains unknown, yielding uncertainty and controversy. Most previous studies rely on the Clean Program Assumption, that a test suite will obtain similar coverage for both faulty and fixed ('clean') program versions. This assumption may appear intuitive, especially for bugs that denote small semantic deviations. However, we present evidence that the Clean Program Assumption does not always hold, thereby raising a critical threat to the validity of previous results. We then conducted a study using a robust experimental methodology that avoids this threat to validity, from which our primary finding is that strong mutation testing has the highest fault revelation of four widely-used criteria. Our findings also revealed that fault revelation starts to increase significantly only once relatively high levels of coverage are attained.