{"title":"对EvoSuite生成的问题测试进行系统评估","authors":"Zhiyu Fan","doi":"10.1109/ICSE-Companion.2019.00068","DOIUrl":null,"url":null,"abstract":"With the rapidly growing scale of modern software, the reliability of software systems has become essential. To ease the developers' pressure of writing unit tests manually, test generation tools such as EvoSuite and Randoop were proposed. Although these approaches have been shown to be able to automatically generate tests for achieving high coverage, the generated tests may be ineffective in detecting real faults. Particularly, these automatically generated tests may suffer from several problems (we call them problematic tests): (1) incorrect oracle. (2) unexpected exception/error. (3) flaky test. We present a comprehensive study of EvoSuite in Defects4j, and performed a detailed analysis of the reasons behind these automatically generated problematic tests. Our analysis identifies 528 problematic tests: 208 (39.4%) of them are caused by incorrect oracle, 319 (60.4%) are caused by unexpected exception/error, and one flaky test.","PeriodicalId":273100,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"A Systematic Evaluation of Problematic Tests Generated by EvoSuite\",\"authors\":\"Zhiyu Fan\",\"doi\":\"10.1109/ICSE-Companion.2019.00068\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the rapidly growing scale of modern software, the reliability of software systems has become essential. To ease the developers' pressure of writing unit tests manually, test generation tools such as EvoSuite and Randoop were proposed. Although these approaches have been shown to be able to automatically generate tests for achieving high coverage, the generated tests may be ineffective in detecting real faults. Particularly, these automatically generated tests may suffer from several problems (we call them problematic tests): (1) incorrect oracle. (2) unexpected exception/error. (3) flaky test. We present a comprehensive study of EvoSuite in Defects4j, and performed a detailed analysis of the reasons behind these automatically generated problematic tests. Our analysis identifies 528 problematic tests: 208 (39.4%) of them are caused by incorrect oracle, 319 (60.4%) are caused by unexpected exception/error, and one flaky test.\",\"PeriodicalId\":273100,\"journal\":{\"name\":\"2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)\",\"volume\":\"7 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSE-Companion.2019.00068\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE-Companion.2019.00068","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Systematic Evaluation of Problematic Tests Generated by EvoSuite
With the rapidly growing scale of modern software, the reliability of software systems has become essential. To ease the developers' pressure of writing unit tests manually, test generation tools such as EvoSuite and Randoop were proposed. Although these approaches have been shown to be able to automatically generate tests for achieving high coverage, the generated tests may be ineffective in detecting real faults. Particularly, these automatically generated tests may suffer from several problems (we call them problematic tests): (1) incorrect oracle. (2) unexpected exception/error. (3) flaky test. We present a comprehensive study of EvoSuite in Defects4j, and performed a detailed analysis of the reasons behind these automatically generated problematic tests. Our analysis identifies 528 problematic tests: 208 (39.4%) of them are caused by incorrect oracle, 319 (60.4%) are caused by unexpected exception/error, and one flaky test.