An Empirical Comparison of Compiler Testing Techniques
Junjie Chen, Wenxiang Hu, Dan Hao, Yingfei Xiong, Hongyu Zhang, Lu Zhang, Bing Xie
2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE), pages 180-190. DOI: 10.1145/2884781.2884878
Compilers, as some of the most important infrastructure of today's digital world, are expected to be trustworthy. Various testing techniques have been developed to test compilers automatically. However, it has so far been unknown how these techniques compare with one another in terms of testing effectiveness, that is, how many bugs a technique can find within a time limit. In this paper, we conduct a systematic and comprehensive empirical comparison of three compiler testing techniques: Randomized Differential Testing (RDT), a variant of RDT called Different Optimization Levels (DOL), and Equivalence Modulo Inputs (EMI). Our results show that DOL is more effective at detecting bugs related to optimization, whereas RDT is more effective at detecting other types of bugs, and that the three techniques can complement each other to a certain degree. Furthermore, to understand why their effectiveness differs, we investigate three factors that influence the effectiveness of compiler testing: efficiency, the strength of test oracles, and the effectiveness of the generated test programs. The results indicate that all three factors are statistically significant and that efficiency has the most significant impact.
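To make the compared test oracles concrete, below is a minimal Python sketch of RDT and DOL harnesses. It is an illustration under stated assumptions, not the authors' implementation: the compiler names (gcc, clang), the optimization flags, and the premise that a random program generator such as Csmith has already written a closed, self-checking C program to test.c are all hypothetical.

import subprocess

TIMEOUT = 10  # seconds; randomly generated programs may loop forever

def compile_and_run(compiler, opt, source):
    """Compile `source` at the given optimization level and return the
    resulting program's stdout, or None if execution times out."""
    binary = f"./bin_{compiler}_{opt}"
    subprocess.run([compiler, f"-{opt}", source, "-o", binary], check=True)
    try:
        return subprocess.run([binary], capture_output=True,
                              timeout=TIMEOUT).stdout
    except subprocess.TimeoutExpired:
        return None

def rdt_reveals_bug(source):
    """Randomized Differential Testing (RDT): compile the same program
    with two independent compilers; divergent output implicates one of
    them (assuming the generated program avoids undefined behavior)."""
    return compile_and_run("gcc", "O2", source) != compile_and_run("clang", "O2", source)

def dol_reveals_bug(source):
    """Different Optimization Levels (DOL): one compiler, two levels;
    any observable divergence points specifically at the optimizer."""
    return compile_and_run("gcc", "O0", source) != compile_and_run("gcc", "O3", source)

if __name__ == "__main__":
    for check in (rdt_reveals_bug, dol_reveals_bug):
        if check("test.c"):  # hypothetical generator output
            print(f"potential compiler bug found by {check.__name__}")

EMI, by contrast, keeps the compiler fixed and varies the program: it derives a variant that is semantically equivalent under a given input (for example, by removing statements that the input never executes) and flags a miscompilation whenever the original and the variant, compiled identically, disagree on that input.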