通过深度学习实现编译器模糊测试

Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis Pub Date : 2018-07-12 DOI:10.1145/3213846.3213848

Chris Cummins, Pavlos Petoumenos, A. Murray, Hugh Leather

{"title":"通过深度学习实现编译器模糊测试","authors":"Chris Cummins, Pavlos Petoumenos, A. Murray, Hugh Leather","doi":"10.1145/3213846.3213848","DOIUrl":null,"url":null,"abstract":"Random program generation — fuzzing — is an effective technique for discovering bugs in compilers but successful fuzzers require extensive development effort for every language supported by the compiler, and often leave parts of the language space untested. We introduce DeepSmith, a novel machine learning approach to accelerating compiler validation through the inference of generative models for compiler inputs. Our approach infers a learned model of the structure of real world code based on a large corpus of open source code. Then, it uses the model to automatically generate tens of thousands of realistic programs. Finally, we apply established differential testing methodologies on them to expose bugs in compilers. We apply our approach to the OpenCL programming language, automatically exposing bugs with little effort on our side. In 1,000 hours of automated testing of commercial and open source compilers, we discover bugs in all of them, submitting 67 bug reports. Our test cases are on average two orders of magnitude smaller than the state-of-the-art, require 3.03× less time to generate and evaluate, and expose bugs which the state-of-the-art cannot. Our random program generator, comprising only 500 lines of code, took 12 hours to train for OpenCL versus the state-of-the-art taking 9 man months to port from a generator for C and 50,000 lines of code. With 18 lines of code we extended our program generator to a second language, uncovering crashes in Solidity compilers in 12 hours of automated testing.","PeriodicalId":20542,"journal":{"name":"Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"4 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2018-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"108","resultStr":"{\"title\":\"Compiler fuzzing through deep learning\",\"authors\":\"Chris Cummins, Pavlos Petoumenos, A. Murray, Hugh Leather\",\"doi\":\"10.1145/3213846.3213848\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Random program generation — fuzzing — is an effective technique for discovering bugs in compilers but successful fuzzers require extensive development effort for every language supported by the compiler, and often leave parts of the language space untested. We introduce DeepSmith, a novel machine learning approach to accelerating compiler validation through the inference of generative models for compiler inputs. Our approach infers a learned model of the structure of real world code based on a large corpus of open source code. Then, it uses the model to automatically generate tens of thousands of realistic programs. Finally, we apply established differential testing methodologies on them to expose bugs in compilers. We apply our approach to the OpenCL programming language, automatically exposing bugs with little effort on our side. In 1,000 hours of automated testing of commercial and open source compilers, we discover bugs in all of them, submitting 67 bug reports. Our test cases are on average two orders of magnitude smaller than the state-of-the-art, require 3.03× less time to generate and evaluate, and expose bugs which the state-of-the-art cannot. Our random program generator, comprising only 500 lines of code, took 12 hours to train for OpenCL versus the state-of-the-art taking 9 man months to port from a generator for C and 50,000 lines of code. With 18 lines of code we extended our program generator to a second language, uncovering crashes in Solidity compilers in 12 hours of automated testing.\",\"PeriodicalId\":20542,\"journal\":{\"name\":\"Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis\",\"volume\":\"4 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-07-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"108\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3213846.3213848\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3213846.3213848","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 108

摘要

随机程序生成——模糊测试——是发现编译器中的bug的一种有效技术，但是成功的模糊测试需要针对编译器支持的每种语言进行大量的开发工作，并且通常会有部分语言空间未经过测试。我们介绍了DeepSmith，这是一种新的机器学习方法，通过对编译器输入的生成模型进行推理来加速编译器验证。我们的方法根据大量开源代码语料库推断出真实世界代码结构的学习模型。然后，利用该模型自动生成数以万计的逼真程序。最后，我们对它们应用已建立的差异测试方法来暴露编译器中的错误。我们将我们的方法应用于OpenCL编程语言，我们可以轻松地自动发现错误。在对商业和开源编译器进行了1000小时的自动化测试后，我们发现了所有这些编译器中的错误，提交了67个错误报告。我们的测试用例平均比当前状态小两个数量级，生成和评估所需的时间减少了3.03倍，并且暴露了当前状态无法暴露的bug。我们的随机程序生成器只包含500行代码，花了12个小时来训练OpenCL，而最先进的程序需要9个月才能从C生成器移植到50,000行代码。我们用18行代码将程序生成器扩展到第二种语言，在12小时的自动化测试中发现了Solidity编译器的崩溃。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Compiler fuzzing through deep learning

Random program generation — fuzzing — is an effective technique for discovering bugs in compilers but successful fuzzers require extensive development effort for every language supported by the compiler, and often leave parts of the language space untested. We introduce DeepSmith, a novel machine learning approach to accelerating compiler validation through the inference of generative models for compiler inputs. Our approach infers a learned model of the structure of real world code based on a large corpus of open source code. Then, it uses the model to automatically generate tens of thousands of realistic programs. Finally, we apply established differential testing methodologies on them to expose bugs in compilers. We apply our approach to the OpenCL programming language, automatically exposing bugs with little effort on our side. In 1,000 hours of automated testing of commercial and open source compilers, we discover bugs in all of them, submitting 67 bug reports. Our test cases are on average two orders of magnitude smaller than the state-of-the-art, require 3.03× less time to generate and evaluate, and expose bugs which the state-of-the-art cannot. Our random program generator, comprising only 500 lines of code, took 12 hours to train for OpenCL versus the state-of-the-art taking 9 man months to port from a generator for C and 50,000 lines of code. With 18 lines of code we extended our program generator to a second language, uncovering crashes in Solidity compilers in 12 hours of automated testing.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis

自引率

0.00%

发文量