DeepHunter: a coverage-guided fuzz testing framework for deep neural networks

Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis. Pub Date: 2019-07-10. DOI: 10.1145/3293882.3330579

Xiaofei Xie, L. Ma, Felix Juefei-Xu, Minhui Xue, Hongxu Chen, Yang Liu, Jianjun Zhao, Bo Li, Jianxiong Yin, S. See


Citations: 306

Abstract

The past decade has seen the great potential of applying deep neural network (DNN) based software to safety-critical scenarios, such as autonomous driving. Like traditional software, DNNs can exhibit incorrect behaviors caused by hidden defects, leading to severe accidents and losses. In this paper, we propose DeepHunter, a coverage-guided fuzz testing framework for detecting potential defects of general-purpose DNNs. To this end, we first propose a metamorphic mutation strategy to generate new semantically preserved tests, and leverage multiple extensible coverage criteria as feedback to guide the test generation. We further propose a seed selection strategy that combines both diversity-based and recency-based seed selection. We implement and incorporate 5 existing testing criteria and 4 seed selection strategies in DeepHunter. Large-scale experiments demonstrate that (1) our metamorphic mutation strategy is useful for generating new valid tests with the same semantics as the original seed, with a validity ratio of up to 98%; (2) diversity-based seed selection generally contributes more than recency-based seed selection to boosting coverage and detecting defects; (3) DeepHunter outperforms the state of the art in coverage as well as in the quantity and diversity of defects identified; (4) guided by corner-region based criteria, DeepHunter is useful for capturing defects during DNN quantization for platform migration.
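The loop described in the abstract (mutate a seed, measure coverage, keep mutants that add coverage, flag mutants whose prediction changes) can be illustrated with a minimal sketch. This is not DeepHunter's implementation. Everything here is an assumption made for illustration: the tiny hand-written network (`W1`, `B1`, `W2`), the activation-threshold neuron coverage, the bounded random perturbation standing in for metamorphic mutation, and the `pick_seed` weighting that mixes recency with how rarely a seed has been fuzzed.

```python
import random

# Hypothetical toy "DNN": 2 inputs -> 4 ReLU hidden neurons -> 2 output scores.
W1 = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5], [-0.5, -0.5]]
B1 = [0.0, 0.0, -0.6, 0.2]
W2 = [[1.0, -1.0, 0.5, 0.0], [-1.0, 1.0, 0.0, 0.5]]


def forward(x):
    """Return (hidden activations, predicted class index) for input x."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(W1, B1)]
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in W2]
    return hidden, scores.index(max(scores))


def coverage(hidden, threshold=0.1):
    """Simplified neuron coverage: hidden neurons activated above threshold."""
    return {i for i, h in enumerate(hidden) if h > threshold}


def mutate(x, origin, rng, step=0.05, budget=0.2):
    """Perturb x slightly, clamped to an L-infinity ball around the original seed.

    The bound is an assumed proxy for "semantics preserved".
    """
    out = []
    for v, o in zip(x, origin):
        v = v + rng.uniform(-step, step)
        out.append(min(o + budget, max(o - budget, v)))
    return out


def pick_seed(queue, rng):
    """Favor recently added seeds and seeds that were fuzzed less often."""
    weights = [(i + 1) / (1 + s["fuzzed"]) for i, s in enumerate(queue)]
    return rng.choices(queue, weights=weights, k=1)[0]


def fuzz(initial_inputs, iterations=500, seed=0):
    rng = random.Random(seed)
    queue, covered, failures = [], set(), []
    for x in initial_inputs:
        hidden, label = forward(x)
        covered |= coverage(hidden)
        queue.append({"x": list(x), "origin": list(x),
                      "label": label, "fuzzed": 0})
    for _ in range(iterations):
        parent = pick_seed(queue, rng)
        parent["fuzzed"] += 1
        child = mutate(parent["x"], parent["origin"], rng)
        hidden, label = forward(child)
        if label != parent["label"]:
            failures.append(child)  # prediction flipped: candidate defect
            continue
        new = coverage(hidden) - covered
        if new:  # keep only mutants that increase coverage
            covered |= new
            queue.append({"x": child, "origin": parent["origin"],
                          "label": parent["label"], "fuzzed": 0})
    return covered, queue, failures
```

Calling `fuzz([[0.3, 0.1], [0.1, 0.3]])` returns the set of covered neurons, the final seed queue, and the inputs whose predicted class differs from that of their original seed. Tracking `origin` separately from `x` keeps chained mutations within the same perturbation budget of the original input, which is one simple way to limit semantic drift.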



