Abstraction and Refinement: Towards Scalable and Exact Verification of Neural Networks

IF 6.6 · Zone 2 (Computer Science) · Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING · ACM Transactions on Software Engineering and Methodology · Pub Date: 2024-02-05 · DOI: 10.1145/3644387

Jiaxiang Liu, Yunhan Xing, Xiaomu Shi, Fu Song, Zhiwu Xu, Zhong Ming


Citations: 0

Abstract

As a new programming paradigm, deep neural networks (DNNs) are increasingly deployed in practice, but their lack of robustness hinders their application in safety-critical domains. While there are techniques for verifying DNNs with formal guarantees, they are limited in scalability and accuracy. In this paper, we present a novel counterexample-guided abstraction refinement (CEGAR) approach for scalable and exact verification of DNNs. Specifically, we propose a novel abstraction that reduces the size of DNNs by over-approximation. The result of verifying the abstract DNN is conclusive if no spurious counterexample is reported. To eliminate each spurious counterexample introduced by abstraction, we propose a novel counterexample-guided refinement that refines the abstract DNN to exclude the spurious counterexample while still over-approximating the original DNN, leading to a sound, complete, yet efficient CEGAR approach. Our approach is orthogonal to, and can be integrated with, many existing verification techniques. For demonstration, we implement our approach using two promising tools, Marabou and Planet, as the underlying verification engines, and evaluate it on widely used benchmarks for three datasets: ACAS Xu, MNIST, and CIFAR-10. The results show that our approach boosts their performance by solving more problems within the same time limit, reducing the verification time of Marabou by 13.4%–86.3% on average on almost all the verification tasks, and reducing the verification time of Planet by 8.3%–78.0% on average on all the verification tasks. Compared to the most relevant CEGAR-based approach, our approach is 11.6–26.6 times faster.
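The abstract-verify-check-refine cycle described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the paper abstracts the network itself and delegates verification to Marabou or Planet, whereas the sketch below swaps in a much simpler stand-in, namely a Lipschitz-based interval over-approximation of a one-input function with interval splitting as the refinement step. The names `toy_network` and `cegar_verify`, the weights, and the Lipschitz constant are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Optional, Tuple


def relu(z: float) -> float:
    return max(0.0, z)


def toy_network(x: float) -> float:
    """Hypothetical 1-input ReLU network with two hidden neurons (illustration only)."""
    return relu(x - 1.0) - 2.0 * relu(x - 2.0) + 0.5


def cegar_verify(
    f: Callable[[float], float],
    lipschitz: float,
    lo: float,
    hi: float,
    bound: float,
    max_iters: int = 10_000,
) -> Tuple[str, Optional[float]]:
    """Check the property f(x) <= bound for all x in [lo, hi].

    Abstraction (assumed, not from the paper): on each block [a, b] the
    maximum of f is over-approximated by f(mid) + lipschitz * (b - a) / 2,
    which is sound whenever `lipschitz` is a valid Lipschitz constant of f.
    """
    blocks: List[Tuple[float, float]] = [(lo, hi)]  # coarsest abstraction
    for _ in range(max_iters):
        # Step 1: verify the abstraction; look for a block that may violate.
        suspect = None
        for i, (a, b) in enumerate(blocks):
            mid, half = (a + b) / 2.0, (b - a) / 2.0
            if f(mid) + lipschitz * half > bound:
                suspect = i
                break
        if suspect is None:
            # No abstract counterexample: the over-approximation is safe,
            # so the concrete function is safe too.
            return "verified", None

        # Step 2: replay the abstract counterexample on the concrete function.
        a, b = blocks[suspect]
        mid = (a + b) / 2.0
        if f(mid) > bound:
            return "violated", mid  # genuine counterexample

        # Step 3: the counterexample was spurious; refine only the block
        # that produced it by splitting it in two.
        blocks[suspect:suspect + 1] = [(a, mid), (mid, b)]
    return "unknown", None  # iteration budget exhausted


if __name__ == "__main__":
    # 3.0 = |1| + |-2| is a coarse but sound Lipschitz bound for toy_network.
    print(cegar_verify(toy_network, 3.0, 0.0, 4.0, bound=2.0))
    print(cegar_verify(toy_network, 3.0, 0.0, 4.0, bound=1.0))
```

In this sketch each refinement keeps the abstraction an over-approximation, so a "verified" answer is sound, and a "violated" answer always carries a concrete witness. The `max_iters` budget is an added safeguard for the toy setting; the paper's approach is stated to be complete.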




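Source journal

ACM Transactions on Software Engineering and Methodology (Engineering & Technology, Computer Science: Software Engineering)

CiteScore: 6.30

Self-citation rate: 4.50%

Articles published: 164

Review time: >12 weeks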

Journal description: Designing and building a large, complex software system is a tremendous challenge. ACM Transactions on Software Engineering and Methodology (TOSEM) publishes papers on all aspects of that challenge: specification, design, development and maintenance. It covers tools and methodologies, languages, data structures, and algorithms. TOSEM also reports on successful efforts, noting practical lessons that can be scaled and transferred to other projects, and often looks at applications of innovative technologies. The tone is scholarly but readable; the content is worthy of study; the presentation is effective.