A new zhoneypot defence for deep neural networks

Zhou Zhou
{"title":"A new zhoneypot defence for deep neural networks","authors":"Zhou Zhou","doi":"10.1117/12.3031913","DOIUrl":null,"url":null,"abstract":"Deep Neural Networks (DNNs) have been found to be vulnerable to adversarial attacks. Numerous prior studies aim to address vulnerabilities in trained models, either by patching these weaknesses or by introducing measures to make it challenging or resource-intensive to compute adversarial examples exploiting them. In our research, we take a distinctive approach by exploring and developing a \"honeypot\" (DMZ-enabled) strategy for Deep Neural Networks (DNNs) to safeguard against adversarial attacks, setting our work apart from existing methods in the protection of DNN models. Unlike other methods, instead of modifying the original model, we split this high dimensional space to get DMZ (demilitarized zone). We intentionally keep weaknesses like traditional honeypot do in the classification model that allow adversaries to search for adversarial examples. The adversary's optimization algorithms gravitate towards DMZ, the examples they generate in the feature space will eventually fall into the DMZ. Then, by checking if the input example falls into the DMZ, we can pick out the adversarial example. More specifically, in this paper, we introduce DMZ in DNN classifiers and show an implementation of a DMZ-enabled defense named zHoneypot. We integrate the current SOTA (state-of-the-art) adversarial example defense methods, i.e., mitigation and detection, into a single approach zHoneypot, we show experimentally that DMZ-protected models can proper handling adversarial examples which generated by state-of-the-art white-box attacks (such as FGSM, PGD, CW, OnePixel and DeepFool) with high accuracy, and with negligible impact on normal benign model. In the end, our defense method achieved a score of 97.66 on MNIST, 86.1 on CIFAR10, and 90.94 on SVHN.","PeriodicalId":198425,"journal":{"name":"Other Conferences","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Other Conferences","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.3031913","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Deep Neural Networks (DNNs) have been found to be vulnerable to adversarial attacks. Numerous prior studies address vulnerabilities in trained models, either by patching the weaknesses or by making it computationally expensive to find adversarial examples that exploit them. In our research, we take a different approach: we explore and develop a "honeypot" (DMZ-enabled) strategy that safeguards DNNs against adversarial attacks, setting our work apart from existing methods for protecting DNN models. Instead of modifying the original model, we partition its high-dimensional feature space to carve out a DMZ (demilitarized zone). As a traditional honeypot does, we intentionally leave weaknesses in the classification model that invite adversaries to search for adversarial examples. The adversary's optimization algorithms gravitate towards the DMZ, so the examples they generate in the feature space eventually fall into it. By checking whether an input falls into the DMZ, we can then pick out adversarial examples. More specifically, in this paper we introduce the DMZ into DNN classifiers and present an implementation of a DMZ-enabled defense named zHoneypot, which integrates the two current state-of-the-art (SOTA) families of adversarial-example defense, mitigation and detection, into a single approach. We show experimentally that DMZ-protected models handle adversarial examples generated by state-of-the-art white-box attacks (FGSM, PGD, CW, OnePixel, and DeepFool) with high accuracy and with negligible impact on benign inputs. In the end, our defense method achieved a score of 97.66 on MNIST, 86.1 on CIFAR10, and 90.94 on SVHN.
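To make the detection step concrete, below is a minimal sketch of how a DMZ-membership check could work, assuming the defender records a feature-space "signature" for the planted DMZ region and flags inputs whose embeddings land near it. The abstract does not publish the paper's actual interface, so every name, threshold, and the cosine-similarity criterion here is a hypothetical illustration, not the authors' implementation.

```python
# Hypothetical sketch of a DMZ-style adversarial-example detector.
# Assumption: the defender has planted feature vectors inside the DMZ
# (e.g., embeddings from the protected model's penultimate layer) and
# tests new inputs against that region before trusting the classifier.
import numpy as np

def dmz_signature(dmz_features: np.ndarray) -> np.ndarray:
    """Centroid of the feature vectors the defender planted inside the DMZ."""
    return dmz_features.mean(axis=0)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-D feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def falls_into_dmz(feature: np.ndarray,
                   signature: np.ndarray,
                   threshold: float = 0.9) -> bool:
    """Flag an input as adversarial if its embedding is close to the DMZ
    signature. The threshold is an illustrative choice, not a paper value."""
    return cosine_similarity(feature, signature) >= threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for embeddings: planted DMZ features cluster
    # around one region; benign features are spread elsewhere.
    planted = rng.normal(loc=1.0, scale=0.1, size=(100, 64))
    sig = dmz_signature(planted)
    benign = rng.normal(loc=0.0, scale=1.0, size=64)
    suspicious = sig + rng.normal(scale=0.05, size=64)
    print(falls_into_dmz(benign, sig))      # expected: False
    print(falls_into_dmz(suspicious, sig))  # expected: True
```

In this reading, an attacker's optimizer that follows the planted weaknesses produces examples whose embeddings drift toward the signature, so a simple membership test separates them from benign inputs without touching the original model's weights.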