A new zhoneypot defence for deep neural networks

Zhou Zhou
{"title":"A new zhoneypot defence for deep neural networks","authors":"Zhou Zhou","doi":"10.1117/12.3031913","DOIUrl":null,"url":null,"abstract":"Deep Neural Networks (DNNs) have been found to be vulnerable to adversarial attacks. Numerous prior studies aim to address vulnerabilities in trained models, either by patching these weaknesses or by introducing measures to make it challenging or resource-intensive to compute adversarial examples exploiting them. In our research, we take a distinctive approach by exploring and developing a \"honeypot\" (DMZ-enabled) strategy for Deep Neural Networks (DNNs) to safeguard against adversarial attacks, setting our work apart from existing methods in the protection of DNN models. Unlike other methods, instead of modifying the original model, we split this high dimensional space to get DMZ (demilitarized zone). We intentionally keep weaknesses like traditional honeypot do in the classification model that allow adversaries to search for adversarial examples. The adversary's optimization algorithms gravitate towards DMZ, the examples they generate in the feature space will eventually fall into the DMZ. Then, by checking if the input example falls into the DMZ, we can pick out the adversarial example. More specifically, in this paper, we introduce DMZ in DNN classifiers and show an implementation of a DMZ-enabled defense named zHoneypot. We integrate the current SOTA (state-of-the-art) adversarial example defense methods, i.e., mitigation and detection, into a single approach zHoneypot, we show experimentally that DMZ-protected models can proper handling adversarial examples which generated by state-of-the-art white-box attacks (such as FGSM, PGD, CW, OnePixel and DeepFool) with high accuracy, and with negligible impact on normal benign model. In the end, our defense method achieved a score of 97.66 on MNIST, 86.1 on CIFAR10, and 90.94 on SVHN.","PeriodicalId":198425,"journal":{"name":"Other Conferences","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Other Conferences","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.3031913","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Deep Neural Networks (DNNs) have been found to be vulnerable to adversarial attacks. Numerous prior studies address vulnerabilities in trained models, either by patching the weaknesses or by making it computationally expensive to find adversarial examples that exploit them. In our research, we take a different approach: we explore and develop a "honeypot" (DMZ-enabled) strategy that safeguards DNNs against adversarial attacks, setting our work apart from existing methods for protecting DNN models. Instead of modifying the original model, we partition its high-dimensional feature space to carve out a DMZ (demilitarized zone). As a traditional honeypot does, we intentionally leave weaknesses in the classification model that invite adversaries to search for adversarial examples. The adversary's optimization algorithms gravitate towards the DMZ, so the examples they generate in the feature space eventually fall into it. By checking whether an input falls into the DMZ, we can then pick out adversarial examples. More specifically, in this paper we introduce the DMZ into DNN classifiers and present an implementation of a DMZ-enabled defense named zHoneypot, which integrates the two current state-of-the-art (SOTA) families of adversarial-example defense, mitigation and detection, into a single approach. We show experimentally that DMZ-protected models handle adversarial examples generated by state-of-the-art white-box attacks (FGSM, PGD, CW, OnePixel, and DeepFool) with high accuracy and with negligible impact on benign inputs. In the end, our defense method achieved a score of 97.66 on MNIST, 86.1 on CIFAR10, and 90.94 on SVHN.
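To make the detection step concrete, below is a minimal sketch of how a DMZ-membership check could work, assuming the defender records a feature-space "signature" for the planted DMZ region and flags inputs whose embeddings land near it. The abstract does not publish the paper's actual interface, so every name, threshold, and the cosine-similarity criterion here is a hypothetical illustration, not the authors' implementation.

```python
# Hypothetical sketch of a DMZ-style adversarial-example detector.
# Assumption: the defender has planted feature vectors inside the DMZ
# (e.g., embeddings from the protected model's penultimate layer) and
# tests new inputs against that region before trusting the classifier.
import numpy as np

def dmz_signature(dmz_features: np.ndarray) -> np.ndarray:
    """Centroid of the feature vectors the defender planted inside the DMZ."""
    return dmz_features.mean(axis=0)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-D feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def falls_into_dmz(feature: np.ndarray,
                   signature: np.ndarray,
                   threshold: float = 0.9) -> bool:
    """Flag an input as adversarial if its embedding is close to the DMZ
    signature. The threshold is an illustrative choice, not a paper value."""
    return cosine_similarity(feature, signature) >= threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for embeddings: planted DMZ features cluster
    # around one region; benign features are spread elsewhere.
    planted = rng.normal(loc=1.0, scale=0.1, size=(100, 64))
    sig = dmz_signature(planted)
    benign = rng.normal(loc=0.0, scale=1.0, size=64)
    suspicious = sig + rng.normal(scale=0.05, size=64)
    print(falls_into_dmz(benign, sig))      # expected: False
    print(falls_into_dmz(suspicious, sig))  # expected: True
```

In this reading, an attacker's optimizer that follows the planted weaknesses produces examples whose embeddings drift toward the signature, so a simple membership test separates them from benign inputs without touching the original model's weights.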