{"title":"针对深度神经网络的新型 \"zhoneypot \"防御系统","authors":"Zhou Zhou","doi":"10.1117/12.3031913","DOIUrl":null,"url":null,"abstract":"Deep Neural Networks (DNNs) have been found to be vulnerable to adversarial attacks. Numerous prior studies aim to address vulnerabilities in trained models, either by patching these weaknesses or by introducing measures to make it challenging or resource-intensive to compute adversarial examples exploiting them. In our research, we take a distinctive approach by exploring and developing a \"honeypot\" (DMZ-enabled) strategy for Deep Neural Networks (DNNs) to safeguard against adversarial attacks, setting our work apart from existing methods in the protection of DNN models. Unlike other methods, instead of modifying the original model, we split this high dimensional space to get DMZ (demilitarized zone). We intentionally keep weaknesses like traditional honeypot do in the classification model that allow adversaries to search for adversarial examples. The adversary's optimization algorithms gravitate towards DMZ, the examples they generate in the feature space will eventually fall into the DMZ. Then, by checking if the input example falls into the DMZ, we can pick out the adversarial example. More specifically, in this paper, we introduce DMZ in DNN classifiers and show an implementation of a DMZ-enabled defense named zHoneypot. We integrate the current SOTA (state-of-the-art) adversarial example defense methods, i.e., mitigation and detection, into a single approach zHoneypot, we show experimentally that DMZ-protected models can proper handling adversarial examples which generated by state-of-the-art white-box attacks (such as FGSM, PGD, CW, OnePixel and DeepFool) with high accuracy, and with negligible impact on normal benign model. In the end, our defense method achieved a score of 97.66 on MNIST, 86.1 on CIFAR10, and 90.94 on SVHN.","PeriodicalId":198425,"journal":{"name":"Other Conferences","volume":"346 13‐15","pages":"1317508 - 1317508-6"},"PeriodicalIF":0.0000,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A new zhoneypot defence for deep neural networks\",\"authors\":\"Zhou Zhou\",\"doi\":\"10.1117/12.3031913\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep Neural Networks (DNNs) have been found to be vulnerable to adversarial attacks. Numerous prior studies aim to address vulnerabilities in trained models, either by patching these weaknesses or by introducing measures to make it challenging or resource-intensive to compute adversarial examples exploiting them. In our research, we take a distinctive approach by exploring and developing a \\\"honeypot\\\" (DMZ-enabled) strategy for Deep Neural Networks (DNNs) to safeguard against adversarial attacks, setting our work apart from existing methods in the protection of DNN models. Unlike other methods, instead of modifying the original model, we split this high dimensional space to get DMZ (demilitarized zone). We intentionally keep weaknesses like traditional honeypot do in the classification model that allow adversaries to search for adversarial examples. The adversary's optimization algorithms gravitate towards DMZ, the examples they generate in the feature space will eventually fall into the DMZ. Then, by checking if the input example falls into the DMZ, we can pick out the adversarial example. 
More specifically, in this paper, we introduce DMZ in DNN classifiers and show an implementation of a DMZ-enabled defense named zHoneypot. We integrate the current SOTA (state-of-the-art) adversarial example defense methods, i.e., mitigation and detection, into a single approach zHoneypot, we show experimentally that DMZ-protected models can proper handling adversarial examples which generated by state-of-the-art white-box attacks (such as FGSM, PGD, CW, OnePixel and DeepFool) with high accuracy, and with negligible impact on normal benign model. In the end, our defense method achieved a score of 97.66 on MNIST, 86.1 on CIFAR10, and 90.94 on SVHN.\",\"PeriodicalId\":198425,\"journal\":{\"name\":\"Other Conferences\",\"volume\":\"346 13‐15\",\"pages\":\"1317508 - 1317508-6\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Other Conferences\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1117/12.3031913\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Other Conferences","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.3031913","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Deep Neural Networks (DNNs) have been found to be vulnerable to adversarial attacks. Numerous prior studies aim to address vulnerabilities in trained models, either by patching these weaknesses or by introducing measures to make it challenging or resource-intensive to compute adversarial examples exploiting them. In our research, we take a distinctive approach by exploring and developing a "honeypot" (DMZ-enabled) strategy for Deep Neural Networks (DNNs) to safeguard against adversarial attacks, setting our work apart from existing methods in the protection of DNN models. Unlike other methods, instead of modifying the original model, we split this high dimensional space to get DMZ (demilitarized zone). We intentionally keep weaknesses like traditional honeypot do in the classification model that allow adversaries to search for adversarial examples. The adversary's optimization algorithms gravitate towards DMZ, the examples they generate in the feature space will eventually fall into the DMZ. Then, by checking if the input example falls into the DMZ, we can pick out the adversarial example. More specifically, in this paper, we introduce DMZ in DNN classifiers and show an implementation of a DMZ-enabled defense named zHoneypot. We integrate the current SOTA (state-of-the-art) adversarial example defense methods, i.e., mitigation and detection, into a single approach zHoneypot, we show experimentally that DMZ-protected models can proper handling adversarial examples which generated by state-of-the-art white-box attacks (such as FGSM, PGD, CW, OnePixel and DeepFool) with high accuracy, and with negligible impact on normal benign model. In the end, our defense method achieved a score of 97.66 on MNIST, 86.1 on CIFAR10, and 90.94 on SVHN.
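The abstract describes the DMZ check only at a high level. The following minimal PyTorch sketch is one possible reading of that idea, not the authors' released implementation: the classifier's output space is assumed to be split into real classes plus extra "DMZ" logits, and an input whose probability mass concentrates on the DMZ region is flagged as adversarial. All names, sizes, and the threshold below are illustrative assumptions.

```python
# Minimal sketch (assumption, not the paper's code) of a DMZ-style check:
# the model exposes extra "DMZ" logits alongside the real classes, and
# inputs whose probability mass lands in the DMZ are rejected as adversarial.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_REAL_CLASSES = 10   # e.g. MNIST / CIFAR10 / SVHN all have 10 classes
NUM_DMZ_CLASSES = 2     # hypothetical size of the demilitarized zone
DMZ_THRESHOLD = 0.5     # hypothetical detection threshold


class DMZClassifier(nn.Module):
    """Toy CNN whose output space covers real classes plus DMZ classes."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Logits for both the legitimate classes and the DMZ region.
        self.head = nn.Linear(32, NUM_REAL_CLASSES + NUM_DMZ_CLASSES)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))


def classify_or_reject(model, x):
    """Return (labels, is_adversarial); reject inputs that fall into the DMZ."""
    probs = F.softmax(model(x), dim=1)
    dmz_mass = probs[:, NUM_REAL_CLASSES:].sum(dim=1)   # mass inside the DMZ
    labels = probs[:, :NUM_REAL_CLASSES].argmax(dim=1)  # prediction over real classes
    return labels, dmz_mass > DMZ_THRESHOLD


if __name__ == "__main__":
    model = DMZClassifier().eval()
    x = torch.randn(4, 1, 28, 28)  # stand-in for MNIST-sized inputs
    labels, flagged = classify_or_reject(model, x)
    print(labels.tolist(), flagged.tolist())
```

Under this reading, benign inputs are classified as usual from the real-class logits, while an attacker's gradient-based search (e.g. FGSM or PGD against the full output space) is drawn toward the deliberately retained weaknesses, so the resulting examples accumulate probability mass in the DMZ and are filtered out at inference time.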