{"title":"Generating Targeted Adversarial Attacks and Assessing their Effectiveness in Fooling Deep Neural Networks","authors":"Shivangi Gajjar, Avik Hati, Shruti Bhilare, Srimanta Mandal","doi":"10.1109/SPCOM55316.2022.9840784","DOIUrl":null,"url":null,"abstract":"Deep neural network (DNN) models have gained popularity for most image classification problems. However, DNNs also have numerous vulnerable areas. These vulnerabilities can be exploited by an adversary to execute a successful adversarial attack, which is an algorithm to generate perturbed inputs that can fool a well-trained DNN. Among various existing adversarial attacks, DeepFool, a white-box untargeted attack is considered as one of the most reliable algorithms to compute adversarial perturbations. However, in some scenarios such as person recognition, adversary might want to carry out a targeted attack such that the input gets misclassified in a specific target class. Moreover, studies show that defense against a targeted attack is tougher than an untargeted one. Hence, generating a targeted adversarial example is desirable from an attacker’s perspective. In this paper, we propose ‘Targeted DeepFool’, which is based on computing a minimal amount of perturbation required to reach the target hyperplane. The proposed algorithm produces minimal amount of distortion for conventional image datasets: MNIST and CIFAR10. Further, Targeted DeepFool shows excellent performance in terms of adversarial success rate.","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"251 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPCOM55316.2022.9840784","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Deep neural network (DNN) models have become the standard choice for most image classification problems. However, DNNs also have numerous vulnerabilities. An adversary can exploit these vulnerabilities to execute a successful adversarial attack, i.e., an algorithm that generates perturbed inputs capable of fooling a well-trained DNN. Among existing adversarial attacks, DeepFool, a white-box untargeted attack, is considered one of the most reliable algorithms for computing adversarial perturbations. However, in some scenarios, such as person recognition, an adversary might want to carry out a targeted attack so that the input is misclassified into a specific target class. Moreover, studies show that defending against a targeted attack is harder than defending against an untargeted one. Hence, generating targeted adversarial examples is desirable from an attacker's perspective. In this paper, we propose 'Targeted DeepFool', which computes the minimal perturbation required to reach the hyperplane of the target class. The proposed algorithm produces a minimal amount of distortion on conventional image datasets, MNIST and CIFAR10, and Targeted DeepFool achieves an excellent adversarial success rate.
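To illustrate the core idea described in the abstract, the following is a minimal sketch of a targeted DeepFool-style attack under a linear approximation of the classifier around the current input. It is not the paper's exact algorithm: the function name, the `overshoot` parameter, the iteration budget, and the single-image assumption are illustrative choices, not details from the paper.

```python
import torch

def targeted_deepfool(model, x, target, max_iter=50, overshoot=0.02):
    """Sketch: iteratively push a single input x toward the decision
    hyperplane of class `target` using the minimal L2 step suggested by
    a first-order (linear) approximation of the classifier."""
    x_adv = x.clone().detach()
    for _ in range(max_iter):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        pred = logits.argmax(dim=1).item()      # assumes batch size 1
        if pred == target:                      # already classified as target
            break
        # Gradients of the current-class and target-class logits w.r.t. the input
        grad_pred = torch.autograd.grad(logits[0, pred], x_adv, retain_graph=True)[0]
        grad_tgt = torch.autograd.grad(logits[0, target], x_adv)[0]
        w = grad_tgt - grad_pred                # normal of the linearized boundary
        f = (logits[0, target] - logits[0, pred]).detach()
        # Minimal perturbation (in the L2 sense) to reach the target hyperplane
        r = (abs(f) / (w.norm() ** 2 + 1e-8)) * w
        x_adv = (x_adv + (1 + overshoot) * r).detach()
    return x_adv
```

The small `overshoot` factor pushes the input slightly past the estimated boundary, since the linear approximation typically underestimates the distance to the true decision surface.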