{"title":"Gradient Correction for White-Box Adversarial Attacks.","authors":"Hongying Liu, Zhijin Ge, Zhenyu Zhou, Fanhua Shang, Yuanyuan Liu, Licheng Jiao","doi":"10.1109/TNNLS.2023.3315414","DOIUrl":null,"url":null,"abstract":"<p><p>Deep neural networks (DNNs) play key roles in various artificial intelligence applications such as image classification and object recognition. However, a growing number of studies have shown that there exist adversarial examples in DNNs, which are almost imperceptibly different from the original samples but can greatly change the output of DNNs. Recently, many white-box attack algorithms have been proposed, and most of the algorithms concentrate on how to make the best use of gradients per iteration to improve adversarial performance. In this article, we focus on the properties of the widely used activation function, rectified linear unit (ReLU), and find that there exist two phenomena (i.e., wrong blocking and over transmission) misguiding the calculation of gradients for ReLU during backpropagation. Both issues enlarge the difference between the predicted changes of the loss function from gradients and corresponding actual changes and misguide the optimized direction, which results in larger perturbations. Therefore, we propose a universal gradient correction adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient-based white-box attack algorithms such as fast gradient signed method (FGSM), iterative FGSM (I-FGSM), momentum I-FGSM (MI-FGSM), and variance tuning MI-FGSM (VMI-FGSM). Through backpropagation, our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a part of them to update the misguided gradients. Comprehensive experimental results on ImageNet and CIFAR10 demonstrate that our ADV-ReLU can be easily integrated into many state-of-the-art gradient-based white-box attack algorithms, as well as transferred to black-box attacks, to further decrease perturbations measured in the l<sub>2</sub> -norm.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.2000,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/TNNLS.2023.3315414","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Deep neural networks (DNNs) play key roles in various artificial intelligence applications such as image classification and object recognition. However, a growing number of studies have shown that adversarial examples exist for DNNs: inputs that are almost imperceptibly different from the original samples yet can drastically change a network's output. Recently, many white-box attack algorithms have been proposed, most of which concentrate on how to make the best use of the gradients computed at each iteration to improve attack performance. In this article, we focus on the properties of the widely used activation function, the rectified linear unit (ReLU), and find that two phenomena, which we call wrong blocking and over transmission, misguide the calculation of gradients through ReLU during backpropagation. Both issues enlarge the gap between the change in the loss predicted from the gradients and the actual change, and misguide the optimization direction, which results in larger perturbations. We therefore propose a universal gradient-correction method for generating adversarial examples, called ADV-ReLU, to enhance gradient-based white-box attack algorithms such as the fast gradient sign method (FGSM), iterative FGSM (I-FGSM), momentum I-FGSM (MI-FGSM), and variance-tuning MI-FGSM (VMI-FGSM). Through backpropagation, our approach computes the gradient of the loss function with respect to the network input, maps its values to scores, and selects a portion of them to update the misguided gradients. Comprehensive experimental results on ImageNet and CIFAR10 demonstrate that ADV-ReLU can be easily integrated into many state-of-the-art gradient-based white-box attack algorithms, as well as transferred to black-box attacks, to further decrease perturbations measured in the l2-norm.
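To make the setting concrete, below is a minimal sketch (assuming a PyTorch-style interface; not the authors' released code) of an iterative gradient-based white-box attack in the I-FGSM family, with a placeholder correct_gradient step standing in for ADV-ReLU-style gradient correction: the input gradient is mapped to per-element scores and a selected portion is adjusted before the sign update. The magnitude-based scoring and top-k selection used here are illustrative assumptions, not the rule from the paper.

```python
# Minimal sketch (assumed PyTorch interface; not the authors' released code) of an
# I-FGSM-style white-box attack with a placeholder gradient-correction hook.
import torch
import torch.nn.functional as F


def correct_gradient(grad: torch.Tensor, keep_frac: float = 0.9) -> torch.Tensor:
    """Hypothetical correction: score entries by magnitude, suppress the lowest-scoring ones.

    Stands in for an ADV-ReLU-like step (map gradient values to scores, adjust a selected
    part of them); the actual scoring/selection rule in the paper differs.
    """
    flat = grad.abs().flatten()
    k = max(1, int(keep_frac * flat.numel()))
    threshold = torch.topk(flat, k).values.min()
    return grad * (grad.abs() >= threshold).to(grad.dtype)


def ifgsm_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10, use_correction=True):
    """Iterative FGSM: step along the sign of the (optionally corrected) input gradient."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        if use_correction:
            grad = correct_gradient(grad)
        # Ascend the loss, then project back into the eps-ball and the valid pixel range.
        x_adv = (x_adv.detach() + alpha * grad.sign()).clamp(x - eps, x + eps).clamp(0.0, 1.0)
    return x_adv
```

In such a sketch, the effect of the correction step could be gauged by comparing (x_adv - x).norm(p=2) with use_correction set to True and False, mirroring the l2-norm perturbation comparison described in the abstract.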
Journal Introduction
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.