{"title":"Detecting Adversarial Patch Attacks through Global-local Consistency","authors":"Bo Li, Jianghe Xu, Shuang Wu, Shouhong Ding, Jilin Li, Feiyue Huang","doi":"10.1145/3475724.3483606","DOIUrl":null,"url":null,"abstract":"Recent works have well-demonstrated the threat of adversarial patch attacks to real-world vision media systems. By arbitrarily modifying pixels within a small restricted area in the image, adversarial patches can mislead neural-network-based image classifiers. In this paper, we propose a simple but very effective approach to detect adversarial patches based on an interesting observation called global-local consistency. We verify this insight and propose to use Random-Local-Ensemble (RLE) strategy to further enhance it in the detection. The proposed method is trivial to implement and can be applied to protect any image classification models. Experiments on two popular datasets show that our algorithm can accurately detect the adversarial patches while maintaining high clean accuracy. Moreover, unlike the prior detection approaches which can be easily broken by adaptive attacks, our method is proved to have high robustness when facing adaptive attacks.","PeriodicalId":279202,"journal":{"name":"Proceedings of the 1st International Workshop on Adversarial Learning for Multimedia","volume":"125 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 1st International Workshop on Adversarial Learning for Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3475724.3483606","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 8
Abstract
Recent works have demonstrated the threat that adversarial patch attacks pose to real-world vision media systems. By arbitrarily modifying pixels within a small restricted area of an image, adversarial patches can mislead neural-network-based image classifiers. In this paper, we propose a simple but highly effective approach to detecting adversarial patches, based on an observation we call global-local consistency. We verify this insight and propose a Random-Local-Ensemble (RLE) strategy to further strengthen it during detection. The proposed method is trivial to implement and can be applied to protect any image classification model. Experiments on two popular datasets show that our algorithm accurately detects adversarial patches while maintaining high clean accuracy. Moreover, unlike prior detection approaches that can easily be broken by adaptive attacks, our method is shown to remain highly robust under adaptive attacks.
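To make the global-local consistency idea concrete, below is a minimal sketch of how such a detector might look: the classifier's prediction on the full image is compared against predictions on an ensemble of random local views, and low agreement flags the image as patched. The function names, crop size, number of crops, and threshold are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of a global-local consistency check with a random local ensemble.
# Hypothetical parameters; any PyTorch image classifier can be plugged in.
import torch
import torch.nn.functional as F


def random_crop(x, crop_size):
    """Sample a random square crop from an image tensor of shape (C, H, W)."""
    _, h, w = x.shape
    top = torch.randint(0, h - crop_size + 1, (1,)).item()
    left = torch.randint(0, w - crop_size + 1, (1,)).item()
    return x[:, top:top + crop_size, left:left + crop_size]


@torch.no_grad()
def detect_patch(model, image, n_crops=16, crop_size=160,
                 input_size=224, threshold=0.5):
    """Flag an image as adversarial when local predictions disagree with the global one.

    model: any image classifier taking (N, C, H, W) inputs
    image: a single normalized image tensor of shape (C, H, W)
    returns: (is_adversarial, agreement_rate)
    """
    global_label = model(image.unsqueeze(0)).argmax(dim=1).item()

    agree = 0
    for _ in range(n_crops):
        crop = random_crop(image, crop_size)
        # Resize the local view back to the model's expected input resolution.
        crop = F.interpolate(crop.unsqueeze(0), size=input_size,
                             mode="bilinear", align_corners=False)
        local_label = model(crop).argmax(dim=1).item()
        agree += int(local_label == global_label)

    agreement_rate = agree / n_crops
    # Clean images tend to keep their label under local views, whereas a patched
    # image loses the adversarial label once the patch falls outside a crop.
    return agreement_rate < threshold, agreement_rate
```

The threshold trades off detection rate against clean accuracy; increasing the number of random crops in the ensemble reduces the variance of the agreement estimate at the cost of extra forward passes.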