An Information Geometric Perspective to Adversarial Attacks and Defenses

2022 International Joint Conference on Neural Networks (IJCNN). Pub Date: 2022-07-18. DOI: 10.1109/IJCNN55064.2022.9892170

Kyle Naddeo, N. Bouaynaya, R. Shterenberg



Abstract

Deep learning models have achieved state-of-the-art accuracy on complex tasks, sometimes exceeding human-level accuracy. Yet they are vulnerable to adversarial attacks: imperceptible input perturbations that fool a model on inputs it originally classified correctly. The adversarial problem remains poorly understood and is commonly thought to be an inherent weakness of deep learning models. We argue that understanding and alleviating the adversarial phenomenon may require going beyond the Euclidean view and treating the relationship between the input and output spaces as a statistical manifold with the Fisher information as its Riemannian metric. Under this information-geometric view, the optimal attack is constructed as the direction corresponding to the largest eigenvalue of the Fisher Information Matrix, called the Fisher spectral attack. We show that an orthogonal transformation of the data alters its manifold by keeping the largest eigenvalue but changing the optimal direction of attack, thus deceiving the attacker into adopting the wrong direction. We demonstrate the defensive capabilities of the proposed orthogonal scheme against the Fisher spectral attack and the popular fast gradient sign method on standard networks, e.g., LeNet and MobileNetV2, for the benchmark data sets MNIST and CIFAR-10.
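The two linear-algebra facts the abstract relies on can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy linear softmax classifier p(y|x) = softmax(Wx + b) rather than LeNet or MobileNetV2, and it assumes the defended model simply absorbs the rotation into its first layer. The names `input_fisher`, `top_eigenpair`, `W`, `b`, `Q`, and `eps` are hypothetical and chosen for illustration only. For a softmax output, the Fisher information with respect to the logits is diag(p) - pp^T, so the Fisher Information Matrix with respect to the input is W^T (diag(p) - pp^T) W.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax of a 1-D logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def input_fisher(W, b, x):
    """Fisher Information Matrix of p(y|x) = softmax(Wx + b) w.r.t. the input x.

    Assumption: a linear softmax model, used only as a stand-in for a network.
    """
    p = softmax(W @ x + b)
    logit_fisher = np.diag(p) - np.outer(p, p)  # FIM in logit space
    return W.T @ logit_fisher @ W               # pulled back to input space


def top_eigenpair(G):
    """Largest eigenvalue of a symmetric matrix and its unit eigenvector."""
    vals, vecs = np.linalg.eigh(G)  # eigenvalues in ascending order
    return vals[-1], vecs[:, -1]


rng = np.random.default_rng(0)
d, k = 6, 3                       # input dimension, number of classes
W = rng.normal(size=(k, d))
b = rng.normal(size=k)
x = rng.normal(size=d)

# Fisher spectral attack direction on the original data representation.
G = input_fisher(W, b, x)
lam, v = top_eigenpair(G)
eps = 0.1                         # hypothetical perturbation budget
x_adv = x + eps * v               # the sign of v is arbitrary in this sketch

# Orthogonal transformation of the data: x' = Qx, with the model weights
# adjusted to W Q^T so that predictions on x' match predictions on x.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
G_rot = input_fisher(W @ Q.T, b, Q @ x)
lam_rot, v_rot = top_eigenpair(G_rot)

print("largest eigenvalue (original, rotated):", lam, lam_rot)
print("|cos| between original and rotated attack directions:", abs(v @ v_rot))
```

In this toy setting the rotated matrix equals Q G Q^T, so its spectrum is unchanged while its leading eigenvector becomes Qv. An attacker who computes v without knowing Q would therefore perturb along a direction that is generally misaligned with the true leading direction in the transformed space.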
