NPAT Null-Space Projected Adversarial Training Towards Zero Deterioration

arXiv - CS - Machine Learning. Pub Date: 2024-09-18. DOI: arxiv-2409.11754

Hanyi Hu, Qiao Han, Kui Chen, Yao Yang



Abstract

To mitigate the susceptibility of neural networks to adversarial attacks, adversarial training has emerged as a prevalent and effective defense strategy. Intrinsically, this countermeasure incurs a trade-off, as it sacrifices the model's accuracy on normal samples. To reconcile the trade-off, we pioneer the incorporation of null-space projection into adversarial training and propose two Null-space Projection based Adversarial Training (NPAT) algorithms, Null-space Projected Data Augmentation (NPDA) and Null-space Projected Gradient Descent (NPGD), which tackle sample generation and gradient optimization respectively. Both search for an overarching optimal solution that enhances robustness with almost zero deterioration in generalization performance. Adversarial samples and perturbations are constrained within the null-space of the decision boundary using a closed-form null-space projector, effectively mitigating the threat of attacks stemming from unreliable features. We conduct experiments on the CIFAR10 and SVHN datasets and show that our methodology combines seamlessly with adversarial training methods and obtains comparable robustness while keeping generalization close to that of a high-accuracy model.
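The abstract does not give the exact form of the projector, so the following is only a minimal sketch of the general idea, assuming the decision boundary is represented by a linear layer with weight matrix `W`. Under that assumption, the standard closed-form orthogonal projector onto the null-space of `W` is `P = I - W⁺W`, where `W⁺` is the Moore-Penrose pseudoinverse. The names `null_space_projector`, `W`, `x`, and `delta` are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import numpy as np


def null_space_projector(W):
    """Return P = I - pinv(W) @ W, the orthogonal projector onto null(W).

    Assumption: W is the (num_classes x num_features) weight matrix of a
    linear decision layer. This is textbook linear algebra, not necessarily
    the projector used in the paper.
    """
    num_features = W.shape[1]
    return np.eye(num_features) - np.linalg.pinv(W) @ W


rng = np.random.default_rng(0)
W = rng.standard_normal((3, 8))   # hypothetical linear head: 3 classes, 8 features
x = rng.standard_normal(8)        # a clean feature vector
delta = rng.standard_normal(8)    # an unconstrained perturbation

P = null_space_projector(W)
delta_null = P @ delta            # perturbation restricted to null(W)

logits_clean = W @ x
logits_perturbed = W @ (x + delta_null)
```

Because `W @ P` is zero up to floating-point error, the projected perturbation `delta_null` leaves the logits of this linear layer unchanged, which is one way to read the abstract's claim that constraining perturbations to the null-space avoids degrading accuracy on normal samples.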
