Toward Model Resistant to Transferable Adversarial Examples via Trigger Activation

IF 8.0 · CAS Zone 1 (Computer Science) · Q1 (COMPUTER SCIENCE, THEORY & METHODS) · IEEE Transactions on Information Forensics and Security, vol. 20, pp. 3745-3757 · Pub Date: 2025-03-19 · DOI: 10.1109/TIFS.2025.3553043
Yi Yu;Song Xia;Xun Lin;Chenqi Kong;Wenhan Yang;Shijian Lu;Yap-Peng Tan;Alex C. Kot
{"title":"Toward Model Resistant to Transferable Adversarial Examples via Trigger Activation","authors":"Yi Yu;Song Xia;Xun Lin;Chenqi Kong;Wenhan Yang;Shijian Lu;Yap-Peng Tan;Alex C. Kot","doi":"10.1109/TIFS.2025.3553043","DOIUrl":null,"url":null,"abstract":"Adversarial examples, characterized by imperceptible perturbations, pose significant threats to deep neural networks by misleading their predictions. A critical aspect of these examples is their transferability, allowing them to deceive unseen models in closed-box scenarios. Despite the widespread exploration of defense methods, including those on transferability, they show limitations: inefficient deployment, ineffective defense, and degraded performance on clean images. In this work, we introduce a novel training paradigm aimed at enhancing robustness against transferable adversarial examples (TAEs) in a more efficient and effective way. We propose a model that exhibits random guessing behavior when presented with clean data <inline-formula> <tex-math>$\\boldsymbol {x}$ </tex-math></inline-formula> as input, and generates accurate predictions when with triggered data <inline-formula> <tex-math>$\\boldsymbol {x}+\\boldsymbol {\\tau }$ </tex-math></inline-formula>. Importantly, the trigger <inline-formula> <tex-math>$\\boldsymbol {\\tau }$ </tex-math></inline-formula> remains constant for all data instances. We refer to these models as models with trigger activation. We are surprised to find that these models exhibit certain robustness against TAEs. Through the consideration of first-order gradients, we provide a theoretical analysis of this robustness. Moreover, through the joint optimization of the learnable trigger and the model, we achieve improved robustness to transferable attacks. Extensive experiments conducted across diverse datasets, evaluating a variety of attacking methods, underscore the effectiveness and superiority of our approach.","PeriodicalId":13492,"journal":{"name":"IEEE Transactions on Information Forensics and Security","volume":"20 ","pages":"3745-3757"},"PeriodicalIF":8.0000,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Information Forensics and Security","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10934010/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Citations: 0

Abstract

Adversarial examples, characterized by imperceptible perturbations, pose significant threats to deep neural networks by misleading their predictions. A critical aspect of these examples is their transferability, which allows them to deceive unseen models in closed-box scenarios. Despite wide exploration of defense methods, including those targeting transferability, existing defenses show limitations: inefficient deployment, ineffective defense, and degraded performance on clean images. In this work, we introduce a novel training paradigm that enhances robustness against transferable adversarial examples (TAEs) in a more efficient and effective way. We propose a model that exhibits random-guessing behavior when presented with clean data $\boldsymbol{x}$ as input, yet generates accurate predictions when presented with triggered data $\boldsymbol{x}+\boldsymbol{\tau}$. Importantly, the trigger $\boldsymbol{\tau}$ remains constant for all data instances. We refer to such models as models with trigger activation, and we find, surprisingly, that they exhibit a degree of robustness against TAEs. By considering first-order gradients, we provide a theoretical analysis of this robustness. Moreover, by jointly optimizing the learnable trigger and the model, we achieve improved robustness to transferable attacks. Extensive experiments across diverse datasets and a variety of attack methods underscore the effectiveness and superiority of our approach.
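The training objective described in the abstract can be pictured with a short sketch. The following is a minimal, hypothetical PyTorch illustration of the idea as stated above: a learnable, input-independent trigger $\boldsymbol{\tau}$ is optimized jointly with the network so that triggered inputs $\boldsymbol{x}+\boldsymbol{\tau}$ are classified correctly while clean inputs $\boldsymbol{x}$ yield near-uniform (random-guess) predictions. This is not the authors' implementation; all names (`TriggerActivatedClassifier`, `trigger_activation_loss`), the uniform-KL penalty, and the weight `lam` are assumptions made for illustration.

```python
# Minimal sketch (NOT the authors' code) of a "model with trigger
# activation": a learnable trigger tau, shared by all data instances,
# is trained jointly with the network so that x + tau is classified
# correctly while clean x yields near-uniform predictions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriggerActivatedClassifier(nn.Module):
    def __init__(self, backbone: nn.Module, input_shape=(3, 224, 224)):
        super().__init__()
        self.backbone = backbone
        # One trigger for all inputs; broadcast over the batch dimension.
        self.tau = nn.Parameter(torch.zeros(1, *input_shape))

    def forward(self, x: torch.Tensor, triggered: bool = True) -> torch.Tensor:
        return self.backbone(x + self.tau) if triggered else self.backbone(x)

def trigger_activation_loss(model: TriggerActivatedClassifier,
                            x: torch.Tensor, y: torch.Tensor,
                            lam: float = 1.0) -> torch.Tensor:
    # Term 1: accurate predictions on triggered data x + tau.
    ce = F.cross_entropy(model(x, triggered=True), y)
    # Term 2 (one plausible choice): push clean-input predictions toward
    # the uniform distribution, i.e. random guessing, via KL divergence.
    log_p = F.log_softmax(model(x, triggered=False), dim=1)
    uniform = torch.full_like(log_p, 1.0 / log_p.size(1))
    kl = F.kl_div(log_p, uniform, reduction="batchmean")
    return ce + lam * kl
```

Under this sketch, benign inputs would be evaluated with the trigger applied (`model(x, triggered=True)`), while an attacker probing the clean-input pathway sees near-uniform outputs whose first-order gradients carry little class-discriminative signal, which appears to be the intuition behind the gradient-based analysis the abstract mentions.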
Source Journal
IEEE Transactions on Information Forensics and Security (Engineering Technology – Engineering: Electrical & Electronic)
CiteScore: 14.40
Self-citation rate: 7.40%
Annual articles: 234
Review time: 6.5 months
Journal description: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.
Latest articles in this journal:
GCI-GANomaly: A Novel GPS Spoofing Detection Scheme based on Grayscale Constellation Image
Towards Generalizable Deepfake Detection via Forgery-aware Audio-Visual Adaptation: A Variational Bayesian Approach
Adversarial Semantic and Label Perturbation Attack for Pedestrian Attribute Recognition
Exploratory Detection of Unknown Cyber-Attacks via Evolutionary Strategy and Machine Learning
Comments on "APFed: Anti-Poisoning Attacks in Privacy-Preserving Heterogeneous Federated Learning"