Efficient Generation of Targeted and Transferable Adversarial Examples for Vision-Language Models via Diffusion Models

Impact Factor: 8.0 · CAS Tier 1 (Computer Science) · JCR Q1 (Computer Science, Theory & Methods) · IEEE Transactions on Information Forensics and Security · Publication date: 2024-12-23 · DOI: 10.1109/TIFS.2024.3518072
Qi Guo;Shanmin Pang;Xiaojun Jia;Yang Liu;Qing Guo
IEEE Transactions on Information Forensics and Security, vol. 20, pp. 1333-1348. IEEE Xplore: https://ieeexplore.ieee.org/document/10812818/
Citations: 0

Abstract

Adversarial attacks, particularly targeted transfer-based attacks, can be used to assess the adversarial robustness of large vision-language models (VLMs), allowing for a more thorough examination of potential security flaws before deployment. However, previous transfer-based adversarial attacks incur high costs due to high iteration counts and complex method structures. Furthermore, because the adversarial semantics they inject are unnatural, the generated adversarial examples have low transferability. These issues limit the utility of existing methods for assessing robustness. To address them, we propose AdvDiffVLM, which uses diffusion models to generate natural, unrestricted, and targeted adversarial examples via score matching. Specifically, AdvDiffVLM uses Adaptive Ensemble Gradient Estimation (AEGE) to modify the score during the diffusion model's reverse generation process, ensuring that the produced adversarial examples carry natural, targeted adversarial semantics, which improves their transferability. Simultaneously, to improve the quality of adversarial examples, we use GradCAM-guided Mask Generation (GCMG) to disperse adversarial semantics throughout the image rather than concentrating them in a single area. Finally, AdvDiffVLM embeds more target semantics into adversarial examples over multiple iterations. Experimental results show that our method generates adversarial examples 5 to 10 times faster than state-of-the-art (SOTA) transfer-based adversarial attacks while producing higher-quality adversarial examples. Furthermore, compared to previous transfer-based adversarial attacks, the adversarial examples generated by our method have better transferability. Notably, AdvDiffVLM can successfully attack a variety of commercial VLMs in a black-box setting, including GPT-4V. The code is available at https://github.com/gq-max/AdvDiffVLM
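The guidance mechanism the abstract describes can be illustrated with a deterministic toy sketch: an unconditional score is modified by a target-semantics gradient, and a spatial mask restricts where that gradient is injected. Everything here is a stand-in, not the paper's implementation: the quadratic pull replaces the AEGE surrogate-encoder ensemble, the hand-placed mask replaces the GradCAM-derived one, and the Gaussian-prior score replaces a learned diffusion model.

```python
import numpy as np

def target_gradient(x, target):
    # Hypothetical stand-in for AEGE: the paper ensembles gradients from
    # several surrogate image encoders; a simple quadratic pull toward the
    # target is enough to show how the score gets modified.
    return target - x

def guided_reverse_step(x, mask, target, guidance_scale=0.5, step=0.1):
    """One deterministic (probability-flow-style) reverse step with an
    adversarially modified score. The GradCAM-style mask limits where the
    target-semantics gradient is injected, dispersing the perturbation
    instead of concentrating it in one spot."""
    score = -x  # toy unconditional score for a standard-Gaussian prior
    score = score + guidance_scale * mask * target_gradient(x, target)
    return x + step * score

rng = np.random.default_rng(42)
x = rng.standard_normal((8, 8))   # toy "image" latent
target = np.ones((8, 8))          # toy target-semantics direction
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0              # pretend a GradCAM map selected this region
for _ in range(10):
    x = guided_reverse_step(x, mask, target)
```

After a few steps the masked region drifts toward the target while the unmasked region relaxes toward the prior, which is the qualitative behavior AEGE plus GCMG aims for at image scale.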
Source Journal
IEEE Transactions on Information Forensics and Security (Engineering & Technology: Electrical & Electronic Engineering)
CiteScore: 14.40
Self-citation rate: 7.40%
Articles per year: 234
Review time: 6.5 months
Journal description: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.
Latest articles from this journal
- Mitigating Delivery Fraud and Path Manipulation in UAV-Based E-Commerce: A Fair Exchange Protocol
- Dishonest Majority Passive-to-Active Compiler over Rings for MPC with Constant Online Communication
- GCI-GANomaly: A Novel GPS Spoofing Detection Scheme based on Grayscale Constellation Image
- Towards Generalizable Deepfake Detection via Forgery-aware Audio-Visual Adaptation: A Variational Bayesian Approach
- Adversarial Semantic and Label Perturbation Attack for Pedestrian Attribute Recognition