Invisible Black-Box Backdoor Attack against Deep Cross-Modal Hashing Retrieval
Tianshi Wang, Fengling Li, Lei Zhu, Jingjing Li, Zheng Zhang, Heng Tao Shen
ACM Transactions on Information Systems
DOI: 10.1145/3650205
Published: 2024-03-02
Citations: 0
Abstract
Deep cross-modal hashing has advanced multi-modal retrieval thanks to its excellent efficiency and storage, but its vulnerability to backdoor attacks is rarely studied. Notably, current deep cross-modal hashing methods inevitably require large-scale training data, so poisoned samples carrying imperceptible triggers can easily be camouflaged into the training data to bury backdoors in the victim model. Nevertheless, existing backdoor attacks focus on the uni-modal vision domain, and the multi-modal gap and hash quantization weaken their attack performance. To address these challenges, we present an invisible black-box backdoor attack against deep cross-modal hashing retrieval in this paper. To the best of our knowledge, this is the first attempt in this research field. Specifically, we develop a flexible trigger generator to produce the attacker's specified triggers; it learns the sample semantics of the non-poisoned modality to bridge the cross-modal attack gap. Then, we devise an input-aware injection network, which embeds the generated triggers into benign samples in a sample-specific, stealthy manner and realizes cross-modal semantic interaction between triggers and poisoned samples. Because the attack is agnostic to the victim model's knowledge, we enable any cross-modal hashing knockoff to facilitate the black-box backdoor attack and alleviate the attack weakening caused by hash quantization. Moreover, we propose a confusing perturbation and mask strategy to induce high-performance victim models to focus on the imperceptible triggers in poisoned samples. Extensive experiments on benchmark datasets demonstrate that our method achieves state-of-the-art attack performance against deep cross-modal hashing retrieval. In addition, we investigate the influence of transferable attacks, few-shot poisoning, multi-modal poisoning, perceptibility, and potential defenses on backdoor attacks. Our code and datasets are available at https://github.com/tswang0116/IB3A.
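To make the described pipeline more concrete, the sketch below shows one way a trigger generator and an input-aware injection network could fit together in PyTorch: the generator maps a target semantic embedding from the non-poisoned modality to an image-shaped trigger, and the injection network blends that trigger into a benign image with a sample-specific, low-magnitude mask. This is a minimal illustration under assumed settings, not the paper's implementation (see the linked repository); the module names, layer sizes, and the epsilon bound are all hypothetical.

```python
import torch
import torch.nn as nn

class TriggerGenerator(nn.Module):
    """Hypothetical sketch: maps a target semantic embedding to an image-shaped trigger."""
    def __init__(self, semantic_dim=512):
        super().__init__()
        self.fc = nn.Linear(semantic_dim, 3 * 32 * 32)
        self.refine = nn.Sequential(
            nn.Upsample(scale_factor=7, mode="bilinear", align_corners=False),  # 32 -> 224
            nn.Conv2d(3, 3, kernel_size=3, padding=1),
            nn.Tanh(),  # bound trigger values to [-1, 1]
        )

    def forward(self, target_semantics):
        x = self.fc(target_semantics).view(-1, 3, 32, 32)
        return self.refine(x)  # (B, 3, 224, 224)

class InjectionNetwork(nn.Module):
    """Hypothetical sketch: blends the trigger into a benign image via a per-sample mask."""
    def __init__(self, epsilon=8 / 255):  # epsilon is an assumed perturbation budget
        super().__init__()
        self.epsilon = epsilon
        self.mask_net = nn.Sequential(
            nn.Conv2d(6, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=3, padding=1), nn.Sigmoid(),  # mask in [0, 1]
        )

    def forward(self, benign_image, trigger):
        mask = self.mask_net(torch.cat([benign_image, trigger], dim=1))
        perturbation = self.epsilon * mask * trigger  # keep the change imperceptible
        return (benign_image + perturbation).clamp(0.0, 1.0)

# Example usage with random tensors standing in for real data:
# poisoned = InjectionNetwork()(torch.rand(4, 3, 224, 224),
#                               TriggerGenerator()(torch.randn(4, 512)))
```

In such a setup, the poisoned image-text pairs would then be mixed into the hashing model's training set, and a surrogate (knockoff) hashing model would supply the gradients needed to train both networks in the black-box setting described above.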
About the journal:
The ACM Transactions on Information Systems (TOIS) publishes papers on information retrieval (such as search engines and recommender systems) that contain:
new principled information retrieval models or algorithms with sound empirical validation;
observational, experimental and/or theoretical studies yielding new insights into information retrieval or information seeking;
accounts of applications of existing information retrieval techniques that shed light on the strengths and weaknesses of the techniques;
formalization of new information retrieval or information seeking tasks and of methods for evaluating the performance on those tasks;
development of content (text, image, speech, video, etc) analysis methods to support information retrieval and information seeking;
development of computational models of user information preferences and interaction behaviors;
creation and analysis of evaluation methodologies for information retrieval and information seeking; or
surveys of existing work that propose a significant synthesis.
The information retrieval scope of ACM Transactions on Information Systems (TOIS) appeals to industry practitioners for its wealth of creative ideas, and to academic researchers for its descriptions of their colleagues' work.