Robust Cross-Modal Retrieval by Adversarial Training

Tao Zhang, Shiliang Sun, Jing Zhao
{"title":"Robust Cross-Modal Retrieval by Adversarial Training","authors":"Tao Zhang, Shiliang Sun, Jing Zhao","doi":"10.1109/IJCNN55064.2022.9892637","DOIUrl":null,"url":null,"abstract":"Cross-modal retrieval is usually implemented based on cross-modal representation learning, which is used to extract semantic information from cross-modal data. Recent work shows that cross-modal representation learning is vulnerable to adversarial attacks, even using large-scale pre-trained networks. By attacking the representation, it can be simple to attack the downstream tasks, especially for cross-modal retrieval tasks. Adversarial attacks on any modality will easily lead to obvious retrieval errors, which brings the challenge to improve the adversarial robustness of cross-modal retrieval. In this paper, we propose a robust cross-modal retrieval method (RoCMR), which generates adversarial examples for both the query modality and candidate modality and performs adversarial training for cross-modal retrieval. Specifically, we generate adversarial examples for both image and text modalities and train the model with benign and adversarial examples in the framework of contrastive learning. 
We evaluate the proposed RoCMR on two datasets and show its effectiveness in defending against gradient-based attacks.","PeriodicalId":106974,"journal":{"name":"2022 International Joint Conference on Neural Networks (IJCNN)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Joint Conference on Neural Networks (IJCNN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IJCNN55064.2022.9892637","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Cross-modal retrieval is usually built on cross-modal representation learning, which extracts semantic information from cross-modal data. Recent work shows that cross-modal representation learning is vulnerable to adversarial attacks, even when large-scale pre-trained networks are used. By attacking the representation, an adversary can easily compromise downstream tasks, especially cross-modal retrieval. Adversarial attacks on either modality readily cause obvious retrieval errors, which makes improving the adversarial robustness of cross-modal retrieval a challenge. In this paper, we propose a robust cross-modal retrieval method (RoCMR), which generates adversarial examples for both the query modality and the candidate modality and performs adversarial training for cross-modal retrieval. Specifically, we generate adversarial examples for both the image and text modalities and train the model on benign and adversarial examples within a contrastive learning framework. We evaluate RoCMR on two datasets and show its effectiveness in defending against gradient-based attacks.
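The pipeline the abstract describes (perturb an input modality with a gradient-based attack, then train under a contrastive loss on both benign and adversarial examples) can be sketched minimally. This is a simplified stand-in, not RoCMR's actual implementation: the symmetric InfoNCE loss, the linear image encoder `w_img`, and the single FGSM step are illustrative assumptions, and the gradient is estimated by forward differences so the sketch needs no autograd library. Real text-side attacks typically perturb token embeddings rather than raw inputs.

```python
import numpy as np

def info_nce_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature          # pairwise cosine similarities
    labels = np.arange(len(logits))             # matching pair sits on the diagonal

    def xent(l):
        # numerically stable cross-entropy toward the diagonal targets
        l = l - l.max(axis=1, keepdims=True)
        log_p = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_p[labels, labels].mean()

    # average over image-to-text and text-to-image retrieval directions
    return 0.5 * (xent(logits) + xent(logits.T))

def fgsm_images(x, w_img, txt_emb, eps=0.01, delta=1e-4):
    """One FGSM step on raw image features against the contrastive loss.

    Gradients are estimated with forward differences to keep the sketch
    dependency-free; a real implementation would use autograd.
    """
    base = info_nce_loss(x @ w_img, txt_emb)
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        x_pert = x.copy()
        x_pert[idx] += delta
        grad[idx] = (info_nce_loss(x_pert @ w_img, txt_emb) - base) / delta
    return x + eps * np.sign(grad)              # L_inf-bounded perturbation
```

In a full adversarial-training loop of this kind, each batch would contribute both the benign contrastive loss and the loss on the perturbed inputs, so the encoders learn representations that stay aligned under attack.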