Enhancing zero-shot object detection with external knowledge-guided robust contrast learning

IF 3.9 · Zone 3 (Computer Science) · Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pattern Recognition Letters · Pub Date: 2024-08-05 · DOI: 10.1016/j.patrec.2024.08.003

Lijuan Duan, Guangyuan Liu, Qing En, Zhaoying Liu, Zhi Gong, Bian Ma

Pattern Recognition Letters, Volume 185, Pages 152-159. Journal Article. https://www.sciencedirect.com/science/article/pii/S0167865524002356

Citations: 0

Abstract

Zero-shot object detection aims to identify objects from unseen categories not present during training. Existing methods rely on category labels to create pseudo-features for unseen categories, but they face limitations in exploring semantic information and lack robustness. To address these issues, we introduce a novel framework, EKZSD, which enhances zero-shot object detection by incorporating external knowledge and contrastive paradigms. This framework enriches semantic diversity, enhancing discriminative ability and robustness. Specifically, we introduce a novel external knowledge extraction module that leverages attribute and relationship prompts to enrich semantic information. Moreover, a novel external knowledge contrastive learning module is proposed to enhance the model’s discriminative ability and robustness by exploring pseudo-visual features. Additionally, we use cycle consistency learning to align generated visual features with original semantic features and adversarial learning to align visual features with semantic features. Trained jointly with contrast learning loss, cycle consistency loss, adversarial learning loss, and classification loss, our framework achieves superior performance on the MSCOCO and Ship-43 datasets, as demonstrated in experimental results.
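As a rough illustration of the contrastive component mentioned in the abstract, the sketch below computes a generic InfoNCE-style loss in plain Python and combines four loss terms as a weighted sum. The abstract does not give the loss formulas, so the function names (`cosine`, `info_nce`, `total_loss`), the temperature value, and the equal weights are all assumptions made for illustration, not the EKZSD implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style contrastive loss for a single anchor.

    anchor:    a (pseudo-)visual feature vector
    positive:  a feature assumed to share the anchor's category
    negatives: features assumed to come from other categories
    The temperature is a hypothetical default, not a value from the paper.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - logits[0]


def total_loss(l_con, l_cyc, l_adv, l_cls, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four loss terms named in the abstract.

    The weights are hypothetical; the abstract does not state them.
    """
    return sum(w * l for w, l in zip(weights, (l_con, l_cyc, l_adv, l_cls)))
```

With this formulation the loss is small when the anchor is closer to its positive than to any negative, and grows as a negative becomes the nearest candidate, which is the behaviour a contrastive objective is meant to reward.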




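Source journal: Pattern Recognition Letters (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 12.40

Self-citation rate: 5.90%

Articles published: 287

Review time: 9.1 months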

Journal description: Pattern Recognition Letters aims at rapid publication of concise articles of broad interest in pattern recognition. Subject areas include all the current fields of interest represented by the Technical Committees of the International Association of Pattern Recognition, and other developing themes involving learning and recognition.