Enhancing zero-shot object detection with external knowledge-guided robust contrast learning

Pattern Recognition Letters · IF 3.9 · CAS Zone 3 (Computer Science) · JCR Q2 (Computer Science, Artificial Intelligence) · Pub Date: 2024-08-05 · DOI: 10.1016/j.patrec.2024.08.003
Lijuan Duan, Guangyuan Liu, Qing En, Zhaoying Liu, Zhi Gong, Bian Ma
Pattern Recognition Letters, Volume 185, Pages 152-159. Journal Article. https://www.sciencedirect.com/science/article/pii/S0167865524002356
Citations: 0

Abstract

Zero-shot object detection aims to identify objects from unseen categories not present during training. Existing methods rely on category labels to create pseudo-features for unseen categories, but they face limitations in exploring semantic information and lack robustness. To address these issues, we introduce a novel framework, EKZSD, which enhances zero-shot object detection by incorporating external knowledge and contrastive paradigms. This framework enriches semantic diversity, improving discriminative ability and robustness. Specifically, we introduce a novel external knowledge extraction module that leverages attribute and relationship prompts to enrich semantic information. Moreover, a novel external knowledge contrastive learning module is proposed to enhance the model’s discriminative ability and robustness by exploring pseudo-visual features. Additionally, we use cycle consistency learning to align generated visual features with original semantic features and adversarial learning to align visual features with semantic features. Trained jointly with contrastive learning, cycle consistency, adversarial learning, and classification losses, our framework achieves superior performance on the MSCOCO and Ship-43 datasets, as demonstrated by experimental results.
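
The abstract names four loss terms trained jointly but gives no formulas. As a rough illustration only, the sketch below shows one common way such terms are combined: a generic InfoNCE-style contrastive loss and a weighted sum of the four losses. The function names (`cosine`, `info_nce`, `total_loss`), the temperature, and the equal default weights are all assumptions made for this example; they are not taken from the paper and may differ from the actual EKZSD formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (an assumed stand-in, not the paper's loss).

    Low when the anchor is closer to the positive than to any negative.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability assigned to the positive pair.
    return log_sum - logits[0]


def total_loss(l_con, l_cyc, l_adv, l_cls, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of contrastive, cycle, adversarial, classification losses.

    The weights are hypothetical hyperparameters chosen for illustration.
    """
    w_con, w_cyc, w_adv, w_cls = weights
    return w_con * l_con + w_cyc * l_cyc + w_adv * l_adv + w_cls * l_cls
```

In this sketch, a pseudo-visual feature would play the role of `anchor`, a feature of the same category the `positive`, and features of other categories the `negatives`; how the paper actually forms these pairs from external knowledge is not stated in the abstract.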

Source journal
Pattern Recognition Letters (Engineering Technology - Computer Science: Artificial Intelligence)
CiteScore: 12.40
Self-citation rate: 5.90%
Articles published: 287
Review time: 9.1 months
Journal description: Pattern Recognition Letters aims at rapid publication of concise articles of broad interest in pattern recognition. Subject areas include all the current fields of interest represented by the Technical Committees of the International Association for Pattern Recognition, and other developing themes involving learning and recognition.