ECEA: Extensible Co-Existing Attention for Few-Shot Object Detection

Zhimeng Xin, Tianxu Wu, Shiming Chen, Yixiong Zou, Ling Shao, Xinge You
{"title":"ECEA:可扩展的共存注意力,用于少镜头物体检测。","authors":"Zhimeng Xin, Tianxu Wu, Shiming Chen, Yixiong Zou, Ling Shao, Xinge You","doi":"10.1109/TIP.2024.3411771","DOIUrl":null,"url":null,"abstract":"<p><p>Few-shot object detection (FSOD) identifies objects from extremely few annotated samples. Most existing FSOD methods, recently, apply the two-stage learning paradigm, which transfers the knowledge learned from abundant base classes to assist the few-shot detectors by learning the global features. However, such existing FSOD approaches seldom consider the localization of objects from local to global. Limited by the scarce training data in FSOD, the training samples of novel classes typically capture part of objects, resulting in such FSOD methods being unable to detect the completely unseen object during testing. To tackle this problem, we propose an Extensible Co-Existing Attention (ECEA) module to enable the model to infer the global object according to the local parts. Specifically, we first devise an extensible attention mechanism that starts with a local region and extends attention to co-existing regions that are similar and adjacent to the given local region. We then implement the extensible attention mechanism in different feature scales to progressively discover the full object in various receptive fields. In the training process, the model learns the extensible ability on the base stage with abundant samples and transfers it to the novel stage of continuous extensible learning, which can assist the few-shot model to quickly adapt in extending local regions to co-existing regions. Extensive experiments on the PASCAL VOC and COCO datasets show that our ECEA module can assist the few-shot detector to completely predict the object despite some regions failing to appear in the training samples and achieve the new state-of-the-art compared with existing FSOD methods. Code is released at https://github.com/zhimengXin/ECEA.</p>","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ECEA: Extensible Co-Existing Attention for Few-Shot Object Detection.\",\"authors\":\"Zhimeng Xin, Tianxu Wu, Shiming Chen, Yixiong Zou, Ling Shao, Xinge You\",\"doi\":\"10.1109/TIP.2024.3411771\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Few-shot object detection (FSOD) identifies objects from extremely few annotated samples. Most existing FSOD methods, recently, apply the two-stage learning paradigm, which transfers the knowledge learned from abundant base classes to assist the few-shot detectors by learning the global features. However, such existing FSOD approaches seldom consider the localization of objects from local to global. Limited by the scarce training data in FSOD, the training samples of novel classes typically capture part of objects, resulting in such FSOD methods being unable to detect the completely unseen object during testing. To tackle this problem, we propose an Extensible Co-Existing Attention (ECEA) module to enable the model to infer the global object according to the local parts. Specifically, we first devise an extensible attention mechanism that starts with a local region and extends attention to co-existing regions that are similar and adjacent to the given local region. 
We then implement the extensible attention mechanism in different feature scales to progressively discover the full object in various receptive fields. In the training process, the model learns the extensible ability on the base stage with abundant samples and transfers it to the novel stage of continuous extensible learning, which can assist the few-shot model to quickly adapt in extending local regions to co-existing regions. Extensive experiments on the PASCAL VOC and COCO datasets show that our ECEA module can assist the few-shot detector to completely predict the object despite some regions failing to appear in the training samples and achieve the new state-of-the-art compared with existing FSOD methods. Code is released at https://github.com/zhimengXin/ECEA.</p>\",\"PeriodicalId\":94032,\"journal\":{\"name\":\"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TIP.2024.3411771\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TIP.2024.3411771","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract


Few-shot object detection (FSOD) identifies objects from extremely few annotated samples. Most recent FSOD methods apply a two-stage learning paradigm, which transfers knowledge learned from abundant base classes to assist the few-shot detector by learning global features. However, these approaches seldom consider localizing objects from local parts to the global whole. Because training data in FSOD are scarce, the training samples of novel classes typically capture only part of an object, so such methods cannot detect a completely unseen object at test time. To tackle this problem, we propose an Extensible Co-Existing Attention (ECEA) module that enables the model to infer the global object from its local parts. Specifically, we first devise an extensible attention mechanism that starts from a local region and extends attention to co-existing regions that are similar and adjacent to it. We then apply this mechanism at different feature scales to progressively discover the full object across various receptive fields. During training, the model learns this extensible ability in the base stage with abundant samples and transfers it to the novel stage through continuous extensible learning, helping the few-shot model quickly adapt to extending local regions to their co-existing regions. Extensive experiments on the PASCAL VOC and COCO datasets show that our ECEA module helps the few-shot detector predict the complete object even when some regions never appear in the training samples, achieving a new state of the art compared with existing FSOD methods. Code is released at https://github.com/zhimengXin/ECEA.
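To make the core mechanism concrete, below is a minimal, self-contained sketch of one extensible-attention step in PyTorch: starting from a given local position, it scores all positions of a feature map by cosine similarity, keeps only positions that are both spatially adjacent (within an assumed window) and sufficiently similar (above an assumed threshold) as "co-existing" regions, and aggregates them with softmax weights. This is not the authors' implementation (see the linked repository); the function name, the neighborhood radius, and the similarity threshold are illustrative assumptions.

    # Illustrative sketch only; not the authors' code (see the ECEA repository).
    # The radius and similarity threshold are assumed hyperparameters.
    import torch
    import torch.nn.functional as F

    def extensible_attention_step(feats, query_idx, radius=2, sim_thresh=0.5):
        """Extend attention from one local position to co-existing positions.

        feats:      (C, H, W) feature map from one scale of the backbone.
        query_idx:  (row, col) of the given local region.
        radius:     spatial window in which co-existing regions are sought.
        sim_thresh: cosine-similarity cutoff for "similar" regions.
        Returns the aggregated feature of the grown region and the boolean
        mask of positions judged to co-exist with the query.
        """
        C, H, W = feats.shape
        q = F.normalize(feats[:, query_idx[0], query_idx[1]], dim=0)   # (C,)
        k = F.normalize(feats.reshape(C, -1), dim=0)                   # (C, H*W)
        sim = (q @ k).reshape(H, W)                                    # cosine similarity map

        # "Adjacent": restrict to a spatial window around the query position.
        rows = torch.arange(H).view(H, 1).expand(H, W)
        cols = torch.arange(W).view(1, W).expand(H, W)
        adjacent = (rows - query_idx[0]).abs().le(radius) & \
                   (cols - query_idx[1]).abs().le(radius)

        # "Co-existing": adjacent AND sufficiently similar to the local region.
        coexist = adjacent & (sim >= sim_thresh)

        # Softmax-weighted aggregation over the co-existing positions.
        weights = sim.masked_fill(~coexist, float("-inf")).flatten().softmax(dim=0)
        weights = torch.nan_to_num(weights)        # guard: all positions masked
        out = feats.reshape(C, -1) @ weights       # (C,)
        return out, coexist

Repeating such a step with the newly absorbed positions as fresh queries, and running it at several pyramid scales, mirrors the progressive local-to-global discovery the abstract describes.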
