One Shot Object Detection Via Hierarchical Adaptive Alignment

Enquan Zhang, Cheolkon Jung
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP), published 2022-12-13
DOI: 10.1109/VCIP56404.2022.10008884 (https://doi.org/10.1109/VCIP56404.2022.10008884)

Abstract

Deep-learning-based object detectors perform well when abundant labeled data is available, but labeling data is often expensive and time-consuming in practice. This motivates bringing one-shot learning into object detection. In this paper, we propose one-shot object detection based on hierarchical adaptive alignment to address the limited information that a single shot provides for feature representation. We present a multi-adaptive alignment framework built on Faster R-CNN. It extracts features from the query patch and the target image with a Siamese convolutional feature extractor, then generates a fused feature map by aggregating the query and target features. The fused feature map is used for object classification and localization. The framework adaptively adjusts the feature representation through hierarchical and aggregated alignment so that it can learn the correlation between the target image and the query patch. Experimental results show that the proposed method improves unseen-class object detection on the MS-COCO dataset from 24.3 AP50 to 26.2 AP50, a gain of 1.9 points.
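The abstract does not say how the query and target features are aggregated, so the following is only a minimal sketch of one common way to fuse them, not the authors' alignment modules. It assumes the features have already been produced by a shared (Siamese) backbone. The names `global_average_pool` and `fuse_query_into_target`, and the choice of channel reweighting followed by concatenation, are hypothetical and chosen purely for illustration.

```python
from typing import List

# A feature map is indexed as [channel][row][col].
FeatureMap = List[List[List[float]]]


def global_average_pool(feature_map: FeatureMap) -> List[float]:
    """Collapse each channel of a feature map to its mean value."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def fuse_query_into_target(query: FeatureMap, target: FeatureMap) -> FeatureMap:
    """Hypothetical fusion (an assumption, not the paper's method):
    reweight each target channel by the pooled query descriptor, then
    concatenate the reweighted channels after the original ones."""
    if len(query) != len(target):
        raise ValueError("query and target must have the same channel count")
    descriptor = global_average_pool(query)
    reweighted = [
        [[value * weight for value in row] for row in channel]
        for channel, weight in zip(target, descriptor)
    ]
    # Channel-wise concatenation: 2*C output channels, same spatial size.
    originals = [[row[:] for row in channel] for channel in target]
    return originals + reweighted


# Toy example: 2 channels, a 1x2 query patch and a 2x2 target image.
query_features = [[[1.0, 3.0]], [[0.0, 0.0]]]
target_features = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
fused = fuse_query_into_target(query_features, target_features)
```

In this toy case the pooled query descriptor is `[2.0, 0.0]`, so the first target channel is doubled and the second is suppressed. The fused map would then be passed to the detector's classification and localization heads.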