Detecting Pedestrian With Incomplete Head Feature in Crowded Situation Based on Transformer

IEEE Signal Processing Letters, vol. 32, pp. 576-580 (IF 3.2; Zone 2, Engineering & Technology; JCR Q2, Engineering, Electrical & Electronic). Pub Date: 2025-01-02. DOI: 10.1109/LSP.2024.3525397
Zefei Chen;Yongjie Lin;Jianmin Xu;Kai Lu;Yanfang Shou
Citations: 0

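Abstract

Pedestrian detection in crowded scenes is a challenging task. This study presents a simple and effective method, Det RCNN, that detects pedestrians in crowded scenes while also pairing each pedestrian's body with their head. The approach rests on two observations. First, pedestrians' heads have a stable shape and distinctive features. Second, heads usually sit higher in the image, so even in a crowd they are rarely completely occluded. The study therefore equips the DETR model with a Head Decoder (HDecoder) that runs in parallel with the Decoder. The HDecoder takes the head knowledge generated in the Decoder phase as head queries and uses a key-query mechanism to search the entire image for the body bounding boxes corresponding to those head queries. Finally, the method performs a straightforward IoU (Intersection over Union) matching between the body bounding boxes produced in the Decoder and HDecoder phases. Because the HDecoder resembles the second stage of the Faster RCNN model, the paper names the model Det RCNN (DETR RCNN). On the CrowdHuman dataset, the proposed model raises AP$_{m}$ from 53.02 to 53.87 compared with Deformable DETR, and lowers mMR$^{-2}$ from 52.46 to 42.32 compared with the existing BFJ.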
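The final IoU matching step described in the abstract can be illustrated with a short sketch. The abstract does not say how the matching is carried out, so everything specific here is an assumption made for illustration: the corner box format `(x1, y1, x2, y2)`, the greedy one-to-one strategy, the `0.5` threshold, and the names `iou` and `match_bodies`. It is not the authors' implementation.

```python
from typing import List, Optional, Tuple

# Assumed box format: (x1, y1, x2, y2) corners in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_bodies(
    decoder_bodies: List[Box],
    hdecoder_bodies: List[Box],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[Optional[int]]:
    """Greedy one-to-one matching by descending IoU.

    Returns, for each Decoder body box, the index of the matched
    HDecoder body box, or None if nothing overlaps enough.
    """
    candidates = sorted(
        (
            (iou(d, h), i, j)
            for i, d in enumerate(decoder_bodies)
            for j, h in enumerate(hdecoder_bodies)
        ),
        reverse=True,
    )
    result: List[Optional[int]] = [None] * len(decoder_bodies)
    used = set()
    for score, i, j in candidates:
        if score < threshold:
            break  # remaining pairs overlap even less
        if result[i] is None and j not in used:
            result[i] = j
            used.add(j)
    return result


# Two heavily overlapping pedestrians; HDecoder boxes arrive in a different order.
decoder_bodies = [(0, 0, 10, 20), (8, 0, 18, 20)]
hdecoder_bodies = [(8, 1, 18, 21), (0, 0, 10, 19)]
matches = match_bodies(decoder_bodies, hdecoder_bodies)
print(matches)
```

Since each HDecoder body box is searched for from one head query, a matched index would also tell which head belongs to which Decoder body, which is how a body-head pairing could be read off under these assumptions.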
Source journal
IEEE Signal Processing Letters (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 7.40
Self-citation rate: 12.80%
Articles published: 339
Review time: 2.8 months
Journal description: The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented within one year of their appearance in signal processing conferences such as ICASSP, GlobalSIP and ICIP, and also in several workshops organized by the Signal Processing Society.