Instant pose extraction based on mask transformer for occluded person re-identification

Pattern Recognition, Volume 159, Article 111082 (IF 7.6; Zone 1, Computer Science; JCR Q1, Computer Science, Artificial Intelligence). Pub Date: 2025-03-01. Epub Date: 2024-10-22. DOI: 10.1016/j.patcog.2024.111082

Ting-Ting Yuan, Qing-Ling Shu, Si-Bao Chen, Li-Li Huang, Bin Luo


Citations: 0

Abstract



Re-identification (Re-ID) of occluded pedestrians is difficult, mainly because buildings, vehicles, and even other pedestrians frequently block the view of the target. To address this, we propose Instant Pose Extraction based on Mask Transformer (MTIPE), an approach tailored to occluded person Re-ID. MTIPE introduces several components: a Mask Aware Module (MAM) that aligns the overall prototype with the occluded image; a Multi-headed Attention Constraint Module (MACM) that enriches the feature representation; a Pose Aggregation Module (PAM) that separates useful human information from occlusion noise; a Feature Matching Module (FMM) that matches non-occluded parts; learnable local prototypes in the local prototype-based transformer decoder; a Pooling Attention Module (PAM) that replaces the traditional self-attention module to better extract and propagate local contextual information; and a Pose Key-points Loss that improves matching of non-occluded body parts. In experimental evaluations and comparisons, MTIPE improves performance on both occluded and holistic person Re-ID tasks. Its results surpass or at least match those of current state-of-the-art methods in various aspects, which points to its potential advantages and application prospects.
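The abstract does not give implementation details, so the following is only an illustrative sketch of one idea it mentions: learnable local prototypes acting as decoder queries over image features. It uses generic masked cross-attention in plain Python, not the MTIPE architecture itself. The function name `prototype_cross_attention`, the `visible` mask, and all sizes are assumptions made for illustration; in a real model the prototypes would be trained parameters and the mask would come from a component such as a pose or mask estimator.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats (may contain -inf)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def prototype_cross_attention(prototypes, tokens, visible):
    """Generic masked cross-attention (hypothetical helper, not from the paper).

    prototypes: K query vectors of length d (learnable in a real model)
    tokens:     N patch feature vectors of length d (keys and values)
    visible:    N booleans, False for patches assumed to be occluded
    Returns (K local feature vectors, K x N attention weights).
    """
    if not any(visible):
        raise ValueError("at least one token must be visible")
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for p in prototypes:
        # Occluded tokens get a score of -inf, so their weight becomes 0.
        scores = [
            scale * sum(pi * ti for pi, ti in zip(p, t)) if v else float("-inf")
            for t, v in zip(tokens, visible)
        ]
        w = softmax(scores)
        # Weighted sum of token features: one local feature per prototype.
        out = [sum(w[j] * tokens[j][k] for j in range(len(tokens))) for k in range(d)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy usage with random features; sizes are arbitrary assumptions.
random.seed(0)
d, n_tokens, n_protos = 4, 6, 2
tokens = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_tokens)]
prototypes = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_protos)]
visible = [True, True, False, True, False, True]

local_feats, attn = prototype_cross_attention(prototypes, tokens, visible)
print(len(local_feats), len(local_feats[0]))
```

Each prototype yields one local feature built only from tokens marked visible, which is one plausible way to keep occlusion noise out of part-level descriptors before matching.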


Source journal

Pattern Recognition (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 14.40

Self-citation rate: 16.20%

Articles published: 683

Review time: 5.6 months

Journal description: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.