Adaptive transformer with Pyramid Fusion for cloth-changing Person Re-Identification

IF 7.6; Zone 1 (Computer Science); JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE); Pattern Recognition; Pub Date: 2025-07-01; Epub Date: 2025-02-12; DOI: 10.1016/j.patcog.2025.111443
Guoqing Zhang , Jieqiong Zhou , Yuhui Zheng , Gaven Martin , Ruili Wang
Pattern Recognition, Volume 163, Article 111443. Journal Article. https://www.sciencedirect.com/science/article/pii/S0031320325001037
Citations: 0

Abstract

Transformer-based methods have recently made great progress in person re-identification (Re-ID), especially in cloth-changing (CC) scenarios. Most current studies rely on auxiliary biometric information, such as human pose estimation, to enhance the local perception ability of CC Re-ID models. However, these methods usually struggle to connect local biometric information with global identity semantics during training, so the model lacks local perception ability at inference, which limits performance. In this paper, we propose a Transformer-based Adaptive-Aware Attention and Pyramid Fusion Network (A³PFN) for CC Re-ID, which captures and integrates multi-scale visual information to enhance recognition ability. First, to improve the model's information utilization efficiency in cloth-changing scenarios, we propose a Multi-Layer Dynamic Concentration (MLDC) module that evaluates the importance of features at each layer in real time and reduces the computational overlap between related layers. Second, we propose a Local Pyramid Aggregation Module (LPAM) that extracts multi-scale features, maintaining global perceptual capability while focusing on key local information. In this module, we also combine the Fast Fourier Transform (FFT) with the self-attention mechanism to identify and analyze pedestrian gait and other structural details in the frequency domain more effectively, and to reduce the computational complexity of processing high-dimensional data in self-attention. Finally, we build a new dataset that incorporates diverse atmospheric conditions (for instance, wind and rain) to simulate natural cloth-changing scenarios more realistically. Extensive experiments on multiple cloth-changing datasets confirm the superior performance of A³PFN. The dataset and related code are available at https://github.com/jieqiongz1999/vcclothes-w-r.
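
The abstract states that combining the FFT with self-attention lowers the cost of mixing information across tokens. The sketch below illustrates only that general idea and is not the authors' LPAM: it is a simplified Fourier token-mixing step that applies a 1D FFT along the token axis of each channel and keeps the real part. Pairwise self-attention over n tokens costs on the order of n² comparisons, whereas an FFT over the same axis costs on the order of n log n. The function names `fft` and `fourier_mix`, the list-of-lists token layout, and the power-of-two token count are all assumptions made for illustration.

```python
import cmath
from typing import List


def fft(x: List[complex]) -> List[complex]:
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fourier_mix(tokens: List[List[float]]) -> List[List[float]]:
    """Hypothetical mixing step: FFT along the token axis, real part kept.

    tokens[t][c] is channel c of token t. Every output token depends on
    every input token, without computing pairwise attention scores.
    """
    n_tokens, dim = len(tokens), len(tokens[0])
    mixed = [[0.0] * dim for _ in range(n_tokens)]
    for c in range(dim):
        spectrum = fft([complex(tokens[t][c]) for t in range(n_tokens)])
        for t in range(n_tokens):
            mixed[t][c] = spectrum[t].real
    return mixed


if __name__ == "__main__":
    # Four tokens with two channels each; each channel holds one impulse.
    demo = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
    for row in fourier_mix(demo):
        print([round(v, 6) for v in row])
```

In the demo, the impulse in channel 0 of the first token contributes to channel 0 of every output token, which is the global mixing behaviour that attention would otherwise provide through an n-by-n score matrix.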
Source journal
Pattern Recognition (Engineering Technology - Engineering: Electrical & Electronic)
CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Review time: 5.6 months
Journal description: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.