Adaptive transformer with Pyramid Fusion for cloth-changing Person Re-Identification

Pattern Recognition · Impact Factor 7.5 · Zone 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence) · Pub Date: 2025-02-12 · DOI: 10.1016/j.patcog.2025.111443

Guoqing Zhang, Jieqiong Zhou, Yuhui Zheng, Gaven Martin, Ruili Wang

Pattern Recognition, Volume 163, Article 111443. Full text: https://www.sciencedirect.com/science/article/pii/S0031320325001037

Citations: 0

Abstract

Transformer-based methods have recently made great progress in person re-identification (Re-ID), especially in cloth-changing (CC) scenarios. Most current studies use biometric cues, such as human pose estimation, to enhance the local perception ability of cloth-changing Re-ID models. However, these methods usually struggle to connect local biometric information with global identity semantics during training, which leaves the model lacking local perception ability at inference and limits performance gains. In this paper, we propose a Transformer-based Adaptive-Aware Attention and Pyramid Fusion Network ($A^{3}PFN$) for CC Re-ID, which captures and integrates multi-scale visual information to enhance recognition ability. First, to improve the model's information utilization efficiency in cloth-changing scenarios, we propose a Multi-Layer Dynamic Concentration (MLDC) module that evaluates the importance of features at each layer in real time and reduces the computational overlap between related layers. Second, we propose a Local Pyramid Aggregation Module (LPAM) to extract multi-scale features, aiming to maintain global perceptual capability while focusing on key local information. In this module, we also combine the Fast Fourier Transform (FFT) with the self-attention mechanism to more effectively identify and analyze pedestrian gait and other structural details in the frequency domain, and to reduce the computational complexity of processing high-dimensional data in self-attention. Finally, we build a new dataset incorporating diverse atmospheric conditions (for instance, wind and rain) to more realistically simulate natural cloth-changing scenarios. Extensive experiments on multiple cloth-changing datasets confirm the superior performance of $A^{3}PFN$. The dataset and related code are available on GitHub: https://github.com/jieqiongz1999/vcclothes-w-r
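The abstract does not say how LPAM combines the FFT with self-attention, so the following is only a minimal sketch of the general idea of mixing tokens in the frequency domain, not the authors' module. It assumes an FNet-style mixing step (a 2D FFT over the token and channel axes, keeping the real part), which costs O(N log N) in the number of tokens rather than the O(N^2) of dense attention. The function names `fourier_token_mixing` and `low_pass_filter`, and the token shapes, are hypothetical and chosen for illustration.

```python
import numpy as np


def fourier_token_mixing(tokens: np.ndarray) -> np.ndarray:
    """FNet-style mixing (an assumption, not the paper's LPAM).

    tokens: array of shape (num_tokens, hidden_dim).
    Applies a 2D FFT over both axes and keeps the real part, so every
    output entry depends on every input token without an N x N
    attention matrix.
    """
    return np.fft.fft2(tokens).real


def low_pass_filter(tokens: np.ndarray, keep: int) -> np.ndarray:
    """Keep only the `keep` lowest frequencies along the token axis.

    A hypothetical illustration of working with coarse structure in the
    frequency domain: high-frequency bins are zeroed before inverting.
    """
    spectrum = np.fft.rfft(tokens, axis=0)
    spectrum[keep:] = 0  # discard higher-frequency components
    return np.fft.irfft(spectrum, n=tokens.shape[0], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch_tokens = rng.standard_normal((8, 5))  # 8 tokens, 5 channels (made up)
    print(fourier_token_mixing(patch_tokens).shape)
    print(low_pass_filter(patch_tokens, keep=2).shape)
```

With `keep=1` only the zero-frequency bin survives, so every token is replaced by the per-channel mean. Keeping all bins reproduces the input exactly.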


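Source journal

Pattern Recognition (Engineering and Technology: Electrical and Electronic Engineering)

CiteScore: 14.40

Self-citation rate: 16.20%

Articles published: 683

Review time: 5.6 months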

About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.
