Caijie Zhao, Ying Qin, Bob Zhang, Yajie Zhao, Baoyun Wu
{"title":"An end-to-end occluded person re-identification network with smoothing corrupted feature prediction","authors":"Caijie Zhao, Ying Qin, Bob Zhang, Yajie Zhao, Baoyun Wu","doi":"10.1007/s10462-024-11047-z","DOIUrl":null,"url":null,"abstract":"<div><p>Occluded person re-identification (ReID) is a challenging task as the images suffer from various obstacles and less discriminative information caused by incomplete body parts. Most current works rely on auxiliary models to infer the visible body parts and partial-level features matching to overcome the contaminated body information, which consumes extra inference time and fails when facing complex occlusions. More recently, some methods utilized masks provided from image occlusion augmentation (OA) for the supervision of mask learning. These works estimated occlusion scores for each part of the image by roughly dividing it in the horizontal direction, but cannot accurately predict the occlusion, as well as failing in vertical occlusions. To address this issue, we proposed a Smoothing Corrupted Feature Prediction (SCFP) network in an end-to-end manner for occluded person ReID. Specifically, aided by OA that simulates occlusions appearing in pedestrians and providing occlusion masks, the proposed Occlusion Decoder and Estimator (ODE) estimates and eliminates corrupted features, which is supervised by mask labels generated via restricting all occlusions into a group of patterns. We also designed an Occlusion Pattern Smoothing (OPS) to improve the performance of ODE when predicting irregular obstacles. Subsequently, a Local-to-Body (L2B) representation is constructed to mitigate the limitation of the partial body information for final matching. To investigate the performance of SCFP, we compared our model to the existing state-of-the-art methods in occluded and holistic person ReID benchmarks and proved that our method achieves superior results over the state-of-the-art methods. We also achieved the highest Rank-1 accuracies of 70.9%, 87.0%, and 93.2% in Occluded-Duke, Occluded-ReID, and P-DukeMTMC, respectively. Furthermore, the proposed SCFP generalizes well in holistic datasets, yielding accuracies of 95.8% in Market-1510 and 90.7% in DukeMTMC-reID.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 2","pages":""},"PeriodicalIF":10.7000,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-024-11047-z.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence Review","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10462-024-11047-z","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Occluded person re-identification (ReID) is a challenging task as the images suffer from various obstacles and less discriminative information caused by incomplete body parts. Most current works rely on auxiliary models to infer the visible body parts and partial-level features matching to overcome the contaminated body information, which consumes extra inference time and fails when facing complex occlusions. More recently, some methods utilized masks provided from image occlusion augmentation (OA) for the supervision of mask learning. These works estimated occlusion scores for each part of the image by roughly dividing it in the horizontal direction, but cannot accurately predict the occlusion, as well as failing in vertical occlusions. To address this issue, we proposed a Smoothing Corrupted Feature Prediction (SCFP) network in an end-to-end manner for occluded person ReID. Specifically, aided by OA that simulates occlusions appearing in pedestrians and providing occlusion masks, the proposed Occlusion Decoder and Estimator (ODE) estimates and eliminates corrupted features, which is supervised by mask labels generated via restricting all occlusions into a group of patterns. We also designed an Occlusion Pattern Smoothing (OPS) to improve the performance of ODE when predicting irregular obstacles. Subsequently, a Local-to-Body (L2B) representation is constructed to mitigate the limitation of the partial body information for final matching. To investigate the performance of SCFP, we compared our model to the existing state-of-the-art methods in occluded and holistic person ReID benchmarks and proved that our method achieves superior results over the state-of-the-art methods. We also achieved the highest Rank-1 accuracies of 70.9%, 87.0%, and 93.2% in Occluded-Duke, Occluded-ReID, and P-DukeMTMC, respectively. Furthermore, the proposed SCFP generalizes well in holistic datasets, yielding accuracies of 95.8% in Market-1510 and 90.7% in DukeMTMC-reID.
期刊介绍:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.