{"title":"Pro-ReID: Producing reliable pseudo labels for unsupervised person re-identification","authors":"Haiming Sun, Shiwei Ma","doi":"10.1016/j.imavis.2024.105244","DOIUrl":null,"url":null,"abstract":"<div><p>Mainstream unsupervised person ReIDentification (ReID) is on the basis of the alternation of clustering and fine-tuning to promote the task performance, but the clustering process inevitably produces noisy pseudo labels, which seriously constrains the further advancement of the task performance. To conquer the above concerns, the novel Pro-ReID framework is proposed to produce reliable person samples from the pseudo-labeled dataset to learn feature representations in this work. It consists of two modules: Pseudo Labels Correction (PLC) and Pseudo Labels Selection (PLS). Specifically, we further leverage the temporal ensemble prior knowledge to promote task performance. The PLC module assigns corresponding soft pseudo labels to each sample with control of soft pseudo label participation to potentially correct for noisy pseudo labels generated during clustering; the PLS module associates the predictions of the temporal ensemble model with pseudo label annotations and it detects noisy pseudo labele examples as out-of-distribution examples through the Gaussian Mixture Model (GMM) to supply reliable pseudo labels for the unsupervised person ReID task in consideration of their loss data distribution. Experimental findings validated on three person (Market-1501, DukeMTMC-reID and MSMT17) and one vehicle (VeRi-776) ReID benchmark establish that the novel Pro-ReID framework achieves competitive performance, in particular the mAP on the ambitious MSMT17 that is 4.3% superior to the state-of-the-art methods.</p></div>","PeriodicalId":50374,"journal":{"name":"Image and Vision Computing","volume":"150 ","pages":"Article 105244"},"PeriodicalIF":4.2000,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Image and Vision Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0262885624003494","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Mainstream unsupervised person ReIDentification (ReID) is on the basis of the alternation of clustering and fine-tuning to promote the task performance, but the clustering process inevitably produces noisy pseudo labels, which seriously constrains the further advancement of the task performance. To conquer the above concerns, the novel Pro-ReID framework is proposed to produce reliable person samples from the pseudo-labeled dataset to learn feature representations in this work. It consists of two modules: Pseudo Labels Correction (PLC) and Pseudo Labels Selection (PLS). Specifically, we further leverage the temporal ensemble prior knowledge to promote task performance. The PLC module assigns corresponding soft pseudo labels to each sample with control of soft pseudo label participation to potentially correct for noisy pseudo labels generated during clustering; the PLS module associates the predictions of the temporal ensemble model with pseudo label annotations and it detects noisy pseudo labele examples as out-of-distribution examples through the Gaussian Mixture Model (GMM) to supply reliable pseudo labels for the unsupervised person ReID task in consideration of their loss data distribution. Experimental findings validated on three person (Market-1501, DukeMTMC-reID and MSMT17) and one vehicle (VeRi-776) ReID benchmark establish that the novel Pro-ReID framework achieves competitive performance, in particular the mAP on the ambitious MSMT17 that is 4.3% superior to the state-of-the-art methods.
期刊介绍:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.