DCSG: data complement pseudo-label refinement and self-guided pre-training for unsupervised person re-identification

The Visual Computer, Pub Date: 2024-07-01, DOI: 10.1007/s00371-024-03542-9

Qing Han, Jiongjin Chen, Weidong Min, Jiahao Li, Lixin Zhan, Longfei Li



Abstract

Existing unsupervised person re-identification (Re-ID) methods use clustering to generate pseudo-labels, which are generally noisy, and initializing the model with ImageNet pre-training weights introduces a large domain gap that severely degrades performance. To address these issues, we propose a data complement pseudo-label refinement and self-guided pre-training framework, referred to as DCSG. First, our method uses image information from multiple augmentation views to complement the source image data, yielding aggregated information. We use this aggregated information to design a correlation score that serves as a reliability evaluation for the source features and cluster centroids. By optimizing the pseudo-label of each sample, we make the labels more robust. Second, we propose a pre-training strategy that leverages the potential information within the training process. This strategy mines classes with high similarity in the training set to guide model training and facilitate smooth pre-training. Consequently, the model acquires a preliminary ability to distinguish pedestrian-related features at an early stage of training, which reduces the impact of the domain gap arising from ImageNet pre-training weights. Our method demonstrates superior performance on multiple person Re-ID datasets, validating the effectiveness of the proposed approach. Notably, it achieves an mAP of 84.3% on the Market1501 dataset, a 2.8% improvement over the state-of-the-art method. The code is available at https://github.com/duolaJohn/DCSG.git.
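The abstract does not give the formula for the correlation score, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that augmented-view features are averaged with the source feature, that the reliability score is the cosine similarity between the source feature and that aggregate, and that the refined label blends the hard cluster label with a softmax over centroid similarities. The function names (`aggregate_views`, `refine_pseudo_label`) and the `temperature` parameter are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def aggregate_views(source_feat, view_feats):
    """ASSUMPTION: aggregate by averaging the source and augmented-view features."""
    feats = [source_feat] + list(view_feats)
    dim = len(source_feat)
    mean = [sum(f[i] for f in feats) / len(feats) for i in range(dim)]
    return l2_normalize(mean)


def softmax(scores, temperature):
    """Numerically stable softmax with a temperature."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def refine_pseudo_label(source_feat, view_feats, centroids, hard_label,
                        temperature=0.1):
    """Blend a hard cluster label with a soft label from aggregated features.

    ASSUMPTION: reliability is the cosine similarity between the source
    feature and the aggregate, clipped to [0, 1]. A reliable sample keeps
    its hard label; an unreliable one leans on centroid similarities.
    """
    agg = aggregate_views(source_feat, view_feats)
    reliability = min(1.0, max(0.0, cosine(source_feat, agg)))
    soft = softmax([cosine(agg, c) for c in centroids], temperature)
    one_hot = [1.0 if k == hard_label else 0.0 for k in range(len(centroids))]
    return [reliability * o + (1.0 - reliability) * s
            for o, s in zip(one_hot, soft)]


if __name__ == "__main__":
    centroids = [[1.0, 0.0], [0.0, 1.0]]
    # The source feature says cluster 0, but both augmented views disagree.
    refined = refine_pseudo_label([1.0, 0.0], [[0.0, 1.0], [0.0, 1.0]],
                                  centroids, hard_label=0)
    print(refined)
```

In this toy setup, a sample whose augmented views agree with its source feature keeps its clustering label, while a sample whose views point to another centroid has probability mass shifted away from the original label. The actual score and update rule used by DCSG are defined in the paper and the linked repository.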


