DOI: 10.1016/j.imavis.2024.105197
Journal: Image and Vision Computing (Q2, Computer Science, Artificial Intelligence; Impact Factor 4.2)
Publication date: 2024-07-28 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0262885624003020
Noise-robust re-identification with triple-consistency perception
Traditional re-identification (ReID) methods rely heavily on clean and accurately annotated training data, rendering them susceptible to label noise in real-world scenarios. Although some noise-robust learning methods have been proposed and achieve promising recognition performance, most are designed for image classification and are not well suited to ReID, which involves associating and matching objects rather than solely identifying them. To address this problem, we propose a Triple-consistency Perception based Noise-robust Re-identification Model (TcP-ReID), which guides the model to mine and focus on clean samples and reliable relationships among samples from different perspectives. Specifically, the self-consistency strategy leads the model to emphasize and prioritize clean samples, preventing overfitting to noisy labels during the early stages of training. Rather than focusing solely on individual samples, the context-consistency loss exploits similarities between samples in the feature space, encouraging the prediction for each sample to align with those of its nearest neighbors. Moreover, to further strengthen the robustness of our model, a Jensen-Shannon divergence based cross-view consistency loss encourages consistent predictions for the same sample across different views. Extensive experiments demonstrate the superiority of the proposed TcP-ReID over competing methods under both instance-dependent and instance-independent noise. For instance, on the Market1501 dataset under instance-independent noise with a 50% noise ratio, our method achieves 85.8% rank-1 accuracy and 56.3% mAP (improvements of 5.6% and 8.7%), with similar gains of 5.7% and 1.4% under instance-dependent label noise.
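To make the cross-view consistency idea concrete, the following is a minimal sketch of a Jensen-Shannon divergence based consistency term between per-sample predictions from two views. This is an illustration only, not the authors' implementation: the function names and the NumPy formulation are assumptions, and the actual TcP-ReID loss may differ in normalization and weighting.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions.

    JSD(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2.
    It is symmetric and bounded by log(2) (natural log).
    """
    p = np.asarray(p, dtype=float) + eps  # avoid log(0)
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def cross_view_consistency_loss(probs_view1, probs_view2):
    """Mean JS divergence over paired per-sample predictions from two views.

    Each argument is a sequence of probability vectors (one per sample);
    minimizing this term pushes the two views toward agreeing predictions.
    """
    return float(np.mean([js_divergence(p, q)
                          for p, q in zip(probs_view1, probs_view2)]))
```

The divergence is zero when the two views agree exactly and grows as their predicted identity distributions diverge, so adding it to the training objective penalizes view-inconsistent predictions.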
Journal introduction:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.