Enhancing Sample Utilization in Noise-Robust Deep Metric Learning With Subgroup-Based Positive-Pair Selection

IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society. Pub Date: 2024-10-22. DOI: 10.1109/TIP.2024.3482182

Zhipeng Yu;Qianqian Xu;Yangbangyan Jiang;Yingfei Sun;Qingming Huang

Volume 33, pages 6083-6097. Publication type: Journal Article. Full text: https://ieeexplore.ieee.org/document/10729738/

Citations: 0

Abstract

Noisy labels in real-world data degrade the performance of deep learning models. Although much research has been devoted to improving robustness to noisy labels in classification tasks, the problem of noisy labels in deep metric learning (DML) remains under-explored. Existing noisy-label learning methods designed for DML mainly discard suspected noisy samples, which wastes training data. To address this issue, we propose a noise-robust DML framework with SubGroup-based Positive-pair Selection (SGPS), which constructs reliable positive pairs for noisy samples to improve sample utilization. Specifically, SGPS first separates clean from noisy samples with a probability-based clean sample selection strategy. To make further use of the remaining noisy samples, we identify their potentially similar samples from the subgroup information produced by a subgroup generation module, and then aggregate those samples into informative positive prototypes for each noisy sample via a positive prototype generation module. A new contrastive loss is then tailored to the noisy samples and their selected positive pairs. SGPS can be easily integrated into the training process of existing pair-wise DML tasks, such as image retrieval and face recognition. Extensive experiments on multiple synthetic and real-world large-scale label-noise datasets demonstrate the effectiveness of the proposed method. Without any bells and whistles, our SGPS framework outperforms state-of-the-art noisy-label DML methods.
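The abstract gives no formulas, so the following is only a minimal sketch of the general idea of pairing a noisy sample with a prototype built from its subgroup. It is not the authors' implementation. The function names, the mean-of-peers prototype, the InfoNCE-style loss, and the temperature `tau` are all assumptions made for illustration, and the subgroup ids are taken as given rather than produced by a subgroup generation module.

```python
import math

def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        return list(v)
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def positive_prototype(idx, embeddings, subgroups):
    """Hypothetical prototype: mean of the unit embeddings that share
    the subgroup of sample idx, excluding idx itself."""
    peers = [normalize(e) for j, e in enumerate(embeddings)
             if j != idx and subgroups[j] == subgroups[idx]]
    if not peers:
        return None  # no subgroup peers, so no positive pair can be built
    dim = len(peers[0])
    mean = [sum(p[d] for p in peers) / len(peers) for d in range(dim)]
    return normalize(mean)

def prototype_contrastive_loss(idx, embeddings, subgroups, tau=0.1):
    """Assumed InfoNCE-style loss: pull sample idx toward its prototype,
    push it away from samples in other subgroups."""
    proto = positive_prototype(idx, embeddings, subgroups)
    if proto is None:
        return 0.0
    anchor = normalize(embeddings[idx])
    pos = math.exp(dot(anchor, proto) / tau)
    neg = sum(math.exp(dot(anchor, normalize(e)) / tau)
              for j, e in enumerate(embeddings)
              if subgroups[j] != subgroups[idx])
    return -math.log(pos / (pos + neg))

# Toy data: sample 0 is treated as the noisy sample; subgroup ids are made up.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
subgroups = [0, 0, 0, 1, 1]
loss = prototype_contrastive_loss(0, embeddings, subgroups)
```

In this toy setting the loss is small when the noisy sample's subgroup peers really are its neighbours in embedding space, and grows when the subgroup assignment is wrong, which is the behaviour a positive-pair selection scheme relies on.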



