OTAMatch: Optimal Transport Assignment With PseudoNCE for Semi-Supervised Learning

Jinjin Zhang;Junjie Liu;Debang Li;Qiuyu Huang;Jiaxin Chen;Di Huang
{"title":"OTAMatch: Optimal Transport Assignment With PseudoNCE for Semi-Supervised Learning","authors":"Jinjin Zhang;Junjie Liu;Debang Li;Qiuyu Huang;Jiaxin Chen;Di Huang","doi":"10.1109/TIP.2024.3425174","DOIUrl":null,"url":null,"abstract":"In semi-supervised learning (SSL), many approaches follow the effective self-training paradigm with consistency regularization, utilizing threshold heuristics to alleviate label noise. However, such threshold heuristics lead to the underutilization of crucial discriminative information from the excluded data. In this paper, we present OTAMatch, a novel SSL framework that reformulates pseudo-labeling as an optimal transport (OT) assignment problem and simultaneously exploits data with high confidence to mitigate the confirmation bias. Firstly, OTAMatch models the pseudo-label allocation task as a convex minimization problem, facilitating end-to-end optimization with all pseudo-labels and employing the Sinkhorn-Knopp algorithm for efficient approximation. Meanwhile, we incorporate epsilon-greedy posterior regularization and curriculum bias correction strategies to constrain the distribution of OT assignments, improving the robustness with noisy pseudo-labels. Secondly, we propose PseudoNCE, which explicitly exploits pseudo-label consistency with threshold heuristics to maximize mutual information within self-training, significantly boosting the balance of convergence speed and performance. Consequently, our proposed approach achieves competitive performance on various SSL benchmarks. Specifically, OTAMatch substantially outperforms the previous state-of-the-art SSL algorithms in realistic and challenging scenarios, exemplified by a no\n<xref>table 9</xref>\n.45% error rate reduction over SoftMatch on ImageNet with 100K-label split, underlining its robustness and effectiveness.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10599208/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

In semi-supervised learning (SSL), many approaches follow the effective self-training paradigm with consistency regularization, utilizing threshold heuristics to alleviate label noise. However, such threshold heuristics lead to the underutilization of crucial discriminative information from the excluded data. In this paper, we present OTAMatch, a novel SSL framework that reformulates pseudo-labeling as an optimal transport (OT) assignment problem and simultaneously exploits data with high confidence to mitigate the confirmation bias. Firstly, OTAMatch models the pseudo-label allocation task as a convex minimization problem, facilitating end-to-end optimization with all pseudo-labels and employing the Sinkhorn-Knopp algorithm for efficient approximation. Meanwhile, we incorporate epsilon-greedy posterior regularization and curriculum bias correction strategies to constrain the distribution of OT assignments, improving the robustness with noisy pseudo-labels. Secondly, we propose PseudoNCE, which explicitly exploits pseudo-label consistency with threshold heuristics to maximize mutual information within self-training, significantly boosting the balance of convergence speed and performance. Consequently, our proposed approach achieves competitive performance on various SSL benchmarks. Specifically, OTAMatch substantially outperforms the previous state-of-the-art SSL algorithms in realistic and challenging scenarios, exemplified by a no table 9 .45% error rate reduction over SoftMatch on ImageNet with 100K-label split, underlining its robustness and effectiveness.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
OTAMatch:利用伪 NCE 进行半监督学习的最佳传输分配。
在半监督学习(SSL)中,许多方法都遵循有效的自我训练范式,利用一致性正则化、阈值启发法来减轻标签噪声。然而,这种阈值启发式方法会导致未充分利用排除数据中的关键判别信息。在本文中,我们提出了 OTAMatch,这是一种新颖的 SSL 框架,它将伪标签重新表述为最优传输(OT)分配问题,并同时利用高置信度数据来减轻确认偏差。首先,OTAMatch 将伪标签分配任务建模为一个凸最小化问题,便于使用所有伪标签进行端到端优化,并采用 Sinkhorn-Knopp 算法进行高效逼近。同时,我们采用ε-贪婪后验正则化和课程偏差校正策略来约束 OT 分配的分布,提高了噪声伪标签的鲁棒性。其次,我们提出了 PseudoNCE,它明确利用伪标签的一致性和阈值启发法,在自我训练中最大化互信息,显著提高了收敛速度和性能之间的平衡。因此,我们提出的方法在各种 SSL 基准上都取得了具有竞争力的性能。具体来说,OTAMatch 在现实和具有挑战性的场景中的表现大大优于之前最先进的 SSL 算法,例如,在 100K 标签分割的 ImageNet 上,OTAMatch 比 SoftMatch 明显降低了 9.45% 的错误率,凸显了其鲁棒性和有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Balanced Destruction-Reconstruction Dynamics for Memory-Replay Class Incremental Learning Blind Video Quality Prediction by Uncovering Human Video Perceptual Representation. Contrastive Open-set Active Learning based Sample Selection for Image Classification. Generating Stylized Features for Single-Source Cross-Dataset Palmprint Recognition With Unseen Target Dataset Learning Prompt-Enhanced Context Features for Weakly-Supervised Video Anomaly Detection
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1