Mutual-support generalized category discovery

IF 14.7 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Information Fusion Pub Date : 2025-02-14 DOI:10.1016/j.inffus.2025.103020

Yu Duan , Zhanxuan Hu , Rong Wang , Zhensheng Sun , Feiping Nie , Xuelong Li

{"title":"Mutual-support generalized category discovery","authors":"Yu Duan , Zhanxuan Hu , Rong Wang , Zhensheng Sun , Feiping Nie , Xuelong Li","doi":"10.1016/j.inffus.2025.103020","DOIUrl":null,"url":null,"abstract":"<div><div>This work focuses on the problem of Generalized Category Discovery (GCD), a more realistic and challenging semi-supervised learning setting where unlabeled data may belong to either previously known or unseen categories. Recent advancements have demonstrated the efficacy of both pseudo-label-based parametric classification methods and representation-based non-parametric classification methods in tackling this problem. However, there exists a gap in the literature concerning the integration of their respective advantages. The former tends to be biased towards the ’Old’ categories, making it easier to classify samples into the ’Old’ groups. The latter cannot learn discriminative representations, decreasing the clustering performance. To this end, we propose Mutual-Support Generalized Category Discovery (MSGCD), a framework that unifies these two paradigms, leveraging their strengths in a mutually reinforcing manner. It simultaneously learns high-quality pseudo-labels and discriminative representations. It incorporates a novel <em>Mutual-Support mechanism</em> to facilitate symbiotic enhancement. Specifically, high-quality pseudo-labels furnish valuable weakly supervised information for learning discriminative representations, while discriminative representations enable the estimation of semantic similarity between samples, guiding the model in generating more reliable pseudo-labels. MSGCD is remarkably effective, achieving state-of-the-art results on several datasets. Moreover, <em>Mutual-Support mechanism</em> is not only effective in image classification tasks, but also provides intuition for cross-modal representation learning, open-world image segmentation, and recognition. The codes is available at <span><span>https://github.com/DuannYu/MSGCD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"119 ","pages":"Article 103020"},"PeriodicalIF":14.7000,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253525000934","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

This work focuses on the problem of Generalized Category Discovery (GCD), a more realistic and challenging semi-supervised learning setting where unlabeled data may belong to either previously known or unseen categories. Recent advancements have demonstrated the efficacy of both pseudo-label-based parametric classification methods and representation-based non-parametric classification methods in tackling this problem. However, there exists a gap in the literature concerning the integration of their respective advantages. The former tends to be biased towards the ’Old’ categories, making it easier to classify samples into the ’Old’ groups. The latter cannot learn discriminative representations, decreasing the clustering performance. To this end, we propose Mutual-Support Generalized Category Discovery (MSGCD), a framework that unifies these two paradigms, leveraging their strengths in a mutually reinforcing manner. It simultaneously learns high-quality pseudo-labels and discriminative representations. It incorporates a novel Mutual-Support mechanism to facilitate symbiotic enhancement. Specifically, high-quality pseudo-labels furnish valuable weakly supervised information for learning discriminative representations, while discriminative representations enable the estimation of semantic similarity between samples, guiding the model in generating more reliable pseudo-labels. MSGCD is remarkably effective, achieving state-of-the-art results on several datasets. Moreover, Mutual-Support mechanism is not only effective in image classification tasks, but also provides intuition for cross-modal representation learning, open-world image segmentation, and recognition. The codes is available at https://github.com/DuannYu/MSGCD.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

求助全文

约1分钟内获得全文去求助

来源期刊

Information Fusion 工程技术-计算机：理论方法

CiteScore

33.20

自引率

4.30%

发文量

161

审稿时长

7.9 months

期刊介绍： Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.