GCSTG: Generating Class-Confusion-Aware Samples With a Tree-Structure Graph for Few-Shot Object Detection

IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society. Pub Date: 2025-01-23. DOI: 10.1109/TIP.2025.3530792

Longrong Yang;Hanbin Zhao;Hongliang Li;Liang Qiao;Ziwei Yang;Xi Li

Volume 34, pages 772-784. Publication type: Journal Article. URL: https://ieeexplore.ieee.org/document/10851817/

Citation count: 0

Abstract

Few-Shot Object Detection (FSOD) aims to detect objects of novel classes using only a few manually annotated samples. With so few novel-class samples, learning the inter-class relationships among foreground classes and constructing the corresponding class hierarchy is challenging. A poorly constructed class hierarchy leads to inter-class confusion, which recent FSOD methods have identified as a primary cause of inferior performance on novel classes. In this work, we further find that intra-super-class confusion, where samples are misclassified as other classes within their associated super-class, is the main challenge in solving the confusion problem. To address this issue, we generate class-confusion-aware samples with a pre-defined tree-structure graph to help models construct a precise class hierarchy. Specifically, to generate class-confusion-aware samples, we add noise to the available samples and update the noise to maximize the confidence scores on each sample's associated confusion categories. We then propose a confusion-aware curriculum learning strategy that lets the generated samples participate in training gradually, which benefits model convergence while learning from the generated samples. Experimental results show that our method can be used as a plug-in for recent FSOD methods and consistently improves model performance.
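The noise-update idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy tree `TREE`, the linear softmax classifier standing in for a detector head, the choice of sibling classes as the "confusion categories", the clipping bound `eps`, and the linear ramp in `curriculum_fraction`. The paper's actual objective, model, and schedule may differ.

```python
import math

# Hypothetical tree-structure graph: super-class -> fine-grained classes.
TREE = {"animal": ["cat", "dog"], "vehicle": ["car", "bus"]}
CLASSES = [c for children in TREE.values() for c in children]


def confusion_classes(label):
    """Siblings of `label` under the same super-class (assumed definition)."""
    for children in TREE.values():
        if label in children:
            return [c for c in children if c != label]
    raise KeyError(label)


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Class probabilities of a toy linear classifier (stand-in for a detector head)."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def generate_confusing_sample(weights, x, label, steps=50, lr=0.1, eps=1.0):
    """Gradient ascent on additive noise to raise confidence on confusion classes."""
    targets = [CLASSES.index(c) for c in confusion_classes(label)]
    noise = [0.0] * len(x)
    for _ in range(steps):
        p = predict(weights, [v + n for v, n in zip(x, noise)])
        mass = sum(p[t] for t in targets)
        # d log(mass) / d logit_k = q_k - p_k, where q is p restricted
        # to the target classes and renormalised.
        dlogit = [(p[k] / mass if k in targets else 0.0) - p[k]
                  for k in range(len(p))]
        grad = [sum(dlogit[k] * weights[k][j] for k in range(len(p)))
                for j in range(len(x))]
        # Ascent step, then clip each noise component to [-eps, eps] (assumed bound).
        noise = [max(-eps, min(eps, n + lr * g)) for n, g in zip(noise, grad)]
    return [v + n for v, n in zip(x, noise)]


def curriculum_fraction(step, total_steps):
    """Assumed linear ramp: share of generated samples used at a training step."""
    return min(1.0, step / total_steps)


# Toy weights: one row per class in CLASSES, one column per feature.
weights = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0], [-1.0, -1.0, -1.0]]
x = [1.0, 0.2, 0.0]  # a sample labelled "cat"
dog = CLASSES.index("dog")
before = predict(weights, x)[dog]
x_conf = generate_confusing_sample(weights, x, "cat")
after = predict(weights, x_conf)[dog]
print(f"confidence on confusion class: {before:.3f} -> {after:.3f}")
```

In this toy setting the perturbed sample keeps its original label ("cat") while the classifier's confidence on the sibling class ("dog") rises, which is the sense in which the sample is confusion-aware. The ramp in `curriculum_fraction` only illustrates the idea of letting generated samples enter training gradually.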
