A class sensitivity feature guided T-type generative model for noisy label classification

IF 2.9; Zone 3, Computer Science; Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Machine Learning; Pub Date: 2024-08-20; DOI: 10.1007/s10994-024-06598-9

Yidi Bai, Hengjian Cui


Citations: 0

Abstract

Large-scale datasets inevitably contain noisy labels, which degrade the performance of deep neural networks (DNNs). Many existing methods focus on loss and regularization tricks, or on characterizing and modelling the differences between noisy and clean samples. However, exploiting the fact that features in the latent space are distorted to different extents is less explored and remains challenging. To address this problem, we analyze the extent of distortion of different high-dimensional features and conclude that features differ in how strongly their correlation with the categorical variable is deformed. These disturbances not only reduce the sensitivity and contribution of latent features to classification, but also hinder the generation of decision boundaries. To mitigate these issues, we propose a class sensitivity feature extractor (CSFE) and a T-type generative classifier (TGC). Based on the weighted Mahalanobis distance between the conditional and unconditional cumulative distribution functions after a variance-stabilizing transformation, CSFE achieves high-quality feature extraction by evaluating each feature's class-wise discrimination ability and sensitivity to classification. TGC introduces a Student-t estimator into clustering analysis in the latent space, which makes the generated decision boundaries more robust while maintaining equivalent efficiency. To avoid the cost of retraining a whole DNN, we propose an ensemble model that simultaneously generates robust decision boundaries and trains the DNN with an improved CSFE named SoftCSFE. Extensive experiments on three datasets, namely RML2016.10a, the UCR Time Series Classification Archive, and the real-world dataset Clothing1M, show the advantages of our methods.
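The abstract does not give the formulas behind TGC, so the following is only a minimal sketch of the general idea of a Student-t based generative classifier in a latent space, not the authors' implementation. It assumes the standard EM updates for a multivariate Student-t with a fixed degrees-of-freedom parameter `nu`, and it classifies by the smallest squared Mahalanobis distance (ignoring class priors and log-determinant terms for brevity). The function names `fit_student_t`, `fit_classes`, and `predict`, and the small ridge term, are hypothetical choices made for illustration.

```python
import numpy as np

def fit_student_t(X, nu=5.0, n_iter=50, tol=1e-8, ridge=1e-6):
    """EM estimate of location and scatter of a multivariate Student-t (fixed nu)."""
    n, d = X.shape
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False) + ridge * np.eye(d)
    for _ in range(n_iter):
        diff = X - mu
        # Squared Mahalanobis distance of every sample to the current centre.
        delta = np.einsum("ij,ij->i", diff @ np.linalg.inv(sigma), diff)
        # E-step: samples far from the centre receive smaller weights.
        w = (nu + d) / (nu + delta)
        # M-step: weighted location and scatter updates.
        new_mu = (w[:, None] * X).sum(axis=0) / w.sum()
        diff = X - new_mu
        sigma = (w[:, None] * diff).T @ diff / n + ridge * np.eye(d)
        shift = np.linalg.norm(new_mu - mu)
        mu = new_mu
        if shift < tol:
            break
    return mu, sigma

def fit_classes(Z, y, nu=5.0):
    """Fit one robust (location, scatter) pair per (possibly noisy) class label."""
    return {k: fit_student_t(Z[y == k], nu=nu) for k in np.unique(y)}

def predict(Z, params):
    """Assign each latent vector to the class with the smallest Mahalanobis distance."""
    classes = sorted(params)
    dists = []
    for k in classes:
        mu, sigma = params[k]
        diff = Z - mu
        dists.append(np.einsum("ij,ij->i", diff @ np.linalg.inv(sigma), diff))
    return np.array(classes)[np.argmin(np.stack(dists, axis=1), axis=1)]
```

Here `Z` stands for latent features taken from a trained network and `y` for the noisy labels. Because mislabeled samples lie far from their assigned class centre, the weights `w` shrink their influence, which is the robustness property the abstract attributes to the Student-t estimator.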



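Source journal: Machine Learning (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 11.00

Self-citation rate: 2.70%

Articles published: 162

Review time: 3 months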

Journal description: Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.