A class sensitivity feature guided T-type generative model for noisy label classification
Authors: Yidi Bai, Hengjian Cui
Journal: Machine Learning (impact factor 4.3; JCR Q2, Computer Science, Artificial Intelligence)
Published: 2024-08-20
DOI: https://doi.org/10.1007/s10994-024-06598-9
Citations: 0
Abstract
Large-scale datasets inevitably contain noisy labels, which degrade the performance of deep neural networks (DNNs). Many existing methods focus on loss and regularization tricks, or on characterizing and modelling the differences between noisy and clean samples. However, exploiting information about the differing extents of distortion in the latent feature space is less explored and remains challenging. To address this problem, we analyze the characteristic extent of distortion of different high-dimensional features and conclude that features vary in how strongly their correlations with the categorical variables are deformed. These disturbances not only reduce the sensitivity and contribution of latent features to classification, but also hinder the generation of decision boundaries. To mitigate these issues, we propose a class sensitivity feature extractor (CSFE) and a T-type generative classifier (TGC). Based on the weighted Mahalanobis distance between the conditional and unconditional cumulative distribution functions after a variance-stabilizing transformation, CSFE achieves high-quality feature extraction by evaluating class-wise discrimination ability and sensitivity to classification. TGC introduces a Student-t estimator into clustering analysis in the latent space, which makes the generated decision boundaries more robust while maintaining equivalent efficiency. To avoid the cost of retraining a whole DNN, we propose an ensemble model that simultaneously generates robust decision boundaries and trains the DNN with an improved CSFE named SoftCSFE. Extensive experiments on three datasets, namely RML2016.10a, the UCR Time Series Classification Archive and the real-world dataset Clothing1M, show the advantages of our methods.
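The robustness claim for a Student-t estimator can be illustrated with a minimal sketch. This is not the authors' TGC: it is the standard EM estimator of location and scatter for a multivariate Student-t with fixed degrees of freedom, applied to toy data that stand in for the latent features of one class. The function name `fit_student_t`, the value `nu=3.0`, and the synthetic clusters are all assumptions made for illustration.

```python
import numpy as np

def fit_student_t(X, nu=3.0, n_iter=50):
    """EM estimate of location and scatter for a multivariate Student-t.

    Hypothetical helper for illustration; nu (degrees of freedom) is held fixed.
    """
    n, d = X.shape
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False) + 1e-6 * np.eye(d)
    for _ in range(n_iter):
        diff = X - mu
        # Squared Mahalanobis distance of every sample to the current location.
        m2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(sigma), diff)
        # E-step: samples far from the centre receive small weights.
        w = (nu + d) / (nu + m2)
        # M-step: weighted location, then weighted scatter.
        mu = (w[:, None] * X).sum(axis=0) / w.sum()
        diff = X - mu
        sigma = (w[:, None] * diff).T @ diff / n + 1e-6 * np.eye(d)
    return mu, sigma

# Toy "latent features" of one class: 200 clean samples around the origin
# plus 20 mislabelled samples that really belong to a cluster near (8, 8).
rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, size=(200, 2))
mislabelled = rng.normal(8.0, 1.0, size=(20, 2))
X = np.vstack([clean, mislabelled])

mean_gauss = X.mean(axis=0)      # Gaussian (sample-mean) class centre
mu_t, sigma_t = fit_student_t(X) # Student-t class centre and scatter

print("sample mean      :", mean_gauss)
print("Student-t centre :", mu_t)
```

In this toy setting the sample mean is pulled toward the mislabelled cluster, while the Student-t centre stays closer to the origin because distant samples are down-weighted. A generative classifier built on such class-wise estimates would inherit that behaviour, though how TGC constructs its decision boundaries is specified in the paper, not here.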
About the journal:
Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.