Ensembles of deep one-class classifiers for multi-class image classification

Machine learning with applications (IF 4.9), Pub Date: 2025-01-22, DOI: 10.1016/j.mlwa.2025.100621

Alexander Novotny, George Bebis, Alireza Tavakkoli, Mircea Nicolescu

Machine learning with applications, volume 19, Article 100621. https://www.sciencedirect.com/science/article/pii/S2666827025000040

Citations: 0

Abstract

Traditional methods for multi-class classification (MCC) use a monolithic feature extractor and classifier trained on data from all classes simultaneously. These methods depend on the number and types of classes and are therefore inflexible to changes in the class structure. For instance, if the number of classes needs to be modified or new training data becomes available, retraining is required for optimum classification performance. Moreover, these classifiers can become biased toward over-represented classes when the data are heavily imbalanced. An alternative, more attractive framework is an ensemble of one-class classifiers (EOCC), where each one-class classifier (OCC) is trained with data from a single class only, without using any information from the other classes. Although this framework has not yet systematically matched or surpassed the performance of traditional MCC approaches, it deserves further investigation for several reasons. First, it handles changes in class structure more flexibly than the traditional MCC approach. Second, it is less affected by large data imbalances between classes. Finally, each OCC can be optimized separately according to the characteristics of the class it represents. In this paper, we perform extensive experiments to evaluate EOCC for MCC using traditional OCCs based on Principal Component Analysis (PCA) and auto-encoders (AE), as well as newly proposed OCCs based on Generative Adversarial Networks (GANs). We also compare the performance of EOCC with traditional multi-class deep learning (DL) classifiers, including VGG-19, ResNet, and EfficientNet. Two datasets were used in our experiments: (i) a subset of the Plant Village plant disease dataset, with high variance in the number of classes and the amount of data in each class, and (ii) an Alzheimer’s disease dataset with little data and a large imbalance between classes. Our results show that the GAN-based EOCC outperforms previous EOCC approaches and narrows the performance gap with traditional MCC approaches.
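
To make the EOCC idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a PCA-based OCC that scores a sample by its reconstruction error, and an ensemble that assigns the label of the OCC with the lowest error; the abstract does not state the decision rule, so this rule, the class names `PCAOneClass` and `OneClassEnsemble`, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


class PCAOneClass:
    """Hypothetical one-class model: a PCA subspace fitted on one class only."""

    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        # Rows of vt are the principal directions of the centered class data.
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_components]
        return self

    def score(self, X):
        # Reconstruction error per sample; lower means "more like this class".
        centered = X - self.mean_
        projected = centered @ self.components_.T @ self.components_
        return np.linalg.norm(centered - projected, axis=1)


class OneClassEnsemble:
    """Hypothetical ensemble: one independent one-class model per label."""

    def __init__(self, n_components=1):
        self.n_components = n_components
        self.models = {}

    def add_class(self, label, X):
        # Only this class's samples are used; existing models are untouched.
        self.models[label] = PCAOneClass(self.n_components).fit(X)

    def predict(self, X):
        # Assumed decision rule: pick the class with the smallest error.
        labels = list(self.models)
        errors = np.stack([self.models[l].score(X) for l in labels], axis=1)
        return [labels[i] for i in errors.argmin(axis=1)]


# Synthetic data (an assumption for illustration): each class lies near a line.
rng = np.random.default_rng(0)
t = rng.normal(size=(50, 1))
noise = lambda: 0.01 * rng.normal(size=(50, 3))
class_a = t * np.array([1.0, 0.0, 0.0]) + noise()
class_b = np.array([5.0, 5.0, 0.0]) + t * np.array([0.0, 1.0, 0.0]) + noise()
class_c = np.array([-5.0, -5.0, 0.0]) + t * np.array([0.0, 0.0, 1.0]) + noise()

ensemble = OneClassEnsemble(n_components=1)
ensemble.add_class("a", class_a)
ensemble.add_class("b", class_b)
print(ensemble.predict(np.array([[2.0, 0.0, 0.0], [5.0, 7.0, 0.0]])))

# Adding a class later fits one new model and leaves the others as they are.
ensemble.add_class("c", class_c)
print(ensemble.predict(np.array([[-5.0, -5.0, 3.0]])))
```

The sketch illustrates the flexibility argument from the abstract: adding class `c` requires fitting only one new model, whereas a monolithic multi-class classifier would need retraining. Raw reconstruction errors from separately trained models are not necessarily comparable in practice, so a real system would likely need some form of score calibration.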

Source journal: Machine learning with applications (Management Science and Operations Research, Artificial Intelligence, Computer Science Applications)

Self-citation rate: 0.00%

Review time: 98 days