Multiclass Imbalance Problems: Analysis and Potential Solutions.

Shuo Wang, Xin Yao
IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics, pp. 1119-30. Published 2012-08-01 (Epub 2012-03-16). Journal Article.
DOI: 10.1109/TSMCB.2012.2187280 (https://doi.org/10.1109/TSMCB.2012.2187280)
Citations: 440

Abstract

Class imbalance problems have drawn growing interest recently because imbalanced class distributions make classification difficult. In particular, many ensemble methods have been proposed to deal with such imbalance. However, most efforts so far have focused only on two-class imbalance problems. Unsolved issues remain in multiclass imbalance problems, which arise in real-world applications. This paper studies the challenges posed by multiclass imbalance problems and investigates the generalization ability of some ensemble solutions, including our recently proposed algorithm AdaBoost.NC, with the aim of handling multiclass and imbalance effectively and directly. We first study the impact of multiminority and multimajority on the performance of two basic resampling techniques. Both have strong negative effects, and multimajority tends to be more harmful to the generalization performance. Motivated by the results, we then apply AdaBoost.NC to several real-world multiclass imbalance tasks and compare it to other popular ensemble methods. AdaBoost.NC is shown to be better at recognizing minority class examples and balancing the performance among classes in terms of G-mean without using any class decomposition.
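The abstract reports results in terms of G-mean without defining it. A minimal sketch follows, assuming the common multiclass definition (the geometric mean of the per-class recalls); the function names per_class_recall and g_mean are hypothetical and the paper's exact evaluation code is not shown here. It illustrates why G-mean is sensitive to poorly recognized minority classes where plain accuracy is not.

```python
import math
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class present in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed multiclass G-mean)."""
    recalls = per_class_recall(y_true, y_pred)
    return math.prod(recalls.values()) ** (1.0 / len(recalls))


# Illustrative labels only (not data from the paper):
# one majority class (0) and two minority classes (1 and 2).
y_true = [0] * 8 + [1] + [2]
y_majority_only = [0] * 10  # a classifier that always predicts class 0

accuracy = sum(t == p for t, p in zip(y_true, y_majority_only)) / len(y_true)
print(accuracy)                         # high accuracy despite ignoring minorities
print(g_mean(y_true, y_majority_only))  # any class with zero recall drives G-mean to zero
```

Because the recalls are multiplied, a single class that is never recognized makes the score zero, so under this definition a high G-mean requires reasonable recall on every class at once.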
