A Review on Solution to Class Imbalance Problem: Undersampling Approaches
D. Devi, S. Biswas, B. Purkayastha
2020 International Conference on Computational Performance Evaluation (ComPE), pp. 626-631
Published: 2020-07-01
DOI: 10.1109/ComPE49325.2020.9200087 (https://doi.org/10.1109/ComPE49325.2020.9200087)
Citations: 19
Abstract
Classification plays a significant role in data mining, and numerous classification models have been proposed over the years to carry it out. However, standard classification models are sensitive to the underlying characteristics of the dataset. When applied to a dataset with a skewed class distribution, a standard classifier tends to misclassify the rare instances because it becomes biased towards the majority patterns. This is where class imbalance makes its mark, significantly degrading the performance of standard classifiers. Among the reported solutions to the class imbalance problem, undersampling approaches are quite prevalent; they balance the class distribution by discarding insignificant majority instances. This paper presents an overview of the class imbalance problem: its impact on classification models, the reported solutions, and the effectiveness of undersampling approaches in addressing it.
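
As a rough illustration of balancing a class distribution by discarding majority instances, the Python sketch below implements random undersampling, the simplest variant of the idea. It is not taken from the paper: the function name `random_undersample`, the toy data, and the choice to drop instances at random (rather than by judging which majority instances are insignificant, as the surveyed approaches aim to do) are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Hypothetical helper: randomly drop instances so that every class
    ends up with as many samples as the rarest class."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)

    # Size of the rarest (minority) class.
    n_min = min(len(indices) for indices in by_class.values())

    # Keep n_min randomly chosen instances from each class.
    # The minority class is kept in full; larger classes are cut down.
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, n_min))
    kept.sort()  # preserve the original ordering of the surviving samples

    return [X[i] for i in kept], [y[i] for i in kept]


# Toy data (assumed): 8 majority instances (label 0), 2 minority (label 1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_undersample(X, y)
print(Counter(y))      # class counts before undersampling
print(Counter(y_bal))  # class counts after undersampling
```

After the call, both classes contain two instances each, and every minority instance is retained. Random selection can discard informative majority instances, which is the limitation that more selective undersampling approaches try to address.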