Gede A. Pradipta, Retantyo Wardoyo, Aina Musdholifah, I. Sanjaya, Muhammad Ismail
2021 Sixth International Conference on Informatics and Computing (ICIC). Published 2021-11-03. DOI: 10.1109/ICIC54025.2021.9632912 (https://doi.org/10.1109/ICIC54025.2021.9632912)
SMOTE for Handling Imbalanced Data Problem: A Review
An imbalanced class distribution occurs when the number of examples representing one class is much lower than the number representing the others. This condition degrades prediction accuracy on the minority class. To overcome this problem, the Synthetic Minority Oversampling Technique (SMOTE) was introduced as a pioneering oversampling method for imbalanced classification and is widely used in the research community. The basic idea of SMOTE is to oversample the minority class by creating synthetic instances in the region of feature space formed by a minority instance and its K nearest neighbors. Because the new instances are interpolated rather than duplicated, this approach helps avoid overfitting and assists the classifier in finding decision boundaries between classes. In this paper, we review current issues and problems in classification with imbalanced data, discuss performance evaluation on imbalanced data, survey extensions of SMOTE proposed in recent years, and finally identify current challenges and future work in learning with imbalanced data.
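The interpolation idea described in the abstract can be illustrated with a short sketch. This is a simplified, standard-library-only variant written for illustration, not the authors' implementation or any library's API: the function name `smote`, its parameters (`n_synthetic`, `k`, `seed`), the use of Euclidean distance, and the choice to pick base samples at random are all assumptions made here.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbors (Euclidean distance).

    Hypothetical helper for illustration only; base samples are drawn at
    random, which is a simplification assumed for this sketch.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    # A sample cannot have more neighbors than there are other samples.
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Indices of all other minority samples, sorted by distance to x.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        # Pick one of the k nearest neighbors at random.
        neighbor = minority[rng.choice(others[:k])]
        # Place the new point on the segment between x and that neighbor.
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbor)])
    return synthetic


# Toy minority class with two features (made-up values).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_synthetic=6, k=2)
```

Each synthetic point lies on a line segment between two existing minority samples, so it stays inside the region the minority class already occupies instead of duplicating a sample exactly. In practice the synthetic points would be appended to the training set with the minority label before fitting a classifier.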