Yoga Pristyanto, A. F. Nugraha, Rifda Faticha Alfa Aziza, Ibnu Hadi Purwanto, Mulia Sulistiyono, Akhmad Dahlan
{"title":"集成模型作为数据集不平衡类分类解决方案的比较","authors":"Yoga Pristyanto, A. F. Nugraha, Rifda Faticha Alfa Aziza, Ibnu Hadi Purwanto, Mulia Sulistiyono, Akhmad Dahlan","doi":"10.1109/IMCOM56909.2023.10035615","DOIUrl":null,"url":null,"abstract":"A phenomenon known as “class imbalance” occurs when an excessive number of classes are evaluated in relation to other classes. This circumstance is quite common in the challenges that classification modeling is used to in the actual world. Because of the influence of class imbalance on the dataset, the classification model's performance is not at its highest possible level. In addition, the presence of these factors might make the possibility of incorrect categorization greater. Utilizing an ensemble model is one approach that may be used to resolve this issue. The originality of the dataset is preserved, which is one of the many benefits of this method. In this work, three different types of ensemble models-XGBoost, Stacking, and Bagging-were examined and contrasted. All three were put through their paces using five distinct unbalanced multiclass datasets, each with a different value for the imbalanced ratio. The results of the three experiments that used five different assessment indicators reveal that the XGBoost model performs much better than the Bagging and Stacking models when it comes to overall performance. The XGBoost model performs exceptionally well in all of the indicators that were evaluated, including Balanced Accuracy, True Positive Rate, True Negative Rate, Geometric Mean, and Multiclass Area Under Curve. 
These findings provide more evidence that XGBoost is a viable option for addressing multiclass unbalanced issues in datasets.","PeriodicalId":230213,"journal":{"name":"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)","volume":"204 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Comparison of Ensemble Models as Solutions for Imbalanced Class Classification of Datasets\",\"authors\":\"Yoga Pristyanto, A. F. Nugraha, Rifda Faticha Alfa Aziza, Ibnu Hadi Purwanto, Mulia Sulistiyono, Akhmad Dahlan\",\"doi\":\"10.1109/IMCOM56909.2023.10035615\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A phenomenon known as “class imbalance” occurs when an excessive number of classes are evaluated in relation to other classes. This circumstance is quite common in the challenges that classification modeling is used to in the actual world. Because of the influence of class imbalance on the dataset, the classification model's performance is not at its highest possible level. In addition, the presence of these factors might make the possibility of incorrect categorization greater. Utilizing an ensemble model is one approach that may be used to resolve this issue. The originality of the dataset is preserved, which is one of the many benefits of this method. In this work, three different types of ensemble models-XGBoost, Stacking, and Bagging-were examined and contrasted. All three were put through their paces using five distinct unbalanced multiclass datasets, each with a different value for the imbalanced ratio. The results of the three experiments that used five different assessment indicators reveal that the XGBoost model performs much better than the Bagging and Stacking models when it comes to overall performance. 
The XGBoost model performs exceptionally well in all of the indicators that were evaluated, including Balanced Accuracy, True Positive Rate, True Negative Rate, Geometric Mean, and Multiclass Area Under Curve. These findings provide more evidence that XGBoost is a viable option for addressing multiclass unbalanced issues in datasets.\",\"PeriodicalId\":230213,\"journal\":{\"name\":\"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)\",\"volume\":\"204 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IMCOM56909.2023.10035615\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 17th International Conference on Ubiquitous Information Management and Communication (IMCOM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IMCOM56909.2023.10035615","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Comparison of Ensemble Models as Solutions for Imbalanced Class Classification of Datasets
A phenomenon known as “class imbalance” occurs when some classes in a dataset contain far more samples than others. This situation is common in the real-world problems to which classification modeling is applied. Class imbalance in the dataset prevents a classification model from reaching its best possible performance and increases the likelihood of misclassification. Using an ensemble model is one approach to this problem; among its benefits is that the original dataset is left unchanged. In this work, three types of ensemble models — XGBoost, Stacking, and Bagging — were examined and compared. All three were evaluated on five distinct imbalanced multiclass datasets, each with a different imbalance ratio. The results of the experiments, which used five assessment indicators, show that XGBoost clearly outperforms the Bagging and Stacking models in overall performance. XGBoost performs well on all of the indicators evaluated: Balanced Accuracy, True Positive Rate, True Negative Rate, Geometric Mean, and Multiclass Area Under the Curve. These findings provide further evidence that XGBoost is a viable option for addressing multiclass imbalance in datasets.
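The comparison described above can be sketched with scikit-learn. This is a minimal illustration, not the authors' experimental setup: the five datasets, their imbalance ratios, and the model hyperparameters are not specified in the abstract, so a synthetic imbalanced three-class dataset is generated here, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost (which is a separate library). Balanced Accuracy and the Geometric Mean of per-class recalls are two of the five indicators the paper reports.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced multiclass dataset (80% / 15% / 5% class weights).
X, y = make_classification(n_samples=1000, n_classes=3, n_informative=6,
                           weights=[0.80, 0.15, 0.05], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

models = {
    "Bagging": BaggingClassifier(random_state=42),  # default base: decision tree
    "Stacking": StackingClassifier(
        estimators=[("dt", DecisionTreeClassifier(random_state=42)),
                    ("lr", LogisticRegression(max_iter=1000))],
        final_estimator=LogisticRegression(max_iter=1000)),
    # Stand-in for XGBoost; both are gradient-boosted tree ensembles.
    "Boosting": GradientBoostingClassifier(random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    # Balanced Accuracy: mean of per-class recalls, robust to imbalance.
    bal_acc = balanced_accuracy_score(y_te, pred)
    # Geometric Mean: nth root of the product of per-class recalls.
    per_class_recall = recall_score(y_te, pred, average=None)
    g_mean = float(np.prod(per_class_recall) ** (1 / len(per_class_recall)))
    results[name] = {"balanced_accuracy": bal_acc, "g_mean": g_mean}
    print(f"{name}: balanced accuracy={bal_acc:.3f}, G-mean={g_mean:.3f}")
```

Because the G-mean is the product of per-class recalls, a model that ignores the rare class entirely scores zero on it, which is why such metrics are preferred over plain accuracy for imbalanced data.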