Title: Classification Performance Analysis of Decision Tree-Based Algorithms with Noisy Class Variable
Author: Abdulmajeed Atiah Alharbi
Journal: Discrete Dynamics in Nature and Society (Impact Factor 1.3; JCR Q3, Mathematics, Interdisciplinary Applications)
Published: 2024-02-01
DOI: https://doi.org/10.1155/2024/6671395
Citations: 0
Abstract
Class noise is a common issue that degrades the performance of classification techniques on real-world data sets. It arises when the class variable of a data set contains incorrect class labels. With noisy data, the robustness of a classification technique against noise can matter more than its performance on noise-free data sets. The decision tree method is one of the most popular techniques for classification tasks, and C4.5, CART, and random forest (RF) are considered three of the most widely used decision tree-based algorithms. The aim of this paper is to determine which of these algorithms is the better choice for building decision trees in terms of performance and robustness against class noise. To this end, we study and compare the performance of the resulting models when they are applied to data with a noisy class variable. The results indicate that the RF algorithm is more robust to data sets with a noisy class variable than the other algorithms.
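To make the notion of class noise concrete, the following is a minimal sketch of one common way to simulate it: replacing a chosen fraction of class labels with a different class picked uniformly at random. The function name `inject_class_noise`, the `noise_rate` and `seed` parameters, and the uniform flipping scheme are illustrative assumptions; the abstract does not state how noise was introduced in the paper's experiments.

```python
import random

def inject_class_noise(labels, noise_rate, seed=0):
    """Return a copy of `labels` in which a fraction `noise_rate` of the
    entries is replaced by a different class chosen uniformly at random.
    Hypothetical helper: the uniform flipping scheme is an assumption."""
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    classes = sorted(set(labels))
    if len(classes) < 2:
        raise ValueError("need at least two classes to flip labels")
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    n_flip = round(noise_rate * len(labels))
    flip_idx = rng.sample(range(len(labels)), n_flip)  # distinct positions
    noisy = list(labels)
    for i in flip_idx:
        # Always pick a class other than the true one, so every selected
        # position really ends up mislabeled.
        alternatives = [c for c in classes if c != labels[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy

clean = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
noisy = inject_class_noise(clean, noise_rate=0.2)
changed = sum(1 for x, y in zip(clean, noisy) if x != y)
print(changed)  # 20
```

A robustness comparison of the kind the abstract describes would then train each algorithm on the corrupted labels at several noise rates and evaluate it against clean labels, so that the drop in accuracy reflects sensitivity to the noise rather than errors in the test set.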
About the journal:
The main objective of Discrete Dynamics in Nature and Society is to foster links between basic and applied research relating to the discrete dynamics of complex systems encountered in the natural and social sciences. The journal intends to stimulate publications directed to the analysis of computer-generated solutions, chaotic ones in particular, the correctness of numerical procedures, chaos synchronization and control, and discrete optimization methods, among other related topics. The journal provides a channel of communication between scientists and practitioners working in the field of complex systems analysis and aims to stimulate the development and use of the discrete dynamical approach.