COMPARING ACCURACY OF LOGISTIC REGRESSION, K-NEAREST NEIGHBOR, SUPPORT VECTOR MACHINE, AND NAÏVE BAYES MODELS USING TRACKING ENSEMBLE MACHINE LEARNING
Kuntoro Kuntoro
JP Journal of Biostatistics. Journal Article. Published: 2023-10-26. DOI: 10.17654/0973514324001 (https://doi.org/10.17654/0973514324001)
Citation count: 0
Abstract
Selecting a model that classifies the target correctly is important. Logistic regression (LR), K-nearest neighbor (KNN), support vector machine (SVM), and Naïve Bayes (NB) are base models for classifying a target. Tracking ensemble is the method used for comparing accuracy in machine learning. Datasets are generated with Python code as recommended by Brownlee [1]. Five sample sizes are selected: 1,000, 3,000, 5,000, 7,000, and 10,000. Each dataset has 20 features, of which 15 are informative and 5 are redundant. The results show that SVM has the highest mean accuracy and the lowest coefficient of variation of accuracy at every sample size, while NB has the lowest mean accuracy and the highest coefficient of variation of accuracy at every sample size. SVM is therefore recommended for classifying the target.

Received: August 13, 2023. Accepted: October 9, 2023.
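The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the author's code: it assumes scikit-learn's `make_classification` for data generation, Gaussian Naïve Bayes as the NB variant, default hyperparameters, and repeated stratified 10-fold cross-validation with 3 repeats. None of these choices, nor the random seeds, are stated in the abstract. The sketch evaluates the four base models separately and does not build the ensemble itself. The helper names (`make_dataset`, `make_models`, `accuracy_summary`, `compare`) are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Sample sizes taken from the abstract.
SAMPLE_SIZES = [1000, 3000, 5000, 7000, 10000]


def make_dataset(n_samples, seed=1):
    """Synthetic binary dataset: 20 features, 15 informative, 5 redundant.

    The feature counts come from the abstract; the seed is an assumption.
    """
    return make_classification(
        n_samples=n_samples,
        n_features=20,
        n_informative=15,
        n_redundant=5,
        random_state=seed,
    )


def make_models():
    """The four base models, with assumed (mostly default) settings."""
    return {
        "LR": LogisticRegression(max_iter=1000),
        "KNN": KNeighborsClassifier(),
        "SVM": SVC(),
        "NB": GaussianNB(),  # assumption: Gaussian variant of Naive Bayes
    }


def accuracy_summary(model, X, y, n_splits=10, n_repeats=3, seed=1):
    """Return (mean accuracy, coefficient of variation in percent).

    The cross-validation scheme is an assumption, not taken from the abstract.
    """
    cv = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=seed
    )
    scores = cross_val_score(model, X, y, scoring="accuracy", cv=cv)
    mean = scores.mean()
    # Coefficient of variation = sample standard deviation / mean.
    cv_percent = 100.0 * scores.std(ddof=1) / mean
    return mean, cv_percent


def compare(sample_sizes=SAMPLE_SIZES):
    """Evaluate every base model on a fresh dataset of each sample size."""
    results = {}
    for n in sample_sizes:
        X, y = make_dataset(n)
        for name, model in make_models().items():
            results[(n, name)] = accuracy_summary(model, X, y)
    return results


if __name__ == "__main__":
    # Only the smallest sample size is run here to keep the demo quick.
    for (n, name), (mean, cv_percent) in compare([1000]).items():
        print(f"n={n:>6}  {name:<4} mean accuracy={mean:.3f}  CV={cv_percent:.2f}%")
```

Calling `compare()` with no arguments runs all five sample sizes. A lower coefficient of variation means the model's accuracy is more stable across folds, which is the second criterion the abstract uses to rank the models.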