Author: Kuntoro Kuntoro
Journal: JP Journal of Biostatistics
Published: 2023-10-26
DOI: https://doi.org/10.17654/0973514324001
COMPARING ACCURACY OF LOGISTIC REGRESSION, K-NEAREST NEIGHBOR, SUPPORT VECTOR MACHINE, AND NAÏVE BAYES MODELS USING TRACKING ENSEMBLE MACHINE LEARNING
Selecting a model that classifies the target correctly is important. Logistic regression (LR), K-nearest neighbor (KNN), support vector machine (SVM), and Naïve Bayes (NB) are base models for classifying a target. Tracking ensemble is the method used to compare their accuracy. Datasets are generated with Python code as recommended by Brownlee [1]. Five sample sizes are selected: 1,000, 3,000, 5,000, 7,000, and 10,000. Each dataset has 20 features, of which 15 are informative and 5 are redundant. The results show that SVM has the highest mean accuracy and the lowest coefficient of variation of accuracy at every sample size, while NB has the lowest mean accuracy and the highest coefficient of variation of accuracy at every sample size. SVM is therefore recommended for classifying the target.
Received: August 13, 2023. Accepted: October 9, 2023
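The comparison described in the abstract could be set up roughly as follows. This is a minimal sketch, not the author's code: it assumes scikit-learn, assumes the datasets come from `make_classification` (a common choice in Brownlee's tutorials, but not confirmed by the abstract), and uses library-default hyperparameters. The abstract does not spell out the tracking ensemble procedure, so stratified k-fold cross-validation is substituted here as the source of repeated accuracy estimates. The names `make_dataset`, `base_models`, and `compare_models`, the seed, and the fold count are all hypothetical. The coefficient of variation is taken as the sample standard deviation of the accuracies divided by their mean, which is also an assumption.

```python
from statistics import mean, stdev

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Sample sizes listed in the abstract.
SAMPLE_SIZES = (1_000, 3_000, 5_000, 7_000, 10_000)


def make_dataset(n_samples, seed=1):
    """Synthetic data with 20 features: 15 informative, 5 redundant."""
    return make_classification(
        n_samples=n_samples,
        n_features=20,
        n_informative=15,
        n_redundant=5,
        random_state=seed,  # assumption: the source does not state a seed
    )


def base_models():
    """The four base models with default settings (an assumption)."""
    return {
        "LR": LogisticRegression(max_iter=1000),  # raised to help convergence
        "KNN": KNeighborsClassifier(),
        "SVM": SVC(),
        "NB": GaussianNB(),
    }


def compare_models(n_samples, n_splits=10, seed=1):
    """Return {model name: (mean accuracy, coefficient of variation)}."""
    X, y = make_dataset(n_samples, seed)
    # Stand-in for the paper's tracking procedure: stratified k-fold CV.
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    summary = {}
    for name, model in base_models().items():
        scores = cross_val_score(model, X, y, scoring="accuracy", cv=folds)
        accuracies = scores.tolist()
        avg = mean(accuracies)
        # Coefficient of variation: sample standard deviation over the mean.
        summary[name] = (avg, stdev(accuracies) / avg)
    return summary
```

Calling `compare_models(n)` for each `n` in `SAMPLE_SIZES` gives one mean accuracy and one coefficient of variation per model and sample size, which is the shape of the comparison the abstract reports. The numbers produced by this sketch depend on the assumed seed, fold count, and defaults, so they should not be expected to match the paper's.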