K-Nearest Neighbor Algorithm for Predicting Student Achievement Based on Educational and Economic Background (original title: Algoritma K-Nearest Neighbor untuk Memprediksi Prestasi Mahasiswa Berdasarkan Latar Belakang Pendidikan dan Ekonomi)
Authors: Daru Prasetyawan, Rahmadhan Gatra
Journal: JISKA Jurnal Informatika Sunan Kalijaga
Publication date: 2022-01-25
DOI: 10.14421/jiska.2022.7.1.56-67
URL: https://doi.org/10.14421/jiska.2022.7.1.56-67
Citation count: 2
Abstract
Student academic performance is one measure of success in higher education, and predicting it can support decision-making. The K-Nearest Neighbor (K-NN) algorithm is one method that can be used for this prediction. Normalization rescales the attribute values so that the data fall within a smaller range than the raw data. Feature selection eliminates irrelevant features, and outlier removal deletes records that could distort the classification process. For classification, the dataset is divided into a training set (80%) and a validation set (20%) using cross-validation. The resulting model is tested on data kept separate from the training data and evaluated with a confusion matrix. In this evaluation, the K-NN model achieves an average accuracy of 95.85%, an average precision of 95.97%, and an average recall of 95.84%.
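The steps named in the abstract (normalization, an 80/20 split, K-NN classification, and confusion-matrix evaluation) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the two features (an entrance score and a family income), the class labels `low` and `high`, the synthetic data, `k=3`, Euclidean distance, and min-max scaling are all assumptions made for the example. It uses a single stratified hold-out split rather than cross-validation and omits feature selection and outlier removal, so its output says nothing about the figures reported in the abstract.

```python
import math
import random
from collections import Counter

def fit_min_max(rows):
    """Learn the per-feature minimum and maximum from the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, lows, highs):
    """Rescale each feature using the training minimum and maximum."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points nearest to query (Euclidean)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def confusion_matrix(y_true, y_pred, labels):
    """matrix[i][j] counts samples of true class labels[i] predicted as labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def evaluate(matrix):
    """Return accuracy plus macro-averaged precision and recall."""
    n = len(matrix)
    total = sum(map(sum, matrix))
    accuracy = sum(matrix[i][i] for i in range(n)) / total
    precisions, recalls = [], []
    for i in range(n):
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        precisions.append(matrix[i][i] / predicted if predicted else 0.0)
        recalls.append(matrix[i][i] / actual if actual else 0.0)
    return accuracy, sum(precisions) / n, sum(recalls) / n

def make_class(rng, label, score_center, income_center, n=50):
    """Generate hypothetical students: [entrance score, family income in millions]."""
    rows = [[score_center + rng.uniform(-5, 5),
             income_center + rng.uniform(-0.5, 0.5)] for _ in range(n)]
    return rows, [label] * n

rng = random.Random(0)
low_x, low_y = make_class(rng, "low", 50.0, 2.0)
high_x, high_y = make_class(rng, "high", 80.0, 6.0)

# Stratified hold-out split: 80% of each class for training, 20% for testing.
cut = 40
train_x, train_y = low_x[:cut] + high_x[:cut], low_y[:cut] + high_y[:cut]
test_x, test_y = low_x[cut:] + high_x[cut:], low_y[cut:] + high_y[cut:]

# Fit the scaler on the training data only, then apply it to both sets.
lows, highs = fit_min_max(train_x)
train_n = apply_min_max(train_x, lows, highs)
test_n = apply_min_max(test_x, lows, highs)

predictions = [knn_predict(train_n, train_y, q, k=3) for q in test_n]
matrix = confusion_matrix(test_y, predictions, ["low", "high"])
accuracy, precision, recall = evaluate(matrix)
print(matrix)
print(accuracy, precision, recall)
```

Fitting the scaler on the training rows alone keeps information from the held-out rows out of the model, which is one common way to keep the test data "separate from the training data". The two synthetic classes are deliberately far apart, so the perfect scores this toy run produces reflect the data design, not the behavior of K-NN on real student records.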