Comparative analysis of k-nearest neighbor and modified k-nearest neighbor algorithm for data classification

Okfalisa, Ikbal Gazalba, Mustakim, Nurul Gayatri Indah Reza
Venue: 2017 2nd International conferences on Information Technology, Information Systems and Electrical Engineering (ICITISEE)
Publication date: 2017-11-01
DOI: 10.1109/ICITISEE.2017.8285514 (https://doi.org/10.1109/ICITISEE.2017.8285514)
Citations: 132

Abstract

Data mining is the process of extracting information from a database that is not directly visible. It is predicted to become a highly revolutionary branch of science over the next decade. One data mining technique is classification, and the most popular classification technique is K-Nearest Neighbor (KNN). The Modified K-Nearest Neighbor (MKNN) classification algorithm is derived from KNN. In this paper we compare the KNN and MKNN algorithms for classifying the data of the Conditional Cash Transfer Implementation Unit (Unit Pelaksana Program Keluarga Harapan), which consists of 7395 records. The comparative analysis is based on the accuracy of both algorithms. Before classification, K-Fold Cross Validation was performed to search for the optimal data modeling, which was obtained on cross 2 with an accuracy of 93.945%. The resulting K-Fold Cross Validation model supplies the training and testing data samples used to test KNN and MKNN for classification. Classification accuracy was computed according to the rules of the confusion matrix. The highest accuracy of KNN was 94.95%, with an average accuracy of 93.94% over the tests; the highest accuracy of MKNN was 99.51%, with an average accuracy of 99.20%. In almost all tests, from the first to the tenth, MKNN was superior and achieved better accuracy than KNN. It can be concluded that the MKNN algorithm classifies more accurately than the KNN algorithm, setting aside other aspects such as computation, time efficiency, and algorithm effectiveness.
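The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the MKNN variant used, so the code below assumes the commonly described formulation: each training point gets a validity score (the share of its h nearest training neighbours with the same label), and each of the k neighbours of a query then votes with weight validity / (distance + 0.5). The function names, the parameters h and alpha, the interleaved fold assignment, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(train_X, query, k, skip=None):
    """Return (distance, index) pairs of the k training points closest to query."""
    dists = [(euclidean(x, query), i) for i, x in enumerate(train_X) if i != skip]
    return sorted(dists)[:k]


def knn_predict(train_X, train_y, query, k):
    """Plain KNN: majority vote among the k nearest training labels."""
    votes = Counter(train_y[i] for _, i in nearest(train_X, query, k))
    return votes.most_common(1)[0][0]


def validity(train_X, train_y, h):
    """Assumed MKNN step 1: share of each training point's h nearest
    training neighbours (itself excluded) that carry the same label."""
    return [
        sum(train_y[j] == train_y[i] for _, j in nearest(train_X, x, h, skip=i)) / h
        for i, x in enumerate(train_X)
    ]


def mknn_predict(train_X, train_y, valid, query, k, alpha=0.5):
    """Assumed MKNN step 2: each of the k neighbours votes with weight
    validity / (distance + alpha)."""
    scores = defaultdict(float)
    for d, i in nearest(train_X, query, k):
        scores[train_y[i]] += valid[i] / (d + alpha)
    return max(scores, key=scores.get)


def accuracy(y_true, y_pred):
    """Accuracy = sum of the confusion-matrix diagonal / number of predictions."""
    confusion = Counter(zip(y_true, y_pred))
    correct = sum(n for (t, p), n in confusion.items() if t == p)
    return correct / len(y_true)


def k_fold_splits(n, folds):
    """Yield (train_idx, test_idx) per fold; fold f tests every index i with i % folds == f."""
    for f in range(folds):
        test = [i for i in range(n) if i % folds == f]
        train = [i for i in range(n) if i % folds != f]
        yield train, test


# Toy data (hypothetical, not the 7395-record dataset from the paper).
train_X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (0.5, 0.5)]
train_y = ["a", "a", "a", "b", "b", "b", "b"]  # last point is a mislabeled outlier
test_X = [(0.4, 0.4), (5.5, 5.5)]
test_y = ["a", "b"]

valid = validity(train_X, train_y, h=3)
knn_pred = [knn_predict(train_X, train_y, q, k=1) for q in test_X]
mknn_pred = [mknn_predict(train_X, train_y, valid, q, k=3) for q in test_X]

print("KNN accuracy: ", accuracy(test_y, knn_pred))
print("MKNN accuracy:", accuracy(test_y, mknn_pred))
```

On this toy set the mislabeled point sits among points of the other class, so its validity is zero and it contributes nothing to the MKNN vote, whereas 1-nearest-neighbour KNN follows it directly. This only illustrates the mechanism; it says nothing about the accuracy figures reported in the paper. In a fuller experiment, k_fold_splits would supply the train and test indices for each fold, and the accuracy of both classifiers would be recorded per fold.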