Data Mining Analysis for Predicting Underprivileged Communities Using the K-Nearest Neighbor Method

Jurnal Informatika dan Teknik Elektro Terapan, Pub Date: 2024-04-02, DOI: 10.23960/jitet.v12i2.4131

Nurdin Nurdin


Citations: 0

Abstract



ANALISA DATA MINING DALAM MEMPREDIKSI MASYARAKAT KURANG MAMPU MENGGUNAKAN METODE K-NEAREST NEIGHBOR

Poverty is one of the fundamental issues at the center of a government's attention. An important part of any poverty reduction strategy is the availability of accurate, well-targeted poverty data, and the limited availability of appropriate data for targeting the poor is one of the main problems that often hinders the success of government programs. This study aims to design an application that can predict poor households using the K-Nearest Neighbor (K-NN) algorithm with five main indicators: type of work, number of dependents, age, income, and condition of the head of the household. The predictions identify poor families that are suitable recipients of various forms of government assistance. The data used for prediction are sample data from Pegasing District. The K-NN algorithm was analyzed and implemented as a web-based application. K-NN works by finding the training records that lie at the shortest distance from the record being evaluated. Evaluation with a confusion matrix, using 216 training records and 93 test records (a 70:30 split) and the five attributes, produced an accuracy of 86.02%, a recall of 61.90%, a precision of 72.22%, and an F1-score of 66.04%.
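The working principle and the evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not state the distance metric, the value of k, or how the five indicators were encoded, so Euclidean distance, k = 3, the function names, and the scaled toy feature values below are all assumptions made for illustration. The confusion-matrix counts (TP = 13, FP = 5, FN = 8, TN = 67) are likewise not given in the source; they are simply one set of counts over 93 test records that reproduces the reported accuracy, precision, and recall.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows.

    Assumption: Euclidean distance; the source does not name the metric.
    """
    # Distance from the query to every training row, paired with its label.
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_X, train_y)
    ]
    # Keep the k rows with the shortest distance, then take the majority label.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def confusion_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical toy data. Columns stand for the five indicators
# (type of work, dependents, age, income, head-of-household condition),
# each already encoded and scaled to [0, 1]; the encoding is an assumption.
train_X = [
    [0.0, 0.8, 0.6, 0.1, 0.0],
    [0.0, 0.6, 0.7, 0.2, 0.0],
    [1.0, 0.2, 0.4, 0.8, 1.0],
    [1.0, 0.4, 0.5, 0.9, 1.0],
    [0.5, 0.4, 0.3, 0.7, 1.0],
]
train_y = ["poor", "poor", "not_poor", "not_poor", "not_poor"]

prediction = knn_predict(train_X, train_y, [0.0, 0.7, 0.5, 0.15, 0.0], k=3)

# Assumed counts (not from the source) that match the reported percentages.
accuracy, precision, recall, f1 = confusion_metrics(tp=13, fp=5, fn=8, tn=67)
```

With these assumed counts, accuracy is 80/93 (about 86.02%), precision is 13/18 (about 72.22%), and recall is 13/21 (about 61.90%), which agree with the reported values. The harmonic-mean formula applied to the same counts gives an F1 of about 66.67% rather than the reported 66.04%; the abstract does not say how its F1 figure was computed.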
