Determining the Optimum Number of Clusters in Hierarchical Clustering Using Pseudo-F

Steven Jansen Sinaga, Neva Satyahadewi, Hendra Perdana
Journal: Euler : Jurnal Ilmiah Matematika, Sains dan Teknologi
Published: 2023-12-17
DOI: 10.37905/euler.v11i2.23113 (https://doi.org/10.37905/euler.v11i2.23113)

Abstract

Poverty is the condition in which a person cannot meet basic needs as defined by minimum living standards. Statistics Indonesia reported that the poverty rate in North Sumatra Province rose from 8.75% to 9.01% in 2021. This increase, however, is specific to North Sumatra Province, which has the third-largest number of districts/cities in Indonesia. This study maps the regions of North Sumatra Province based on 10 poverty-related variables: life expectancy, health complaints, poverty line, Gross Regional Domestic Product (GRDP), population growth rate, Expected Years of Schooling (EYS), Human Development Index (HDI), labor force participation rate, open unemployment rate, and district/city minimum wage. Hierarchical clustering was applied with the single, complete, and average linkage methods, and the best method was determined from the pseudo-F statistic. The best result was the complete linkage method with four clusters, each with distinct characteristics. Cluster 1 contains the cities with the lowest poverty rates, including Medan City and Pematang Siantar City. Cluster 2 consists of regions with low poverty rates, while Cluster 3 consists of regions with high poverty rates. Cluster 4 contains regions with very high poverty rates, including South Nias District and Pakpak Bharat District. The clusters reveal substantial gaps in poverty rates among the regions of North Sumatra Province.
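The selection procedure described in the abstract (run several linkage methods, then pick the method and number of clusters with the largest pseudo-F) can be illustrated with a minimal sketch. The abstract does not state the formula, so this sketch assumes the common Calinski-Harabasz form, pseudo-F = (SSB / (k - 1)) / (SSW / (n - k)), where SSB and SSW are the between-cluster and within-cluster sums of squares, k is the number of clusters, and n is the number of observations. The data set `TOY_POINTS` and all function names are hypothetical and are not taken from the study; the clustering routine is a naive pure-Python implementation meant only for small examples.

```python
import math
from itertools import combinations

# Toy 2-D data, made up for illustration. This is NOT the study's poverty data.
TOY_POINTS = [(0, 0), (0, 1), (1, 0),
              (10, 10), (10, 11), (11, 10),
              (20, 0), (20, 1), (21, 0)]


def euclid(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def linkage_distance(points, c1, c2, method):
    """Distance between two clusters, each given as a list of point indices."""
    dists = [euclid(points[i], points[j]) for i in c1 for j in c2]
    if method == "single":      # nearest pair of members
        return min(dists)
    if method == "complete":    # farthest pair of members
        return max(dists)
    if method == "average":     # mean over all member pairs
        return sum(dists) / len(dists)
    raise ValueError(f"unknown linkage method: {method}")


def agglomerate(points, k, method):
    """Naive agglomerative clustering: merge the closest pair until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage_distance(points, clusters[p[0]], clusters[p[1]], method),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return clusters


def pseudo_f(points, clusters):
    """Pseudo-F in the assumed Calinski-Harabasz form; needs 2 <= k < n and SSW > 0."""
    n, k, dim = len(points), len(clusters), len(points[0])
    grand = [sum(p[d] for p in points) / n for d in range(dim)]
    ssw = ssb = 0.0
    for members in clusters:
        centroid = [sum(points[i][d] for i in members) / len(members) for d in range(dim)]
        ssw += sum(euclid(points[i], centroid) ** 2 for i in members)
        ssb += len(members) * euclid(centroid, grand) ** 2
    return (ssb / (k - 1)) / (ssw / (n - k))


def compare_linkages(points, ks, methods=("single", "complete", "average")):
    """Return (method, k, pseudo-F) for every combination, largest pseudo-F first."""
    results = [(m, k, pseudo_f(points, agglomerate(points, k, m)))
               for m in methods for k in ks]
    return sorted(results, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    for method, k, f in compare_linkages(TOY_POINTS, ks=range(2, 6)):
        print(f"{method:>8}  k={k}  pseudo-F={f:.2f}")
```

On this toy data the three groups are well separated, so every linkage method peaks at k = 3; it does not reproduce the study's result of four clusters under complete linkage. With real variables measured on different scales, such as the 10 indicators listed above, standardizing each variable before computing distances is a common preprocessing step, though the abstract does not say whether the authors did so.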