Steven Jansen Sinaga, Neva Satyahadewi, Hendra Perdana
Determining the Optimum Number of Clusters in Hierarchical Clustering Using Pseudo-F
Euler : Jurnal Ilmiah Matematika, Sains dan Teknologi. Published 2023-12-17.
DOI: 10.37905/euler.v11i2.23113 (https://doi.org/10.37905/euler.v11i2.23113)
Abstract
Poverty is the condition in which a person cannot meet basic necessities as defined by minimum living standards. Statistics Indonesia recorded an increase in the poverty rate of North Sumatra Province in 2021, from 8.75% to 9.01%. However, this increase is exclusive to North Sumatra Province, which has the third-largest number of districts/cities in Indonesia. This study maps the districts/cities of North Sumatra Province based on 10 poverty factor variables: life expectancy, health complaints, poverty line, Gross Regional Domestic Product (GRDP), population growth rate, Expected Years of Schooling (EYS), Human Development Index (HDI), labor force participation rate, open unemployment rate, and district/city minimum wage. Hierarchical clustering was applied with the single, complete, and average linkage methods, and the best method was selected based on the value of the pseudo-F statistic. The result was 4 clusters formed with the complete linkage method, each with distinct characteristics. Cluster 1 contains the districts/cities with the lowest poverty rates, including Medan City and Pematang Siantar City. Cluster 2 consists of districts/cities with low poverty rates, while Cluster 3 consists of those with high poverty rates. Cluster 4 contains districts/cities with very high poverty rates, including South Nias District and Pakpak Bharat District. The clusters reveal significant gaps in poverty rates among the regions of North Sumatra Province.
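The selection step described in the abstract (compare linkage methods and cluster counts, keep the combination with the largest pseudo-F) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the software, the preprocessing, or the exact form of the statistic. The sketch assumes the common Calinski-Harabasz form, pseudo-F = (SSB / (k - 1)) / (SSW / (n - k)), uses Euclidean distance, and runs on made-up two-dimensional points rather than the study's 10 variables. The names `agglomerate` and `pseudo_f` are hypothetical.

```python
from itertools import combinations
from math import dist


def agglomerate(points, k, linkage="complete"):
    """Naive agglomerative clustering; returns k clusters as lists of point indices."""
    # The linkage rule summarises all pairwise distances between two clusters.
    summarise = {
        "single": min,                            # nearest pair
        "complete": max,                          # farthest pair
        "average": lambda ds: sum(ds) / len(ds),  # mean over all pairs
    }[linkage]
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        # Find the two clusters with the smallest linkage distance (a < b).
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: summarise(
                [dist(points[i], points[j])
                 for i in clusters[ab[0]] for j in clusters[ab[1]]]
            ),
        )
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters


def pseudo_f(points, clusters):
    """Pseudo-F in the assumed Calinski-Harabasz form; needs 2 <= k < n."""
    n, k = len(points), len(clusters)
    dims = range(len(points[0]))
    grand_mean = [sum(p[d] for p in points) / n for d in dims]
    ssw = ssb = 0.0
    for members in clusters:
        centroid = [sum(points[i][d] for i in members) / len(members) for d in dims]
        # Within-cluster and between-cluster sums of squares.
        ssw += sum(dist(points[i], centroid) ** 2 for i in members)
        ssb += len(members) * dist(centroid, grand_mean) ** 2
    return (ssb / (k - 1)) / (ssw / (n - k))


# Made-up toy data: three well-separated groups of three points each.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
    (20.0, 0.0), (20.0, 1.0), (21.0, 0.0),
]

# Score every (linkage, k) combination and keep the largest pseudo-F.
results = {
    (linkage, k): pseudo_f(points, agglomerate(points, k, linkage))
    for linkage in ("single", "complete", "average")
    for k in range(2, 6)
}
best_linkage, best_k = max(results, key=results.get)
print(best_linkage, best_k, round(results[(best_linkage, best_k)], 2))
```

On real indicators measured in different units (percentages, years, currency), the variables would normally be standardized before computing distances. Whether the study did so is not stated in the abstract, so that step is left out here.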