Fuzzy clustering: Determining the number of clusters
H. Řezanková, D. Húsek
2012 Fourth International Conference on Computational Aspects of Social Networks (CASoN), published 2012-11-01
DOI: 10.1109/CASoN.2012.6412415 (https://doi.org/10.1109/CASoN.2012.6412415)
Citations: 8
Abstract
In this study we analyze the behavior of two types of coefficients for determining a suitable number of clusters when fuzzy cluster analysis is applied. The first is Dunn's coefficient, whose computational formula contains the membership degrees; the second is the average silhouette width, used primarily for evaluating hard clustering. There have already been attempts to compare different coefficients for assessing clustering quality or for determining the number of clusters. Unfortunately, coefficients for evaluating hard clustering and coefficients for fuzzy clustering have only been studied separately. We tested the efficiency of the coefficients when clustering both a data set of generated objects with a known number of clusters and real data sets with an unknown number of clusters. The analysis showed the limitations of these two coefficients, especially in cases where the clusters are truly fuzzy.
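The two coefficients named in the abstract can be illustrated with a minimal sketch. It assumes that "Dunn's coefficient" refers to Dunn's partition coefficient, F = (1/n) Σ_i Σ_j u_ij², with the normalized form (kF − 1)/(k − 1), and that the silhouette is computed after assigning each object to the cluster with its highest membership degree. The abstract states neither detail, so both are assumptions. The function names, the Euclidean distance, and the toy data are illustrative only and are not taken from the paper.

```python
import math


def dunn_partition_coefficient(u):
    """Return (F, normalized F) for a membership matrix u.

    u is a list of rows; row i holds the membership degrees of object i
    in each of the k clusters (assumed to sum to 1, with k >= 2).
    """
    n = len(u)
    k = len(u[0])
    f = sum(x * x for row in u for x in row) / n
    # Rescale so that 0 means completely fuzzy and 1 means crisp.
    f_norm = (k * f - 1.0) / (k - 1.0)
    return f, f_norm


def harden(u):
    """Assign each object to the cluster with its largest membership degree."""
    return [max(range(len(row)), key=row.__getitem__) for row in u]


def average_silhouette_width(points, labels):
    """Average silhouette width of a hard partition (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # convention: s(i) = 0 for a singleton cluster
        # a: mean distance to the other members of the object's own cluster
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy data: two well-separated groups with fairly crisp memberships.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
u = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]

f, f_norm = dunn_partition_coefficient(u)
asw = average_silhouette_width(points, harden(u))
print(f"Dunn F = {f:.3f}, normalized = {f_norm:.3f}, silhouette = {asw:.3f}")
```

To choose the number of clusters with such a sketch, one would run the fuzzy clustering for several values of k and compare the coefficients across runs. Note that the silhouette only sees the hardened labels, so it ignores how fuzzy the memberships are, which may be one reason the two coefficients can disagree.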