Experimental evaluation of cluster quality measures
O. Kirkland, B. Iglesia
2013 13th UK Workshop on Computational Intelligence (UKCI), 2013-09-01
DOI: 10.1109/UKCI.2013.6651311 (https://doi.org/10.1109/UKCI.2013.6651311)
Citations: 6
Abstract
Selecting a “good” clustering solution is one of the major difficulties in clustering data, as there are many possible clustering solutions for a given problem, including solutions that contain varying numbers of clusters. Our objective is to select measures of clustering quality that can be applied in a multi-objective optimisation context. Such measures may represent potentially conflicting objectives but should give rise to the “best” clustering solutions, from which the user can select a compromise solution. A wide range of cluster quality measures exists for assessing the quality of a given clustering solution. We begin by summarising some of these. We then propose an experimental evaluation to capture the robustness of different measures under changing conditions. Our experimental setup includes the creation of a number of synthetic clustering solutions, which are then degraded in a systematic manner. We measure how the degradation in each measure's value correlates with the degradation of the solutions as assessed by an external quality measure. We consider as good those measures that show a strong correlation. In this context, measures based upon the concept of connectivity perform well in comparison to others.
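The evaluation loop described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the Gaussian blob data, the degradation scheme (reassigning a fraction of points to another cluster), a nearest-neighbour connectivity penalty as the internal measure, the Rand index as the external measure, and Pearson correlation as the comparison. All function names and parameter values are hypothetical.

```python
import math
import random
from itertools import combinations


def make_blobs(centres, n_per_cluster, spread, rng):
    """Draw Gaussian points around each centre; return points and true labels."""
    points, labels = [], []
    for k, (cx, cy) in enumerate(centres):
        for _ in range(n_per_cluster):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            labels.append(k)
    return points, labels


def degrade(labels, fraction, n_clusters, rng):
    """Assumed degradation: move a fraction of points to a different random cluster."""
    noisy = list(labels)
    n_moved = round(fraction * len(labels))
    for i in rng.sample(range(len(labels)), n_moved):
        noisy[i] = rng.choice([k for k in range(n_clusters) if k != labels[i]])
    return noisy


def connectivity(points, labels, n_neighbours=5):
    """Internal measure (assumed): add 1/j whenever a point's j-th nearest
    neighbour lies in a different cluster. Lower values are better."""
    total = 0.0
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        for rank, j in enumerate(order[:n_neighbours], start=1):
            if labels[j] != labels[i]:
                total += 1.0 / rank
    return total


def rand_index(labels_a, labels_b):
    """External measure (assumed): share of point pairs on which two
    labelings agree about same-cluster versus different-cluster."""
    agree = pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        agree += (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        pairs += 1
    return agree / pairs


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
points, truth = make_blobs([(0, 0), (10, 0), (5, 9)], 30, 1.0, rng)
fractions = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]  # hypothetical degradation levels

internal, external = [], []
for f in fractions:
    noisy = degrade(truth, f, 3, rng)
    internal.append(connectivity(points, noisy))
    external.append(rand_index(truth, noisy))

# Connectivity is a penalty, so a strongly negative correlation with the
# Rand index means the internal measure tracks the degradation well.
correlation = pearson(internal, external)
print(f"correlation between connectivity and Rand index: {correlation:.3f}")
```

Under these assumptions, an internal measure would count as "good" when the magnitude of its correlation with the external measure is high across the degradation levels. A fuller study would presumably repeat this over many data sets, degradation schemes, and candidate measures.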