Achieving Natural Clustering by Validating Results of Iterative Evolutionary Clustering Approach
Tansel Özyer, R. Alhajj
2006 3rd International IEEE Conference Intelligent Systems, 2006-09-01
DOI: 10.1109/IS.2006.348468 (https://doi.org/10.1109/IS.2006.348468)
Clustering is an essential process that classifies a given set of instances according to user-specified criteria, and different factors may lead to different clustering results. Consequently, a large number of clustering algorithms exist to serve different purposes. Two challenges, however, motivate the development of new algorithms: scalability, and the fact that most algorithms require the number of clusters to be specified a priori, which is hard to estimate even for domain experts. This paper presents a novel approach that addresses both issues. We developed an iterative clustering method to handle the scalability problem, and we use a multi-objective genetic algorithm combined with validity indexes to decide on the number of clusters. The basic idea is to partition the dataset first and then cluster each partition separately. Finally, each resulting cluster is treated as a single instance (represented by its centroid), and a conquer process is performed to obtain the final clustering of the complete dataset. Test results on one large real dataset demonstrate the applicability and effectiveness of the proposed approach.
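The partition, cluster, and conquer steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: plain k-means with a deterministic farthest-first initialization stands in for the multi-objective genetic algorithm, and the cluster counts `k_local` and `k_final` are fixed by hand rather than chosen through validity indexes. All function and parameter names here are hypothetical.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from all centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=50):
    """Lloyd's k-means; a stand-in for the paper's genetic algorithm.
    Returns a list of non-empty clusters (each a list of points)."""
    if not points:
        return []
    centers = farthest_first(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        clusters = [c for c in clusters if c]  # drop clusters that became empty
        new_centers = [centroid(c) for c in clusters]
        if new_centers == centers:  # assignments are stable
            break
        centers = new_centers
    return clusters


def partition_cluster_conquer(data, n_parts, k_local, k_final):
    # 1. Partition: split the dataset into n_parts interleaved chunks.
    parts = [data[i::n_parts] for i in range(n_parts)]

    # 2. Cluster each partition separately.
    local_clusters = []
    for part in parts:
        local_clusters.extend(kmeans(part, min(k_local, len(part))))

    # 3. Treat each local cluster as one instance, represented by its centroid.
    members = {}
    for cluster in local_clusters:
        members.setdefault(centroid(cluster), []).extend(cluster)

    # 4. Conquer: cluster the centroids, then map each centroid back
    #    to the original instances it represents.
    representatives = list(members)
    final = []
    for group in kmeans(representatives, min(k_final, len(representatives))):
        final.append([p for rep in group for p in members[rep]])
    return final
```

Calling `partition_cluster_conquer(data, n_parts=2, k_local=3, k_final=2)` on a list of coordinate tuples returns a list of clusters over the original instances. Because the conquer step only sees one centroid per local cluster, its cost depends on the number of local clusters rather than on the size of the full dataset, which is the scalability argument the abstract makes.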