{"title":"Coercion: A Distributed Clustering Algorithm for Categorical Data","authors":"Bin Wang, Yang Zhou, Xinhong Hei","doi":"10.1109/CIS.2013.149","DOIUrl":null,"url":null,"abstract":"Clustering is an important technology in data mining. Squeezer is one such clustering algorithm for categorical data and it is more efficient than most existing algorithms for categorical data. But Squeezer is time consuming for very large datasets which are distributed in different servers. Thus, we employ the distributed thinking to improve Squeezer and a distributed algorithm for categorical data called Coercion is proposed in this paper. In order to present detailed complexity results for Coercion, we also conduct an experimental study with standard as well as synthetic data sets to demonstrate the effectiveness of the new algorithm.","PeriodicalId":294223,"journal":{"name":"2013 Ninth International Conference on Computational Intelligence and Security","volume":"96 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 Ninth International Conference on Computational Intelligence and Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CIS.2013.149","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Clustering is an important technology in data mining. Squeezer is one such clustering algorithm for categorical data and it is more efficient than most existing algorithms for categorical data. But Squeezer is time consuming for very large datasets which are distributed in different servers. Thus, we employ the distributed thinking to improve Squeezer and a distributed algorithm for categorical data called Coercion is proposed in this paper. In order to present detailed complexity results for Coercion, we also conduct an experimental study with standard as well as synthetic data sets to demonstrate the effectiveness of the new algorithm.