An integrated clustering approach for high dimensional categorical data
K. Kalaivani, A. Raghavendra
2013 International Conference on Green High Performance Computing (ICGHPC), published 2013-03-14
DOI: 10.1109/ICGHPC.2013.6533920
Citations: 2
Abstract
Clustering is an important data mining task used in many applications. However, earlier work on clustering focused only on categorical data, grouping similar data items by their attribute values, which leads to convergence problems in the clustering process. The proposed work enhances the existing k-means clustering process so that it handles categorical and mixed data types efficiently. The goal is an integrated clustering approach for high-dimensional categorical data that also works well for data with mixed continuous and categorical features. Experimental results on several data sets suggest that integrating the link-based cluster ensemble algorithm with the proposed k-means algorithm produces accurate clustering results. The convergence property of the proposed clustering process is proved, which in turn improves the accuracy of the clustering results. The proposed work aims to provide accurate and efficient results whenever a user accesses data from the database.
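The abstract does not give the distance function or the update rule used for mixed data, so the following is only a minimal sketch of one common way to extend k-means to mixed continuous and categorical features (the k-prototypes idea: squared Euclidean distance on numeric features plus a weighted mismatch count on categorical ones). It is not the authors' algorithm, and it omits the link-based cluster ensemble step. The names `mixed_distance`, `update_prototype`, `cluster_mixed`, and the weight `gamma` are assumptions made for illustration.

```python
from collections import Counter


def mixed_distance(a, b, num_idx, cat_idx, gamma=1.0):
    """Squared Euclidean distance on numeric features plus gamma times
    the number of mismatched categorical features (assumed measure)."""
    numeric = sum((a[i] - b[i]) ** 2 for i in num_idx)
    mismatches = sum(1 for i in cat_idx if a[i] != b[i])
    return numeric + gamma * mismatches


def update_prototype(members, num_idx, cat_idx):
    """Mean for numeric features, most frequent value for categorical ones."""
    proto = list(members[0])
    for i in num_idx:
        proto[i] = sum(m[i] for m in members) / len(members)
    for i in cat_idx:
        proto[i] = Counter(m[i] for m in members).most_common(1)[0][0]
    return tuple(proto)


def cluster_mixed(points, prototypes, num_idx, cat_idx, gamma=1.0, max_iter=100):
    """Alternate assignment and prototype updates until labels stop changing."""
    prototypes = list(prototypes)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(
                range(len(prototypes)),
                key=lambda k: mixed_distance(p, prototypes[k], num_idx, cat_idx, gamma),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        for k in range(len(prototypes)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old prototype if a cluster is empty
                prototypes[k] = update_prototype(members, num_idx, cat_idx)
    return labels, prototypes
```

Calling `cluster_mixed(points, initial_prototypes, num_idx=[0], cat_idx=[1])` on tuples such as `(1.0, "red")` returns one cluster label per point together with the final prototypes. The loop stops when an assignment pass changes no label, which is the usual stopping test for k-means-style procedures; the weight `gamma` controls how much a categorical mismatch counts relative to numeric distance and would need tuning per data set.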