{"title":"A detailed study of clustering algorithms","authors":"Kamalpreet Bindra, Dr Anuranjan Misra","doi":"10.1109/CTCEEC.2017.8454973","DOIUrl":null,"url":null,"abstract":"The foremost illustrative task in data mining process is clustering. It plays an exceedingly important role in the entire KDD process also as categorizing data is one of the most rudimentary steps in knowledge discovery. It is an unsupervised learning task used for exploratory data analysis to find some unrevealed patterns which are present in data but cannot be categorized clearly. Sets of data can be designated or grouped together based on some common characteristics and termed clusters, the mechanism involved in cluster analysis are essentially dependent upon the primary task of keeping objects with in a cluster more closer than objects belonging to other groups or clusters. Depending on the data and expected cluster characteristics there are different types of clustering paradigms. In the very recent times many new algorithms have emerged which aim towards bridging the different approaches towards clustering and merging different clustering algorithms given the requirement of handling sequential, extensive data with multiple relationships in many applications across a broad spectrum. Various clustering algorithms have been developed under different paradigms for grouping scattered data points and forming efficient cluster shapes with minimal outliers. This paper attempts to address the problem of creating evenly shaped clusters in detail and aims to study, review and analyze few clustering algorithms falling under different categories of clustering paradigms and presents a detailed comparison of their efficiency, advantages and disadvantages on some common grounds. 
This study also contributes in correlating some very important characteristics of an efficient clustering algorithm.","PeriodicalId":357118,"journal":{"name":"2017 6th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 6th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CTCEEC.2017.8454973","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 22
Abstract
Clustering is the foremost illustrative task in the data mining process. It plays an important role throughout the KDD process, because categorizing data is one of the most basic steps in knowledge discovery. It is an unsupervised learning task used in exploratory data analysis to find hidden patterns that are present in the data but cannot be categorized clearly. Data can be grouped on the basis of common characteristics, and the resulting groups are termed clusters. The mechanisms of cluster analysis rest on one primary goal: keeping objects within a cluster closer to one another than to objects belonging to other clusters. Depending on the data and the expected cluster characteristics, there are different types of clustering paradigms. Recently, many new algorithms have emerged that aim to bridge the different approaches to clustering and to merge different clustering algorithms, driven by the need to handle sequential, extensive data with multiple relationships in a broad range of applications. Various clustering algorithms have been developed under different paradigms for grouping scattered data points and forming efficient cluster shapes with minimal outliers. This paper addresses the problem of creating evenly shaped clusters in detail. It studies, reviews, and analyzes a few clustering algorithms from different categories of clustering paradigms, and presents a detailed comparison of their efficiency, advantages, and disadvantages on common grounds. The study also contributes by correlating some important characteristics of an efficient clustering algorithm.