Data density based clustering

2014 14th UK Workshop on Computational Intelligence (UKCI) Pub Date : 2014-10-20 DOI:10.1109/UKCI.2014.6930157

Richard Hyde, P. Angelov


Citations: 43

Abstract

A new data-density-based approach to clustering is presented that automatically determines the number of clusters. Using recursive density estimation (RDE) for each data sample significantly reduces the number of calculations in offline mode and also makes the method suitable for online use. Each cluster may have a different diameter per feature/dimension, which yields axis-orthogonal hyper-ellipsoid clusters and gives greater differentiation between clusters when they are highly asymmetrical. We illustrate this with 3 standard data sets, 1 artificial data set and a large real data set, demonstrating results comparable to the Subtractive, Hierarchical, K-Means, ELM and DBSCAN clustering techniques. Unlike subtractive clustering, we do not iteratively calculate the potential P. Unlike hierarchical clustering, we do not need O(N^2) distances to be calculated or a cut-off threshold to be defined. Unlike k-means, we do not need to predefine the number of clusters. Because the densities are calculated with the RDE equations, the algorithm is efficient and requires no iteration to approach the optimal result. We compare the proposed algorithm to k-means, subtractive, hierarchical, ELM and DBSCAN clustering with respect to several criteria. The results demonstrate the validity of the proposed approach.
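The abstract does not reproduce the RDE equations or the cluster assignment rule, so the following is only a minimal sketch under stated assumptions. It assumes the commonly used Cauchy-type recursive density, in which a sample's density is 1 / (1 + mean squared distance to all samples seen so far), maintained from a running mean and a running mean of squared norms. The ellipsoid membership test is a generic axis-aligned check included to illustrate per-dimension radii. All names (`RecursiveDensity`, `in_axis_aligned_ellipsoid`, `batch_density`) are hypothetical and not taken from the paper.

```python
import numpy as np


class RecursiveDensity:
    """Recursive density estimate (assumed Cauchy-type form, not quoted from the paper)."""

    def __init__(self, n_features):
        self.k = 0                          # number of samples seen so far
        self.mean = np.zeros(n_features)    # running mean of the samples
        self.mean_sq_norm = 0.0             # running mean of squared sample norms

    def update(self, x):
        """Absorb sample x and return its density relative to all samples seen so far."""
        x = np.asarray(x, dtype=float)
        self.k += 1
        w = 1.0 / self.k
        # Recursive updates: no past samples are stored or revisited.
        self.mean = (1.0 - w) * self.mean + w * x
        self.mean_sq_norm = (1.0 - w) * self.mean_sq_norm + w * float(x @ x)
        d = x - self.mean
        spread = self.mean_sq_norm - float(self.mean @ self.mean)
        return 1.0 / (1.0 + float(d @ d) + spread)


def in_axis_aligned_ellipsoid(x, centre, radii):
    """Return True if x lies inside an ellipsoid whose axes follow the feature axes."""
    x = np.asarray(x, dtype=float)
    centre = np.asarray(centre, dtype=float)
    radii = np.asarray(radii, dtype=float)
    z = (x - centre) / radii                # scale each dimension by its own radius
    return bool(np.sum(z ** 2) <= 1.0)


def batch_density(points, x):
    """Reference non-recursive density: 1 / (1 + mean squared distance from x to points)."""
    points = np.asarray(points, dtype=float)
    return 1.0 / (1.0 + np.mean(np.sum((points - x) ** 2, axis=1)))
```

Under these assumptions, each call to `update` costs a fixed amount of work per sample regardless of how many samples came before, which is the property that makes a recursive form attractive for online use. With radii of 2.0 and 0.5 around the origin, the point (1.5, 0.1) falls inside the ellipsoid while (0.1, 1.5) does not, a distinction a single spherical radius could not make.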

