A New Approach for Subspace Clustering of High Dimensional Data

International Journal of Computer Science and Applications (Q4, Computer Science) | Pub Date: 2014-05-01 | DOI: 10.14355/IJCSA.2014.0302.02

M. Suguna, S. Palaniammal

Citations: 1

Abstract

Clustering high-dimensional data is an emerging research area. The similarity criteria used by traditional clustering algorithms are inadequate in high-dimensional space. In addition, some dimensions are likely to be irrelevant and can hide a possible clustering. Subspace clustering is an extension of traditional clustering that attempts to find clusters in different subspaces within a dataset. This paper proposes assigning a weight to every node of a cluster in a subspace. The cluster with the greatest weight has more nodes than any other cluster. Weights can be assigned in two ways: top-down or bottom-up. A threshold is fixed, and only clusters whose weight exceeds the threshold are considered. Assigning weights to nodes makes it easier to discover clusters in the selected subspaces, and the method is expected to reduce the search space.
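The weight-and-threshold idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not define how weights are computed or how candidate clusters are formed, so everything here is an assumption made for illustration: each node gets a unit weight, a cluster's weight is the sum of its node weights, candidate clusters are simply the points that share a grid cell in a subspace (a stand-in similar in spirit to grid-based methods), and subspaces are enumerated bottom-up from one dimension to `max_dim`. The names `grid_clusters`, `weighted_subspace_clusters`, `bins`, and `threshold` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from itertools import combinations


def grid_clusters(points, dims, bins=4):
    """Group point indices by the grid cell they fall into within subspace `dims`.

    Assumes every coordinate lies in [0, 1]. Grid cells stand in for clusters.
    """
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(min(int(p[d] * bins), bins - 1) for d in dims)
        cells[key].append(idx)
    return cells


def weighted_subspace_clusters(points, max_dim=2, bins=4, threshold=3):
    """Return (dims, cell, members, weight) for clusters whose weight exceeds `threshold`.

    Bottom-up enumeration: 1-D subspaces first, then 2-D, up to `max_dim`.
    """
    n_dims = len(points[0])
    kept = []
    for k in range(1, max_dim + 1):
        for dims in combinations(range(n_dims), k):
            for cell, members in grid_clusters(points, dims, bins).items():
                # Assumption: unit weight per node, so weight equals node count.
                weight = sum(1.0 for _ in members)
                if weight > threshold:
                    kept.append((dims, cell, members, weight))
    # Heaviest clusters first; with unit weights these are the largest clusters.
    kept.sort(key=lambda c: c[3], reverse=True)
    return kept


if __name__ == "__main__":
    data = [(0.1, 0.1), (0.12, 0.9), (0.15, 0.5), (0.11, 0.3), (0.9, 0.95)]
    for dims, cell, members, weight in weighted_subspace_clusters(data):
        print(dims, cell, members, weight)
```

In this toy data, four points agree closely on dimension 0 but are spread out on dimension 1, so the only cluster that clears the threshold lives in the one-dimensional subspace `(0,)`. Discarding low-weight candidates early is one plausible reading of how the thresholding step could shrink the search space.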

Source journal

International Journal of Computer Science and Applications (Computer Science: Computer Science Applications)

Self-citation rate

0.00%

About the journal: IJCSA is an international forum for scientists and engineers involved in computer science and its applications to publish high-quality, refereed papers. Papers reporting original research and innovative applications from all parts of the world are welcome. Papers for publication in the IJCSA are selected through rigorous peer review to ensure originality, timeliness, relevance, and readability.