A subspace filter supporting the discovery of small clusters in very noisy datasets

Scientific and statistical database management : International Conference, SSDBM ... : proceedings. International Conference on Scientific and Statistical Database Management
Pub Date: 2014-06-30
DOI: 10.1145/2618243.2618260

F. Höppner

Citations: 3

Abstract

Feature selection becomes crucial when exploring high-dimensional datasets via clustering: the data is unlikely to group jointly in all dimensions, yet clustering algorithms treat all attributes equally. A new subspace filter approach is presented that copes with the difficult situation of finding small clusters embedded in a very noisy environment (more noise than clustered data) and that is not misled by dense, high-dimensional spots caused by density fluctuations of single attributes. Experimental evaluation on artificial and real datasets demonstrates good performance and high efficiency.
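The pitfall named in the abstract, dense spots that arise purely from density fluctuations of single attributes, can be illustrated with a small sketch. This is not the filter proposed in the paper. It only shows why raw joint density is a misleading signal. The helper names (`sample_attribute`, `cell_ratios`), the grid cells, and the mixture weights are all hypothetical choices made for illustration. Two independent noise attributes each have a marginal bump on [0.4, 0.5), and a small cluster of 300 points is planted in the cell [0.8, 0.9) x [0.8, 0.9) among 20000 noise points.

```python
import random


def sample_attribute(rng):
    """Noise attribute: uniform on [0, 1) plus an extra marginal bump on [0.4, 0.5)."""
    if rng.random() < 0.3:
        return rng.uniform(0.4, 0.5)
    return rng.random()


def cell_ratios(points, lo, hi):
    """Compare the count in the square cell [lo, hi) x [lo, hi) with two baselines.

    Returns (observed / expected under uniformity,
             observed / expected under independence of the two attributes).
    """
    n = len(points)
    in_x = sum(lo <= x < hi for x, _ in points)
    in_y = sum(lo <= y < hi for _, y in points)
    joint = sum(lo <= x < hi and lo <= y < hi for x, y in points)
    expected_uniform = n * (hi - lo) ** 2
    expected_independent = in_x * in_y / n  # product of the empirical marginals
    return joint / expected_uniform, joint / expected_independent


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Pure noise: the two attributes are independent of each other.
noise = [(sample_attribute(rng), sample_attribute(rng)) for _ in range(20000)]
# A small planted cluster: points that are jointly confined to one cell.
cluster = [(rng.uniform(0.8, 0.9), rng.uniform(0.8, 0.9)) for _ in range(300)]
data = noise + cluster

spot_naive, spot_indep = cell_ratios(data, 0.4, 0.5)  # spurious dense spot
clus_naive, clus_indep = cell_ratios(data, 0.8, 0.9)  # planted cluster

print(f"spurious spot: vs uniform {spot_naive:.2f}, vs independence {spot_indep:.2f}")
print(f"true cluster:  vs uniform {clus_naive:.2f}, vs independence {clus_indep:.2f}")
```

Measured against a uniform baseline, the spurious spot looks far denser than the planted cluster. Measured against the product of the marginals, the spurious spot is close to 1 while the cluster stands out. Correcting for single-attribute fluctuations in this way is only one simple possibility, and the abstract does not state how the proposed filter does it.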



