Independent Component Analysis Based Seeding Method for K-Means Clustering

2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology. Pub Date: 2011-08-22. DOI: 10.1109/WI-IAT.2011.29

T. Onoda, Miho Sakai, S. Yamada

Citations: 11

Abstract

The k-means clustering method is a widely used clustering technique for the Web because of its simplicity and speed. However, the clustering result depends heavily on the initial cluster centers, which are chosen uniformly at random from the data points. We propose a seeding method for k-means clustering based on independent component analysis. We evaluate the proposed method on benchmark datasets and compare it with other seeding methods. We also apply it to a Web corpus provided by ODP. The experiments show that the proposed method achieves better normalized mutual information than both k-means clustering and k-means++ clustering. The proposed method is therefore useful for Web corpora.
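The abstract reports results in terms of normalized mutual information (NMI) between the clustering and the reference classes, but it does not say which normalization was used. The following is a minimal sketch of one common variant, in which the mutual information is divided by the arithmetic mean of the two label entropies. The function names and the choice of normalization are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(true_labels, cluster_labels):
    """NMI between reference classes and cluster assignments.

    Assumption: mutual information is normalized by the arithmetic mean
    of the two entropies; other normalizations exist and the paper may
    use a different one.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    if n == 0:
        raise ValueError("label sequences must not be empty")

    joint = Counter(zip(true_labels, cluster_labels))
    true_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_labels)

    # Mutual information from the contingency counts.
    mi = 0.0
    for (t, c), n_tc in joint.items():
        mi += (n_tc / n) * math.log(n_tc * n / (true_counts[t] * cluster_counts[c]))

    mean_entropy = (entropy(true_labels) + entropy(cluster_labels)) / 2
    if mean_entropy == 0:
        # Both labelings put every point in a single group.
        return 1.0
    return mi / mean_entropy


if __name__ == "__main__":
    reference = ["sports", "sports", "arts", "arts"]
    clusters = [1, 1, 0, 0]  # same partition, different label names
    print(normalized_mutual_information(reference, clusters))
```

Because NMI depends only on how the points are partitioned, not on the label names, it can compare k-means runs started from different seeds against the same reference classes.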

