Using Self-Organizing Maps in constrained ensemble clustering framework

R. Visakh
2012 12th International Conference on Intelligent Systems Design and Applications (ISDA), published 2012-11-01
DOI: 10.1109/ISDA.2012.6416541 (https://doi.org/10.1109/ISDA.2012.6416541)
Citations: 2

Abstract

Clustering is a core data mining task that partitions a set of unlabelled data instances into distinct clusters, ideally with maximum intra-cluster similarity and minimum inter-cluster similarity. Many clustering techniques have been proposed in the literature, including stand-alone and ensemble methods. Most of them lack robustness and share an important drawback: they cannot effectively visualize clustering results to support knowledge discovery and constructive learning. Recently, clustering techniques based on data visualization have been proposed. These rely on building a Self-Organizing Map (SOM), originally proposed by Kohonen. Although the Kohonen SOM preserves the topology of the input data, it is widely observed that the clustering accuracy it achieves is poor. To perform robust and accurate clustering with a SOM, this paper proposes a cluster ensemble framework based on input constraints. A cluster ensemble is a set of clustering solutions obtained by clustering subsets of the original high-dimensional data individually. The final consensus matrix is fed to a neural network that transforms the input data into a lower-dimensional output map. The map clearly depicts the distribution of input data instances into clusters.
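
The abstract does not say how the consensus matrix is built or how the map is trained, so the following is only a rough sketch of the general idea and not the paper's method. It assumes that the "subsets" are random feature subsets, that each base clustering is a small k-means, that the consensus matrix is a co-association matrix (the fraction of runs in which two instances share a cluster), and that the neural network is a plain Kohonen SOM trained on the rows of that matrix. The input constraints mentioned in the abstract are not modelled. All function names and parameters (`kmeans`, `co_association`, `train_som`, `best_unit`, `feats_per_run`) are hypothetical.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Tiny Lloyd's k-means (assumed base clusterer); returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def co_association(data, k, runs, feats_per_run, seed=0):
    """Assumed consensus function: fraction of runs in which two
    instances land in the same cluster."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    counts = [[0] * n for _ in range(n)]
    for _ in range(runs):
        dims = rng.sample(range(d), feats_per_run)  # random feature subset
        view = [[row[j] for j in dims] for row in data]
        labels = kmeans(view, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]


def best_unit(weights, v):
    """Grid coordinate of the SOM node whose weight vector is closest to v."""
    return min(weights, key=lambda node: math.dist(weights[node], v))


def train_som(vectors, rows, cols, epochs=30, seed=0):
    """Plain Kohonen SOM on a rows x cols grid with a Gaussian neighbourhood."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total, step = epochs * len(vectors), 0
    for _ in range(epochs):
        order = vectors[:]
        rng.shuffle(order)
        for v in order:
            frac = step / total
            lr = 0.5 * (1 - frac)                            # decaying learning rate
            radius = max(rows, cols) / 2 * (1 - frac) + 0.5  # shrinking neighbourhood
            bmu = best_unit(weights, v)
            for node, w in weights.items():
                g = math.exp(-math.dist(node, bmu) ** 2 / (2 * radius ** 2))
                for t in range(dim):
                    w[t] += lr * g * (v[t] - w[t])
            step += 1
    return weights


if __name__ == "__main__":
    rng = random.Random(42)
    # Made-up toy data: two blobs in 4 dimensions.
    data = [[rng.gauss(0, 0.3) for _ in range(4)] for _ in range(10)]
    data += [[rng.gauss(5, 0.3) for _ in range(4)] for _ in range(10)]
    consensus = co_association(data, k=2, runs=5, feats_per_run=2)
    som = train_som(consensus, rows=3, cols=3)
    for i, row in enumerate(consensus):
        print(i, best_unit(som, row))  # instance index and its map cell
```

Feeding consensus rows rather than raw features to the map means that instances which co-cluster often have similar input vectors, so they should tend to land on nearby map cells. Whether this matches the paper's pipeline cannot be confirmed from the abstract alone.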