Information cut and information forces for clustering

R. Jenssen, J. Príncipe, T. Eltoft
Published in: 2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718), 2003-09-17
DOI: 10.1109/NNSP.2003.1318045 (https://doi.org/10.1109/NNSP.2003.1318045)
Citations: 13

Abstract

We define an information-theoretic divergence measure between probability density functions (pdfs) that has a deep connection to the cut in graph theory. This connection is revealed when the pdfs are estimated by the Parzen method with a Gaussian kernel. We refer to our divergence measure as the information cut. The information cut provides us with a theoretically sound criterion for cluster evaluation. In this paper, we show that it can be used to merge clusters. The initial clusters are obtained based on the related concept of information forces. We create directed trees by selecting the predecessor of a node (pattern) according to the direction of the information force acting on the pattern. Each directed tree corresponds to a cluster, which gives us an initial partitioning of the data set. Subsequently, we use the information cut as a cluster evaluation function to merge clusters until the predefined number of clusters is reached. We demonstrate the performance of our novel information-theoretic clustering method on both artificially created data and real data, with encouraging results.
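
The abstract does not state any formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the information cut between two clusters is the sum of Gaussian kernel values across the clusters, normalised by the square root of the product of the within-cluster kernel sums (a Cauchy-Schwarz-style ratio), and that the information force on a pattern is, up to a positive constant, the gradient of the summed pairwise kernel values with respect to that pattern. The function names, the kernel width `sigma`, and the use of width `sqrt(2) * sigma` for the pairwise kernel are illustrative choices.

```python
import numpy as np


def gaussian_kernel_matrix(x, y, sigma):
    """Unnormalised Gaussian kernel values between rows of x and rows of y."""
    sq_dist = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def information_cut(a, b, sigma):
    """Assumed form: cross-cluster kernel sum over the geometric mean
    of the two within-cluster kernel sums."""
    # Integrating the product of two Parzen kernels of width sigma gives a
    # Gaussian of width sqrt(2) * sigma; its normalising constant cancels
    # in the ratio, so the unnormalised kernel is sufficient here.
    s = np.sqrt(2.0) * sigma
    cross = gaussian_kernel_matrix(a, b, s).sum()
    within_a = gaussian_kernel_matrix(a, a, s).sum()
    within_b = gaussian_kernel_matrix(b, b, s).sum()
    return cross / np.sqrt(within_a * within_b)


def information_forces(x, sigma):
    """Assumed form: direction in which the summed pairwise kernel values
    increase fastest for each pattern (positive constants dropped)."""
    s = np.sqrt(2.0) * sigma
    k = gaussian_kernel_matrix(x, x, s)
    diff = x[None, :, :] - x[:, None, :]  # diff[i, j] = x[j] - x[i]
    return (k[:, :, None] * diff).sum(axis=1) / s ** 2
```

Under these assumptions the ratio is close to 0 for well-separated clusters and equals 1 when the two clusters are identical, so a merging step in the spirit of the abstract would repeatedly merge the pair of clusters with the largest value until the predefined number of clusters remains. Each force vector points toward nearby dense regions, which is the direction a pattern would follow when choosing its predecessor in a directed tree.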