Clustering method using hypergraph models based on Set Pair Analysis

Guo-ping Lin, Shao-Zi Li
2009 IEEE International Symposium on IT in Medicine & Education, 2009-09-15
DOI: 10.1109/ITIME.2009.5236279 (https://doi.org/10.1109/ITIME.2009.5236279)

Abstract

Text clustering methods can be used to structure large sets of text or hypertext documents. However, many well-known clustering methods do not address two problems specific to text: the very high dimensionality of the data and the understandability of the cluster descriptions. In this paper, we introduce a novel text clustering approach based on a hypergraph model and Set Pair Analysis (SPA), a new methodology for describing and processing system uncertainty. We define a new text similarity measure in terms of the identity, difference, and contrariety of a set pair. Once the hypergraph model is built, a hypergraph partitioning algorithm is used to find clusters. The new method eliminates disadvantageous factors, reduces the dimensionality of the text representation, and improves the speed and accuracy of text clustering. Experiments demonstrate that the approach is applicable and effective on high-dimensional text datasets.
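
The abstract does not give the paper's similarity formula, so the following is only an illustrative sketch of how a set-pair measure over two documents might look, not the authors' definition. In SPA, the connection degree is conventionally written mu = a + b*i + c*j, where a, b, and c are the identity, difference, and contrariety proportions (a + b + c = 1), j is usually fixed at -1, and i is an uncertainty coefficient in [-1, 1]. The mapping from terms to the three classes, the function name connection_degree, and the tol parameter are all assumptions made for this example.

```python
from collections import Counter


def connection_degree(doc_a, doc_b, tol=0.5, i=0.0, j=-1.0):
    """Return (a, b, c, mu) for two tokenized documents.

    Assumed mapping (hypothetical, not taken from the paper):
      identical  -> term occurs in both documents with similar frequency
      discrepant -> term occurs in both, but frequencies differ strongly
      contrary   -> term occurs in only one of the two documents
    """
    tf_a, tf_b = Counter(doc_a), Counter(doc_b)
    vocab = set(tf_a) | set(tf_b)
    if not vocab:
        return 0.0, 0.0, 0.0, 0.0

    identical = discrepant = contrary = 0
    for term in vocab:
        fa, fb = tf_a[term], tf_b[term]
        if fa == 0 or fb == 0:
            contrary += 1
        elif min(fa, fb) / max(fa, fb) >= tol:
            identical += 1
        else:
            discrepant += 1

    n = len(vocab)
    a, b, c = identical / n, discrepant / n, contrary / n
    # Connection degree: mu = a + b*i + c*j, with j = -1 by convention.
    mu = a + b * i + c * j
    return a, b, c, mu


if __name__ == "__main__":
    d1 = ["graph", "graph", "cluster", "text"]
    d2 = ["graph", "cluster", "cluster", "cluster", "cluster", "set"]
    print(connection_degree(d1, d2))
```

With i = 0 the difference term is ignored and mu reduces to a - c. Pairwise values of this kind could then serve as edge or hyperedge weights before partitioning, though how the paper builds its hypergraph is not stated in the abstract.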