Document clustering using gravitational ensemble clustering

A. Sadeghian, H. Nezamabadi-pour
DOI: 10.1109/AISP.2015.7123481
Published in: 2015 The International Symposium on Artificial Intelligence and Signal Processing (AISP)
Publication date: 2015-03-03
URL: https://doi.org/10.1109/AISP.2015.7123481
Citations: 7

Abstract

Text mining is generally regarded as an extension of data mining. Within text mining, document clustering groups similar documents in a collection into the same category, called a cluster, and assigns dissimilar documents to different clusters. Because every dataset has its own characteristics, finding a clustering algorithm that can handle all kinds of clusters is a major challenge. Each clustering algorithm has its own approach to determining the number of clusters, imposing a structure on the data, and validating the resulting clusters. Combining different clusterings is an attempt to overcome the weaknesses of individual algorithms and to improve their performance. Separately, several clustering algorithms inspired by the law of gravity have been introduced, each attempting to cluster complex datasets. Gravitational Ensemble Clustering (GEC) is an ensemble method that combines the concepts of gravitational clustering and ensemble clustering to reach a better clustering result. This paper presents an application of GEC to the problem of document clustering. The proposed method uses a modification of the original GEC algorithm that tries to produce a more diverse clustering ensemble through a new parameter setting. The method is evaluated on document datasets, and it obtains promising results in comparison with competing algorithms.
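The abstract does not describe how GEC combines its base clusterings, so the following is only a minimal sketch of one generic ensemble-clustering idea (consensus through a co-association matrix), not the authors' algorithm. The function names, the 0.5 threshold, and the three base partitions are all hypothetical and chosen for illustration.

```python
def co_association(partitions):
    """Return an n x n matrix whose (i, j) entry is the fraction of base
    partitions that place items i and j in the same cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_clusters(partitions, threshold=0.5):
    """Link two items when they are co-clustered in more than `threshold`
    of the base partitions, then label the connected components."""
    n = len(partitions[0])
    co = co_association(partitions)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Depth-first search over the thresholded co-association graph.
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical base clusterings of six documents. In practice these would come
# from different algorithms or parameter settings, which is where the diversity
# of an ensemble comes from.
base_partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 2],
]

print(consensus_clusters(base_partitions))  # [0, 0, 0, 1, 1, 1]
```

The base partitions disagree about documents 2 and 5, but the majority vote encoded in the co-association matrix recovers two stable groups. This illustrates why an ensemble can be more robust than any single base clustering.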