A Clustering Ensemble Algorithm of Computing Stability of Sample Points Based on Neighborhood

TongLing Lou
Published in: 2021 2nd International Symposium on Computer Engineering and Intelligent Communications (ISCEIC)
Publication date: 2021-08-01
DOI: 10.1109/ISCEIC53685.2021.00015 (https://doi.org/10.1109/ISCEIC53685.2021.00015)
Citations: 0

A Clustering Ensemble Algorithm of Computing Stability of Sample Points Based on Neighborhood
In a clustering ensemble, sample points contribute differently to the ensemble result, and the certainty with which each sample point is assigned to a cluster also differs. To reduce the impact of this uncertainty on clustering results, some scholars have proposed the concept of sample stability. This paper proposes calculating the stability of a sample point from the probability that the point and the sample points in its neighborhood fall in the same cluster across different base clusterings, and presents an algorithm framework based on this calculation. The original data are first clustered, and the Mahalanobis distance between sample points is calculated. Then the co-occurrence probability of each target sample point and its K nearest sample points is calculated, and the stability of each sample point is derived from this co-occurrence probability. The stable sample points are hard clustered first, and the unstable sample points are then assigned to the nearest cluster. The effectiveness of the proposed clustering ensemble algorithm is verified on benchmark datasets.
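The abstract does not give the exact stability formula, so the following is only a minimal sketch of the pipeline it describes, under stated assumptions: the pairwise distance matrix `dist` is taken as precomputed (for example with the Mahalanobis distance), stability is assumed to be the mean co-occurrence of a point with its K nearest neighbours, and "nearest cluster" is assumed to mean the smallest average distance to a cluster's stable members. All function names and the threshold are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def k_nearest(dist: Sequence[Sequence[float]], i: int, k: int) -> List[int]:
    """Indices of the k points closest to point i, excluding i itself."""
    others = [j for j in range(len(dist)) if j != i]
    return sorted(others, key=lambda j: dist[i][j])[:k]


def co_occurrence(labelings: Sequence[Sequence[int]], i: int, j: int) -> float:
    """Fraction of base clusterings that place points i and j in the same cluster."""
    same = sum(1 for labels in labelings if labels[i] == labels[j])
    return same / len(labelings)


def stability(dist: Sequence[Sequence[float]],
              labelings: Sequence[Sequence[int]],
              k: int) -> List[float]:
    """Assumed definition: mean co-occurrence of each point with its k nearest neighbours."""
    scores = []
    for i in range(len(dist)):
        neighbours = k_nearest(dist, i, k)
        if not neighbours:  # a single-point data set has no neighbourhood
            scores.append(0.0)
            continue
        total = sum(co_occurrence(labelings, i, j) for j in neighbours)
        scores.append(total / len(neighbours))
    return scores


def split_by_stability(scores: Sequence[float],
                       threshold: float) -> Tuple[List[int], List[int]]:
    """Split point indices into (stable, unstable) using a hypothetical threshold."""
    stable = [i for i, s in enumerate(scores) if s >= threshold]
    unstable = [i for i, s in enumerate(scores) if s < threshold]
    return stable, unstable


def assign_unstable(dist: Sequence[Sequence[float]],
                    stable_labels: Dict[int, int],
                    unstable: Sequence[int]) -> Dict[int, int]:
    """Assign each unstable point to the cluster with the smallest mean distance
    to its stable members (one possible reading of 'nearest cluster')."""
    clusters: Dict[int, List[int]] = {}
    for idx, label in stable_labels.items():
        clusters.setdefault(label, []).append(idx)
    assigned = {}
    for i in unstable:
        assigned[i] = min(
            clusters,
            key=lambda c: sum(dist[i][j] for j in clusters[c]) / len(clusters[c]),
        )
    return assigned
```

As a usage illustration, take seven points on a line at positions 0, 1, 2, 5, 10, 11, 12 with absolute difference standing in for the distance, and four base clusterings that disagree only on the point at position 5. With K = 2, that point receives a stability of 0.5 while every other point receives 1.0, so a threshold of 0.8 marks it as the only unstable point. The hard clustering of the stable points is not sketched here; `stable_labels` is assumed to come from whatever consensus method is used for that step.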