An Improved K-Means Algorithm Based on Impact Index

Shaobo Deng, Min Li, Xuegang Li, Lei Wang, Sujie Guan
DOI: 10.1109/ACAIT56212.2022.10137982 (https://doi.org/10.1109/ACAIT56212.2022.10137982)
Published in: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT)
Publication date: 2022-12-09
Citations: 0

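Abstract

K-means is a classical clustering algorithm, widely used for its efficiency and performance. It measures the similarity between samples with the Euclidean distance and iteratively updates the membership matrix to obtain the clustering result. However, when a dataset contains samples whose intra-cluster distances exceed their inter-cluster distances, k-means often misassigns boundary samples, which leads to unsatisfactory results. Moreover, although k-means makes the intra-cluster distance as small as possible, it does not maximize the inter-cluster distance, and it ultimately finds only a locally optimal solution. Unlike existing k-means-type algorithms, this paper proposes a similarity measure based on an impact factor, which determines the partition by comparing the impact of each sample on each cluster. Building on the k-means objective function, we incorporate the inter-cluster distance to address the local-optimality defect of k-means. We analyze and prove the proposed method theoretically, compare its clustering results with those of k-means-type algorithms on real datasets, and confirm that it effectively avoids the defects of those algorithms.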
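For orientation, the baseline that the abstract criticizes can be sketched as follows. This is a minimal, standard Lloyd-style k-means in plain Python, not the impact-index method proposed in the paper, whose formulas are not given on this page. The function names (`euclidean`, `kmeans`, `intra_cluster_sse`, `inter_cluster_separation`) and the toy data are assumptions made for illustration. The last function shows one possible way to quantify inter-cluster distance, the quantity that standard k-means does not optimize; the paper's actual objective may differ.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=100, seed=0):
    """Standard Lloyd-style k-means (baseline only); returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct samples.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each sample joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels


def intra_cluster_sse(points, centroids, labels):
    """The usual k-means objective: squared distance of samples to their centroid."""
    return sum(euclidean(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def inter_cluster_separation(centroids):
    """Sum of pairwise centroid distances (an illustrative measure, an assumption)."""
    return sum(
        euclidean(centroids[i], centroids[j])
        for i in range(len(centroids))
        for j in range(i + 1, len(centroids))
    )


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(intra_cluster_sse(points, centroids, labels))
print(inter_cluster_separation(centroids))
```

On data this cleanly separated the baseline recovers both groups. The failure the abstract describes concerns boundary samples in datasets where intra-cluster distances exceed inter-cluster distances, which nearest-centroid assignment alone does not handle.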