A-MKMC: An effective adaptive-based multilevel K-means clustering with optimal centroid selection using hybrid heuristic approach for handling the incomplete data

Impact factor: 2.7 | Zone 3 (Computer Science) | JCR Q3 (Computer Science, Artificial Intelligence) | Data & Knowledge Engineering | Pub Date: 2023-11-22 | DOI: 10.1016/j.datak.2023.102243
Hima Vijayan, Subramaniam M, Sathiyasekar K
Data & Knowledge Engineering, volume 150, Article 102243. https://www.sciencedirect.com/science/article/pii/S0169023X23001039
Citations: 0

Abstract

Clustering partitions objects into groups so that similar objects share a group and dissimilar objects are kept apart. It is widely used in applications such as pattern recognition, image processing, and data analysis. A dataset that contains missing values is termed incomplete. In such cases, the missing values cannot be resolved while validating the data, and the quality of the data suffers. Clustering mechanisms are therefore used to handle the missing values. Yet traditional clustering algorithms fail to address the issue because they are not designed to handle high-dimensional data; the failure is also attributed to errors of human intervention or inaccurate outcomes. To alleviate the problem of incomplete data, a novel clustering algorithm is proposed. First, incomplete or mixed data are gathered from five standard data sources. Once collected, the data undergo a pre-processing phase, which is performed using data normalization. In the final step, the data are clustered by the new algorithm, termed Adaptive centroid based Multilevel K-Means Clustering (A-MKMC), in which the cluster centroids are optimized by Hybrid Border Collie Whale Optimization (HBCWO), an integration of two conventional algorithms: Border Collie Optimization (BCO) and the Whale Optimization Algorithm (WOA). The novel clustering model is validated using various measures and compared against traditional mechanisms. In the overall result analysis, the designed HBCWO-A-MKMC method attains an accuracy of 93% and a precision of 95%. The adaptive clustering process thus achieves higher performance in handling missing data than the other conventional methods.
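
The pipeline in the abstract (normalize, then cluster incomplete data with K-means) can be illustrated with a minimal Python sketch. This is not the authors' A-MKMC or HBCWO method: the abstract does not say how missing values enter the distance computation, so the sketch assumes a partial-distance strategy (distances taken over observed features only and rescaled), and it uses plain mean updates instead of a metaheuristic centroid search. The function names `min_max_normalize`, `partial_sq_distances`, and `kmeans_incomplete` are hypothetical.

```python
import numpy as np


def min_max_normalize(X):
    """Scale each feature to [0, 1], ignoring NaN (missing) entries.

    Assumes every column has at least one observed value.
    """
    lo = np.nanmin(X, axis=0)
    hi = np.nanmax(X, axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span


def partial_sq_distances(X, centroids):
    """Squared distance over observed features, rescaled to full dimension."""
    observed = ~np.isnan(X)                        # shape (n, d)
    X0 = np.where(observed, X, 0.0)
    # Differences are zeroed wherever the feature is missing.
    diff = (X0[:, None, :] - centroids[None, :, :]) * observed[:, None, :]
    n_obs = np.maximum(observed.sum(axis=1), 1)    # shape (n,)
    scale = X.shape[1] / n_obs
    return (diff ** 2).sum(axis=2) * scale[:, None]


def kmeans_incomplete(X, k, init=None, n_iter=50, seed=0):
    """K-means on data with NaNs (assumed partial-distance variant)."""
    if init is None:
        # Assumption: at least k rows have no missing values.
        rng = np.random.default_rng(seed)
        complete = X[~np.isnan(X).any(axis=1)]
        idx = rng.choice(len(complete), size=k, replace=False)
        centroids = complete[idx].copy()
    else:
        centroids = np.array(init, dtype=float)
    for _ in range(n_iter):
        labels = partial_sq_distances(X, centroids).argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue  # keep the old centroid for an empty cluster
            count = (~np.isnan(members)).sum(axis=0)
            sums = np.nansum(members, axis=0)
            seen = count > 0
            # Mean over observed values only, feature by feature.
            new_centroids[j, seen] = sums[seen] / count[seen]
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    labels = partial_sq_distances(X, centroids).argmin(axis=1)
    return labels, centroids


# Toy data: two groups, each with some missing entries.
X = np.array([
    [0.0, 0.0], [0.1, np.nan], [np.nan, 0.2],
    [1.0, 1.0], [0.9, np.nan], [np.nan, 0.8],
])
labels, centroids = kmeans_incomplete(
    min_max_normalize(X), k=2, init=[[0.0, 0.0], [1.0, 1.0]]
)
print(labels)  # [0 0 0 1 1 1]
```

In the proposed method, the mean-update step would presumably be replaced or guided by the HBCWO search over candidate centroids; the abstract does not give the objective function or update rules, so they are not reproduced here.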

Source journal
Data & Knowledge Engineering (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 5.00
Self-citation rate: 0.00%
Articles published: 66
Review time: 6 months
About the journal: Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a worldwide audience of researchers, designers, managers, and users. The major aim of the journal is to identify, investigate, and analyze the underlying principles in the design and effective use of these systems.