CLUSTERIZATION OF DATA ARRAYS BASED ON COMBINED OPTIMIZATION OF DISTRIBUTION DENSITY FUNCTIONS AND THE EVOLUTIONARY METHOD OF CAT SWARM

Journal: Radio Electronics Computer Science Control
Impact factor: 0.3 (Q4, Computer Science, Hardware & Architecture)
Pub Date: 2022-12-05
DOI: 10.15588/1607-3274-2022-4-5

Y. Bodyanskiy, I. Pliss, A. Shafronenko


Citations: 2

Abstract

Context. Clustering arrays of observations of arbitrary nature is an integral part of Data Mining and, more generally, of Data Science. A large number of approaches have been proposed for this task; they differ both in their a priori assumptions about the physical nature of the data and the problem and in their mathematical apparatus. From a computational point of view, clustering reduces to finding the local extrema of a multiextremal density function of a vector argument by means of gradient procedures that are launched repeatedly from different points of the initial data array. The search for these extrema can be accelerated with ideas from evolutionary optimization, which includes nature-inspired algorithms, swarm algorithms, population algorithms, and others.

Objective. The aim of the work is to introduce a data clustering procedure based on the peaks of the data distribution density and the evolutionary cat swarm method. The procedure combines the main advantages of methods for handling data with overlapping classes and is characterized by high clustering quality, high speed, and accurate results.

Method. A method for clustering data arrays is proposed that combines optimization of distribution density functions with the evolutionary cat swarm method. Its advantage is a reduction in the time needed to solve the optimization problem when clusters overlap.

Results. The experiments confirm the effectiveness of the proposed approach for clustering problems with overlapping classes and allow the method to be recommended for practical use in the automatic clustering of big data.

Conclusions. A method for clustering data arrays based on the combined optimization of distribution density functions and the evolutionary cat swarm method has been proposed. Its advantage is a reduction in the time needed to solve the optimization problem when clusters overlap. The method is quite simple to implement numerically and does not depend critically on the choice of optimization procedure. The experimental results confirm the effectiveness of the proposed approach for clustering problems with overlapping clusters.
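The abstract gives no formulas, so the following is only a rough illustration of the general idea it describes: several searchers are launched from points of the data array, each climbs a density estimate toward a local peak, and observations are then assigned to the nearest peak. It is not the authors' algorithm. The Gaussian kernel density estimate, the function names (`density`, `seek`, `find_peaks`, `assign`, `make_demo_data`), and every parameter value are assumptions made for this sketch. The `seek` step only loosely resembles the seeking mode of cat swarm optimization, and the tracing mode is left out entirely. The toy data is well separated, unlike the overlapping classes the paper targets.

```python
import math
import random


def density(x, data, h):
    """Gaussian kernel density estimate at x (normalising constant omitted)."""
    total = 0.0
    for p in data:
        d2 = sum((a - b) ** 2 for a, b in zip(x, p))
        total += math.exp(-d2 / (2.0 * h * h))
    return total / len(data)


def seek(cat, data, h, rng, copies=5, spread=0.3):
    """Seeking-style step: perturb the cat a few times, keep the densest position."""
    best, best_f = cat, density(cat, data, h)
    for _ in range(copies):
        cand = tuple(c + rng.gauss(0.0, spread) for c in cat)
        f = density(cand, data, h)
        if f > best_f:
            best, best_f = cand, f
    return best


def find_peaks(data, h=0.5, n_cats=10, iters=60, merge_tol=2.0, seed=0):
    """Move cats uphill on the density, then merge cats that share a peak."""
    rng = random.Random(seed)
    step = max(1, len(data) // n_cats)
    cats = [tuple(p) for p in data[::step]]  # launch from points of the data array
    for _ in range(iters):
        cats = [seek(c, data, h, rng) for c in cats]
    peaks = []
    # Visit cats from densest to sparsest; keep one representative per peak.
    for c in sorted(cats, key=lambda c: -density(c, data, h)):
        if all(math.dist(c, p) > merge_tol for p in peaks):
            peaks.append(c)
    return peaks


def assign(data, peaks):
    """Label each observation with the index of its nearest density peak."""
    return [min(range(len(peaks)), key=lambda k: math.dist(x, peaks[k])) for x in data]


def make_demo_data():
    """Two small deterministic 5x5 grids of points centred at (0, 0) and (5, 5)."""
    offsets = [-0.4, -0.2, 0.0, 0.2, 0.4]
    return [(cx + dx, cy + dy)
            for cx, cy in [(0.0, 0.0), (5.0, 5.0)]
            for dx in offsets for dy in offsets]
```

Calling `find_peaks(make_demo_data())` and passing the result to `assign` labels the two groups of toy points separately. The bandwidth `h` and the merge tolerance `merge_tol` are tuned by hand for this toy data and would need to be chosen differently for real observations.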



Source journal: Radio Electronics Computer Science Control (Computer Science, Hardware & Architecture)
Self-citation rate: 20.00%
Review time: 12 weeks