Artur Starczewski, Magdalena M. Scherer, Wojciech Książek, M. Dębski, Lipo Wang
{"title":"A Novel Grid-Based Clustering Algorithm","authors":"Artur Starczewski, Magdalena M. Scherer, Wojciech Książek, M. Dębski, Lipo Wang","doi":"10.2478/jaiscr-2021-0019","DOIUrl":null,"url":null,"abstract":"Abstract Data clustering is an important method used to discover naturally occurring structures in datasets. One of the most popular approaches is the grid-based concept of clustering algorithms. This kind of method is characterized by a fast processing time and it can also discover clusters of arbitrary shapes in datasets. These properties allow these methods to be used in many different applications. Researchers have created many versions of the clustering method using the grid-based approach. However, the key issue is the right choice of the number of grid cells. This paper proposes a novel grid-based algorithm which uses a method for an automatic determining of the number of grid cells. This method is based on the kdist function which computes the distance between each element of a dataset and its kth nearest neighbor. Experimental results have been obtained for several different datasets and they confirm a very good performance of the newly proposed method.","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"11 1","pages":"319 - 330"},"PeriodicalIF":3.3000,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Artificial Intelligence and Soft Computing Research","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.2478/jaiscr-2021-0019","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 6
Abstract
Abstract Data clustering is an important method used to discover naturally occurring structures in datasets. One of the most popular approaches is the grid-based concept of clustering algorithms. This kind of method is characterized by a fast processing time and it can also discover clusters of arbitrary shapes in datasets. These properties allow these methods to be used in many different applications. Researchers have created many versions of the clustering method using the grid-based approach. However, the key issue is the right choice of the number of grid cells. This paper proposes a novel grid-based algorithm which uses a method for an automatic determining of the number of grid cells. This method is based on the kdist function which computes the distance between each element of a dataset and its kth nearest neighbor. Experimental results have been obtained for several different datasets and they confirm a very good performance of the newly proposed method.
期刊介绍:
Journal of Artificial Intelligence and Soft Computing Research (available also at Sciendo (De Gruyter)) is a dynamically developing international journal focused on the latest scientific results and methods constituting traditional artificial intelligence methods and soft computing techniques. Our goal is to bring together scientists representing both approaches and various research communities.