{"title":"Shared Subscribe Hyper Simulation Optimization (SUBHSO) Algorithm for Clustering Big Data – Using Big Databases of Iran Electricity Market","authors":"Mesbaholdin Salami, F. Sobhani, M. Ghazizadeh","doi":"10.2478/acss-2019-0007","DOIUrl":null,"url":null,"abstract":"Abstract Many real world problems have big data, including recorded fields and/or attributes. In such cases, data mining requires dimension reduction techniques because there are serious challenges facing conventional clustering methods in dealing with big data. The subspace selection method is one of the most important dimension reduction techniques. In such methods, a selected set of subspaces is substituted for the general dataset of the problem and clustering is done using this set. This article introduces the Shared Subscribe Hyper Simulation Optimization (SUBHSO) algorithm to introduce the optimized cluster centres to a set of subspaces. SUBHSO uses an optimization loop for modifying and optimizing the coordinates of the cluster centres with the particle swarm optimization (PSO) and the fitness function calculation using the Monte Carlo simulation. The case study on the big data of Iran electricity market (IEM) has shown the improvement of the defined fitness function, which represents the cluster cohesion and separation relative to other dimension reduction algorithms.","PeriodicalId":41960,"journal":{"name":"Applied Computer Systems","volume":"173 1","pages":"49 - 60"},"PeriodicalIF":0.5000,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Computer Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2478/acss-2019-0007","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 1
Abstract
Abstract Many real world problems have big data, including recorded fields and/or attributes. In such cases, data mining requires dimension reduction techniques because there are serious challenges facing conventional clustering methods in dealing with big data. The subspace selection method is one of the most important dimension reduction techniques. In such methods, a selected set of subspaces is substituted for the general dataset of the problem and clustering is done using this set. This article introduces the Shared Subscribe Hyper Simulation Optimization (SUBHSO) algorithm to introduce the optimized cluster centres to a set of subspaces. SUBHSO uses an optimization loop for modifying and optimizing the coordinates of the cluster centres with the particle swarm optimization (PSO) and the fitness function calculation using the Monte Carlo simulation. The case study on the big data of Iran electricity market (IEM) has shown the improvement of the defined fitness function, which represents the cluster cohesion and separation relative to other dimension reduction algorithms.