{"title":"k -均值问题遗传算法中变异算子的比较研究","authors":"Ri-Zhi Li, L. Kazakovtsev","doi":"10.22190/FUMI2004091L","DOIUrl":null,"url":null,"abstract":"The k-means problem and the algorithm of the same name are the most commonly used clustering model and algorithm. Being a local search optimization method, the k-means algorithm falls to a local minimum of the objective function (sum of squared errors) and depends on the initial solution which is given or selected randomly. This disadvantage of the algorithm can be avoided by combining this algorithm with more sophisticated methods such as the Variable Neighborhood Search, agglomerative or dissociative heuristic approaches, the genetic algorithms, etc. Aiming at the shortcomings of the k-means algorithm and combining the advantages of the k-means algorithm and rvolutionary approack, a genetic clustering algorithm with the cross-mutation operator was designed. The efficiency of the genetic algorithms with the tournament selection, one-point crossover and various mutation operators (without any mutation operator, with the uniform mutation, DBM mutation and new cross-mutation) are compared on the data sets up to 2 millions of data vectors. We used data from the UCI repository and special data set collected during the testing of the highly reliable semiconductor components. In this paper, we do not discuss the comparative efficiency of the genetic algorithms for the k-means problem in comparison with the other (non-genetic) algorithms as well as the comparative adequacy of the k-means clustering model. Here, we focus on the influence of various mutation operators on the efficiency of the genetic algorithms only.","PeriodicalId":54148,"journal":{"name":"Facta Universitatis-Series Mathematics and Informatics","volume":"1 1","pages":""},"PeriodicalIF":0.5000,"publicationDate":"2021-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"COMPARATIVE STUDY OF MUTATION OPERATORS IN THE GENETIC ALGORITHMS FOR THE K-MEANS PROBLEM\",\"authors\":\"Ri-Zhi Li, L. Kazakovtsev\",\"doi\":\"10.22190/FUMI2004091L\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The k-means problem and the algorithm of the same name are the most commonly used clustering model and algorithm. Being a local search optimization method, the k-means algorithm falls to a local minimum of the objective function (sum of squared errors) and depends on the initial solution which is given or selected randomly. This disadvantage of the algorithm can be avoided by combining this algorithm with more sophisticated methods such as the Variable Neighborhood Search, agglomerative or dissociative heuristic approaches, the genetic algorithms, etc. Aiming at the shortcomings of the k-means algorithm and combining the advantages of the k-means algorithm and rvolutionary approack, a genetic clustering algorithm with the cross-mutation operator was designed. The efficiency of the genetic algorithms with the tournament selection, one-point crossover and various mutation operators (without any mutation operator, with the uniform mutation, DBM mutation and new cross-mutation) are compared on the data sets up to 2 millions of data vectors. We used data from the UCI repository and special data set collected during the testing of the highly reliable semiconductor components. In this paper, we do not discuss the comparative efficiency of the genetic algorithms for the k-means problem in comparison with the other (non-genetic) algorithms as well as the comparative adequacy of the k-means clustering model. Here, we focus on the influence of various mutation operators on the efficiency of the genetic algorithms only.\",\"PeriodicalId\":54148,\"journal\":{\"name\":\"Facta Universitatis-Series Mathematics and Informatics\",\"volume\":\"1 1\",\"pages\":\"\"},\"PeriodicalIF\":0.5000,\"publicationDate\":\"2021-02-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Facta Universitatis-Series Mathematics and Informatics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.22190/FUMI2004091L\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"MATHEMATICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Facta Universitatis-Series Mathematics and Informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.22190/FUMI2004091L","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MATHEMATICS","Score":null,"Total":0}
COMPARATIVE STUDY OF MUTATION OPERATORS IN THE GENETIC ALGORITHMS FOR THE K-MEANS PROBLEM
The k-means problem and the algorithm of the same name are the most commonly used clustering model and algorithm. Being a local search optimization method, the k-means algorithm falls to a local minimum of the objective function (sum of squared errors) and depends on the initial solution which is given or selected randomly. This disadvantage of the algorithm can be avoided by combining this algorithm with more sophisticated methods such as the Variable Neighborhood Search, agglomerative or dissociative heuristic approaches, the genetic algorithms, etc. Aiming at the shortcomings of the k-means algorithm and combining the advantages of the k-means algorithm and rvolutionary approack, a genetic clustering algorithm with the cross-mutation operator was designed. The efficiency of the genetic algorithms with the tournament selection, one-point crossover and various mutation operators (without any mutation operator, with the uniform mutation, DBM mutation and new cross-mutation) are compared on the data sets up to 2 millions of data vectors. We used data from the UCI repository and special data set collected during the testing of the highly reliable semiconductor components. In this paper, we do not discuss the comparative efficiency of the genetic algorithms for the k-means problem in comparison with the other (non-genetic) algorithms as well as the comparative adequacy of the k-means clustering model. Here, we focus on the influence of various mutation operators on the efficiency of the genetic algorithms only.