Study of High-Dimensional Data Analysis based on Clustering Algorithm
Ping Zong, J. Jiang, Jun Qin
2020 15th International Conference on Computer Science & Education (ICCSE), 2020-08-01
DOI: https://doi.org/10.1109/ICCSE49874.2020.9201656
Citations: 2
Abstract
With the rapid growth of big data, the scale, dimensionality, diversity, and sparsity of high-dimensional data limit the effectiveness of traditional clustering algorithms. This paper focuses on clustering high-dimensional data. Starting from the traditional K-means algorithm and a subspace clustering algorithm based on the self-representation model, it designs and implements an improved algorithm. The improved algorithm achieves better clustering quality by combining the "distance optimization method" and the "density method" to determine the initial cluster centers. Simulation experiments verify the feasibility and effectiveness of the improved algorithm.
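The abstract does not spell out how the distance and density criteria are combined, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes density is estimated as the inverse mean distance to a point's nearest neighbours, and that each new center maximizes density times distance to the centers already chosen. The function names `density_distance_init` and `kmeans` and the parameter `n_neighbors` are hypothetical.

```python
import numpy as np

def density_distance_init(X, k, n_neighbors=5):
    """Pick k initial centers from X (assumed scoring rule, see lead-in)."""
    # Pairwise Euclidean distances, shape (n, n).
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # Assumed density estimate: inverse mean distance to the
    # n_neighbors nearest neighbours (column 0 is the point itself).
    knn = np.sort(dist, axis=1)[:, 1:n_neighbors + 1]
    density = 1.0 / (knn.mean(axis=1) + 1e-12)
    # First center: the densest point.
    chosen = [int(np.argmax(density))]
    while len(chosen) < k:
        # Distance from every point to its closest chosen center.
        min_dist = dist[:, chosen].min(axis=1)
        # Favor points that are both dense and far from existing centers.
        chosen.append(int(np.argmax(density * min_dist)))
    return X[chosen].copy()

def kmeans(X, centers, n_iter=100):
    """Standard Lloyd iterations starting from the given centers."""
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the returned centers.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
    return d.argmin(axis=1), centers
```

A call such as `labels, centers = kmeans(X, density_distance_init(X, k=3))` would run the sketch on an `(n, d)` array `X`. The full pairwise distance matrix costs O(n²) memory, so this form suits only small illustrative datasets.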