Authors: Ping Zong, J. Jiang, Jun Qin
Venue: 2020 15th International Conference on Computer Science & Education (ICCSE)
Published: 2020-08-01
DOI: https://doi.org/10.1109/ICCSE49874.2020.9201656
Study of High-Dimensional Data Analysis based on Clustering Algorithm
With the rapid growth of big data, the scale, dimensionality, diversity, and sparsity of high-dimensional data limit the effectiveness of traditional clustering algorithms. This paper focuses on clustering high-dimensional data. Starting from the traditional K-means algorithm and the subspace clustering algorithm based on the self-representation model, it designs and implements an improved clustering algorithm. The improved algorithm achieves better clustering quality by combining the "distance optimization method" and the "density method" to determine the initial cluster centers. Simulation experiments verify the feasibility and effectiveness of the improved algorithm.
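The abstract does not say how the "distance optimization method" and the "density method" are combined, so the following Python sketch is only one plausible reading, not the authors' algorithm. It assumes that density is the number of neighbours within a fixed radius, that the first center is the densest point, and that each later center maximizes density multiplied by the distance to the nearest center already chosen. The function names, the `radius` parameter, and the scoring rule are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def local_density(points, radius):
    """Count, for each point, how many other points lie within `radius`."""
    densities = []
    for i, p in enumerate(points):
        count = sum(
            1 for j, q in enumerate(points)
            if i != j and euclidean(p, q) <= radius
        )
        densities.append(count)
    return densities


def choose_initial_centers(points, k, radius):
    """Pick k initial centers that are both dense and far apart.

    Assumption (not from the paper): the first center is the densest point;
    each later center maximizes (density + 1) * distance to the nearest
    center chosen so far.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    density = local_density(points, radius)
    first = max(range(len(points)), key=lambda i: density[i])
    chosen = [first]
    while len(chosen) < k:
        def score(i):
            nearest = min(euclidean(points[i], points[c]) for c in chosen)
            # The +1 keeps isolated points (density 0) eligible as a last resort.
            return (density[i] + 1) * nearest
        candidates = [i for i in range(len(points)) if i not in chosen]
        chosen.append(max(candidates, key=score))
    return [points[i] for i in chosen]


if __name__ == "__main__":
    data = [
        (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # dense group A
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),              # dense group B
        (10.0, 0.0),                                     # isolated outlier
    ]
    print(choose_initial_centers(data, k=2, radius=0.5))
```

The returned points would then seed a standard K-means run in place of random initialization. Weighting distance by density is meant to keep an isolated outlier from being chosen just because it is far away, which is a known weakness of purely distance-based seeding.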