MDBSCAN: Multi-level Density Based Spatial Clustering of Applications with Noise
Authors: Shimei Wang, Yun Liu, Bo Shen
Published in: Proceedings of the 11th International Knowledge Management in Organizations Conference on The changing face of Knowledge Management Impacting Society, 2016-07-25
DOI: https://doi.org/10.1145/2925995.2926040
With the rapid development of information technology, increasingly complex data is being produced, and mining valuable information from it has practical significance. Clustering is an important research topic in data mining. As a density-based clustering algorithm, DBSCAN is sensitive to its input parameters and has difficulty finding all meaningful clusters in datasets with varied densities. To address this shortcoming, this paper proposes the MDBSCAN algorithm. The algorithm generates two different density parameters by a statistical method, which makes clustering more accurate for datasets with varied densities. First, the algorithm uses an adjacency list to store the graph generated from the dataset with a single parameter, Eps. The adjacency list built in this first step is then used to generate the two density parameters, MinPts0 and MinPts1. Then, based on these parameters and the adjacency list, the clustering is carried out. Finally, experimental results show that, compared with DBSCAN, the proposed algorithm achieves higher accuracy on datasets with varied densities at a similar running time.