Spectral clustering using the kNN-MST similarity graph
Patrick Veenstra, C. Cooper, S. Phelps
2016 8th Computer Science and Electronic Engineering (CEEC), 2016-09-01
DOI: 10.1109/CEEC.2016.7835917 (https://doi.org/10.1109/CEEC.2016.7835917)
Citations: 11
Abstract
Spectral clustering is a technique that uses the spectrum of a similarity graph to cluster data. Part of this procedure involves calculating the similarity between data points and creating a similarity graph from the resulting similarity matrix. This is ordinarily achieved by creating a k-nearest neighbour (kNN) graph. In this paper, we show the benefits of using a different similarity graph, namely the union of the kNN graph and the minimum spanning tree of the negated similarity matrix (kNN-MST). We show that this has some distinct advantages on both synthetic and real datasets. Specifically, the clustering accuracy of kNN-MST is less dependent on the choice of k than kNN is.
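The graph construction described in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the Gaussian kernel and its `sigma` parameter, the rule that an undirected kNN edge is kept when either endpoint lists the other as a neighbour, the use of Prim's algorithm, and all function names are choices made here for the example. The only parts taken from the abstract are the kNN graph, the minimum spanning tree of the negated similarity matrix, and their union.

```python
import math


def gaussian_similarity(points, sigma=1.0):
    """Dense similarity matrix. The Gaussian kernel is an assumption of this sketch."""
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            sim[i][j] = sim[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return sim


def knn_edges(sim, k):
    """Undirected kNN graph: keep (i, j) if j is among the k points most similar to i."""
    n = len(sim)
    edges = set()
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: sim[i][j], reverse=True)[:k]
        for j in nearest:
            edges.add((min(i, j), max(i, j)))
    return edges


def mst_edges(sim):
    """Minimum spanning tree of the negated similarity matrix, via Prim's algorithm."""
    n = len(sim)
    if n == 0:
        return set()
    in_tree = [False] * n
    best_cost = [math.inf] * n   # cheapest known edge weight (-similarity) into the tree
    best_from = [-1] * n         # tree endpoint of that cheapest edge
    best_cost[0] = -math.inf     # forces node 0 to be picked first
    edges = set()
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best_cost[v])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.add((min(u, best_from[u]), max(u, best_from[u])))
        for v in range(n):
            if not in_tree[v] and -sim[u][v] < best_cost[v]:
                best_cost[v] = -sim[u][v]
                best_from[v] = u
    return edges


def knn_mst_edges(sim, k):
    """kNN-MST graph: the union of the kNN edges and the MST edges."""
    return knn_edges(sim, k) | mst_edges(sim)


def is_connected(n, edges):
    """Depth-first search over an undirected edge set on nodes 0..n-1."""
    adjacency = {i: [] for i in range(n)}
    for i, j in edges:
        adjacency[i].append(j)
        adjacency[j].append(i)
    seen, stack = {0}, [0]
    while stack:
        for v in adjacency[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


# Two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sim = gaussian_similarity(points)
knn_only = knn_edges(sim, k=1)
combined = knn_mst_edges(sim, k=1)
```

With a small `k`, the plain kNN graph on this toy data splits into one component per group, while the union with the spanning tree is connected by construction. That is one plausible reading of why the combined graph would be less sensitive to the choice of k. The resulting edge set would then be weighted by `sim` and passed to whatever spectral clustering routine is in use.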