Yousheng Gao, Raihah Aminuddin, Raseeda Hamzah, Li Ang, Siti Khatijah Nor Abdul Rahim
Semi-supervised spectral clustering using shared nearest neighbour for data with different shape and density
IAES International Journal of Artificial Intelligence (IJ-AI), published 2024-06-01
DOI: 10.11591/ijai.v13.i2.pp2283-2290 (https://doi.org/10.11591/ijai.v13.i2.pp2283-2290)
Citations: 0
Abstract
Spectral clustering algorithms have no supervisory information to draw on, which makes it difficult to construct suitable similarity graphs for data with complex shapes and varying densities. To address this issue, this paper proposes a semi-supervised spectral clustering algorithm based on shared nearest neighbors (SNN). The algorithm adds shared-nearest-neighbor information to the calculation of the distance matrix and uses pairwise constraints, which supply a portion of supervised information, to capture the relationship between pairs of data points. Comparative experiments were conducted on artificial datasets and on datasets from the University of California, Irvine (UCI) Machine Learning Repository. The results show that the proposed algorithm achieves better clustering results than traditional K-means and spectral clustering algorithms.
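The abstract does not give the formulas, so the following Python sketch only illustrates the general idea and is not the authors' implementation. It rests on several assumptions: the SNN score is taken as the fraction of k nearest neighbors that two points share, the distance is shrunk by the factor 1 / (1 + SNN), a Gaussian kernel with a hypothetical bandwidth `sigma` turns distances into similarities, must-link and cannot-link pairs overwrite the similarity with 1 and 0, and the final partition is a two-way split on the sign of the second eigenvector of the normalized Laplacian. All function and parameter names are illustrative.

```python
import numpy as np


def snn_affinity(X, k=5, sigma=1.0):
    """Gaussian affinity on an SNN-adjusted distance (assumed form, not the paper's)."""
    n = X.shape[0]
    # Pairwise Euclidean distances.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # k nearest neighbours of each point, excluding the point itself.
    d_knn = dist.copy()
    np.fill_diagonal(d_knn, np.inf)
    knn = np.argsort(d_knn, axis=1)[:, :k]
    # member[i, j] = 1 if j is among the k nearest neighbours of i.
    member = np.zeros((n, n))
    member[np.arange(n)[:, None], knn] = 1.0
    # snn[i, j] = fraction of the k neighbours that i and j share.
    snn = (member @ member.T) / k
    # Assumption: points sharing many neighbours are pulled closer together.
    adjusted = dist / (1.0 + snn)
    W = np.exp(-adjusted**2 / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)
    return W


def apply_constraints(W, must_link=(), cannot_link=()):
    """Overwrite affinities for constrained pairs (assumed rule: 1 and 0)."""
    W = W.copy()
    for i, j in must_link:
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:
        W[i, j] = W[j, i] = 0.0
    return W


def spectral_bipartition(W):
    """Two-way split from the second eigenvector of the normalized Laplacian."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)
```

A call such as `spectral_bipartition(apply_constraints(snn_affinity(X, k=5, sigma=3.0), must_link=[(0, 1)], cannot_link=[(0, 20)]))` would return one 0/1 label per row of `X`. Splitting into more than two clusters would need the usual extra step of running k-means on several eigenvectors, which is left out here to keep the sketch short.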