Kernel-Based k-Representatives Algorithm for Fuzzy Clustering of Categorical Data
Toan Nguyen Mau, V. Huynh
2021 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), published 2021-07-11
DOI: https://doi.org/10.1109/FUZZ45933.2021.9494597
Citations: 3
Abstract
Fuzzy cluster analysis addresses unclear boundaries between clusters by grouping objects into fuzzy clusters based on their similarities. In this paper, we propose a new method for fuzzy clustering of data with categorical attributes. Specifically, we first introduce a kernel-based representation of cluster centers, in which the underlying distribution of categorical values within a cluster center is estimated as a weighted sum of the uniform distribution and the values' frequency distribution. We then extend the k-centers clustering method by applying this cluster center representation to fuzzy clustering of categorical data. The effectiveness and efficiency of the proposed method are demonstrated through experiments on 16 real-world datasets and comparisons with existing methods. In addition, our research can be regarded as the first attempt to apply to the clustering of categorical data a fuzzy silhouette scoring method that accounts for both the internal coherence and the external separation of fuzzy clusters.
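
The cluster center representation described in the abstract, a weighted sum of the uniform distribution and the frequency distribution, can be illustrated with a minimal Python sketch for a single categorical attribute. This is not the authors' implementation: the function name `kernel_center`, the mixing weight `lam`, and the optional per-object `weights` (standing in for fuzzy membership degrees) are assumptions made for illustration, and the abstract does not say how the paper chooses the weighting.

```python
def kernel_center(values, domain, lam=0.2, weights=None):
    """Estimate the distribution of one categorical attribute in one cluster.

    values  : attribute values of the objects assigned to the cluster
    domain  : all categories the attribute can take
    lam     : assumed mixing weight in [0, 1] given to the uniform distribution
    weights : assumed per-object weights (e.g. fuzzy memberships); default 1.0
    """
    if weights is None:
        weights = [1.0] * len(values)

    # (Weighted) frequency distribution of the observed values.
    total = sum(weights)
    freq = {category: 0.0 for category in domain}
    for value, weight in zip(values, weights):
        freq[value] += weight

    # Weighted sum of the uniform distribution and the frequency distribution.
    uniform = 1.0 / len(domain)
    return {
        category: lam * uniform + (1.0 - lam) * (freq[category] / total)
        for category in domain
    }


# Hypothetical toy data: three objects, three possible colours.
center = kernel_center(["red", "red", "blue"], ["red", "blue", "green"], lam=0.3)
```

In this sketch the unobserved category `green` still receives a non-zero probability from the uniform component, so every category in the domain keeps some mass in the center.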