SSFuzzyART: A Semi-Supervised Fuzzy ART through seeding initialization and a clustered data generation algorithm to deeply study clustering solutions
Siwar Jendoubi, Aurélien Baelde, Thong Tran
Array, volume 19, Article 100319 (Journal Article). Published 2023-09-01.
DOI: 10.1016/j.array.2023.100319
https://www.sciencedirect.com/science/article/pii/S2590005623000449
Citation count: 0
Abstract
Semi-supervised clustering is a machine learning technique introduced to boost clustering performance when labeled data is available. In practice, some labeled data is usually available in real use cases and can be used to initialize and guide the clustering process, making it more efficient. Fuzzy ART is a clustering technique that has proved efficient in several real-world cases, but as an unsupervised algorithm it cannot use available labeled data. This paper introduces a semi-supervised variant of the Fuzzy ART clustering algorithm (SSFuzzyART). The proposed solution uses the available labeled data to initialize cluster centers. In addition, to evaluate the characteristics of the proposed algorithm in depth, the paper introduces a clustered binary data generation algorithm with controlled partitioning; the controlled generated clusters make it possible to study the behavior of SSFuzzyART. Furthermore, a set of experiments is carried out on several available benchmarks. SSFuzzyART demonstrated better clustering prediction results than its classic counterpart.
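The idea of seeding Fuzzy ART with labeled data can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the seeding rule, so the sketch assumes each seed is the element-wise minimum (fuzzy AND) of the complement-coded labeled samples of one class. The function names (`complement_code`, `seed_weights`, `fuzzy_art_fit`) and the default values of the vigilance `rho`, choice parameter `alpha`, and learning rate `beta` are hypothetical choices made for illustration; the category choice, vigilance test, and weight update follow the standard Fuzzy ART formulation.

```python
import numpy as np

def complement_code(X):
    """Standard Fuzzy ART complement coding: a -> [a, 1 - a]."""
    X = np.asarray(X, dtype=float)
    return np.hstack([X, 1.0 - X])

def seed_weights(X_labeled, y_labeled):
    """ASSUMPTION: one seed per class, the fuzzy AND of its labeled samples."""
    I = complement_code(X_labeled)
    y = np.asarray(y_labeled)
    classes = np.unique(y)
    W = np.array([I[y == c].min(axis=0) for c in classes])
    return W, classes

def fuzzy_art_fit(X, W_init=None, rho=0.6, alpha=0.001, beta=1.0, epochs=1):
    """Fuzzy ART; passing W_init gives the seeded (semi-supervised) variant."""
    I_all = complement_code(X)
    W = [] if W_init is None else [w.copy() for w in W_init]
    assignments = np.empty(len(I_all), dtype=int)
    for _ in range(epochs):
        for n, I in enumerate(I_all):
            assigned = -1
            if W:
                Wm = np.array(W)
                overlap = np.minimum(I, Wm).sum(axis=1)   # |I ^ w_j|
                T = overlap / (alpha + Wm.sum(axis=1))    # choice function
                for j in np.argsort(-T):                  # best match first
                    if overlap[j] / I.sum() >= rho:       # vigilance test
                        W[j] = beta * np.minimum(I, W[j]) + (1 - beta) * W[j]
                        assigned = int(j)
                        break
            if assigned < 0:                              # no resonance: new category
                W.append(I.copy())
                assigned = len(W) - 1
            assignments[n] = assigned
    return np.array(W), assignments

# Toy binary data (hypothetical): two labeled samples seed two clusters.
W0, classes = seed_weights([[1, 1, 0, 0], [0, 0, 1, 1]], [0, 1])
W, assignments = fuzzy_art_fit([[1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 1, 0]], W_init=W0)
```

Because the seeds occupy the first category indices, a cluster index returned for an unlabeled sample maps back to a known class through `classes`, as long as the sample resonates with a seeded category rather than creating a new one.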