Hao Lin, Hongfu Liu, Junjie Wu, Hong Li, Stephan Günnemann
{"title":"Algorithm xxxx: KCC: A MATLAB Package for K-means-based Consensus Clustering","authors":"Hao Lin, Hongfu Liu, Junjie Wu, Hong Li, Stephan Günnemann","doi":"10.1145/3616011","DOIUrl":null,"url":null,"abstract":"Consensus clustering is gaining increasing attention for its high quality and robustness. In particular, K-means-based Consensus Clustering (KCC) converts the usual computationally expensive problem to a classic K-means clustering with generalized utility functions, bringing potentials for large-scale data clustering on different types of data. Despite KCC’s applicability and generalizability, implementing this method such as representing the binary data set in the K-means heuristic is challenging, and has seldom been discussed in prior work. To fill this gap, we present a MATLAB package, KCC, that completely implements the KCC framework, and utilizes a sparse representation technique to achieve a low space complexity. Compared to alternative consensus clustering packages, the KCC package is of high flexibility, efficiency, and effectiveness. Extensive numerical experiments are also included to show its usability on real-world data sets.","PeriodicalId":50935,"journal":{"name":"ACM Transactions on Mathematical Software","volume":" ","pages":""},"PeriodicalIF":2.7000,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Mathematical Software","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3616011","RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 1
Abstract
Consensus clustering is gaining increasing attention for its high quality and robustness. In particular, K-means-based Consensus Clustering (KCC) converts the usual computationally expensive problem to a classic K-means clustering with generalized utility functions, bringing potentials for large-scale data clustering on different types of data. Despite KCC’s applicability and generalizability, implementing this method such as representing the binary data set in the K-means heuristic is challenging, and has seldom been discussed in prior work. To fill this gap, we present a MATLAB package, KCC, that completely implements the KCC framework, and utilizes a sparse representation technique to achieve a low space complexity. Compared to alternative consensus clustering packages, the KCC package is of high flexibility, efficiency, and effectiveness. Extensive numerical experiments are also included to show its usability on real-world data sets.
期刊介绍:
As a scientific journal, ACM Transactions on Mathematical Software (TOMS) documents the theoretical underpinnings of numeric, symbolic, algebraic, and geometric computing applications. It focuses on analysis and construction of algorithms and programs, and the interaction of programs and architecture. Algorithms documented in TOMS are available as the Collected Algorithms of the ACM at calgo.acm.org.