Federated Momentum Contrastive Clustering
Authors: Runxuan Miao, Erdem Koyuncu
Journal: ACM Transactions on Intelligent Systems and Technology (Journal Article)
Publication date: 2024-03-26
DOI: 10.1145/3653981 (https://doi.org/10.1145/3653981)
Impact factor: 7.2; JCR: Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Self-supervised representation learning and deep clustering are mutually beneficial for learning high-quality representations and clustering data simultaneously in centralized settings. However, it is not always feasible to gather large amounts of data at a central entity, considering data privacy requirements and computational resources. Federated Learning (FL) has been developed successfully to aggregate a global model while training on distributed local data, respecting the data privacy of edge devices. However, most FL research effort focuses on supervised learning algorithms. A fully unsupervised federated clustering scheme has not been considered in the existing literature. We present federated momentum contrastive clustering (FedMCC), a generic federated clustering framework that can not only cluster data automatically but also extract discriminative representations by training on local data distributed over multiple users. In FedMCC, we demonstrate a two-stage federated learning paradigm where the first stage aims to learn differentiable instance embeddings and the second stage accounts for clustering data automatically. The experimental results show that FedMCC not only achieves superior clustering performance but also outperforms several existing federated self-supervised methods on linear evaluation and semi-supervised learning tasks. Additionally, FedMCC can easily be adapted to ordinary centralized clustering through what we call momentum contrastive clustering (MCC). We show that MCC achieves state-of-the-art clustering accuracy on certain datasets such as STL-10 and ImageNet-10. We also present a method to reduce the memory footprint of our clustering schemes.
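The abstract gives no algorithmic details, so the following is only a generic sketch of two standard building blocks that the names "federated" and "momentum contrastive" suggest: size-weighted averaging of client parameters (in the style of FedAvg) and an exponential-moving-average update of a momentum encoder (in the style of MoCo). It is not the authors' FedMCC procedure. The function names, the dictionary-of-lists weight format, and the momentum value are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical weight format: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its local dataset size."""
    total = sum(client_sizes)
    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


def momentum_update(key_encoder: Weights, query_encoder: Weights, m: float = 0.99) -> Weights:
    """Exponential moving average: key <- m * key + (1 - m) * query."""
    return {
        name: [m * k + (1.0 - m) * q for k, q in zip(key_encoder[name], query_encoder[name])]
        for name in key_encoder
    }


# Two toy clients holding 100 and 300 local samples (made-up numbers).
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
sizes = [100, 300]
global_weights = federated_average(clients, sizes)
print(global_weights)
```

In a real system the raw data would stay on each client and only parameters such as these would be sent to the server for aggregation, which is the privacy property the abstract refers to.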
About the journal:
ACM Transactions on Intelligent Systems and Technology is a scholarly journal that publishes the highest quality papers on intelligent systems, applicable algorithms and technology with a multi-disciplinary perspective. An intelligent system is one that uses artificial intelligence (AI) techniques to offer important services (e.g., as a component of a larger system) to allow integrated systems to perceive, reason, learn, and act intelligently in the real world.
ACM TIST is published bimonthly (six issues a year). Each issue has 8-11 regular papers, with around 20 published journal pages or 10,000 words per paper. Additional references, proofs, graphs, or detailed experimental results can be submitted as a separate appendix, while excessively lengthy papers will be rejected automatically. Authors can include online-only appendices for additional content of their published papers and are encouraged to share their code and/or data with other readers.