Zihua Zhao;Feiping Nie;Rong Wang;Zheng Wang;Xuelong Li
Title: An Balanced, and Scalable Graph-Based Multiview Clustering Method
Journal: IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 12, pp. 7643-7656
Published: 2024-08-14
DOI: 10.1109/TKDE.2024.3443534
URL: https://ieeexplore.ieee.org/document/10636812/
Impact factor: 8.9 (JCR Q1, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Graph-based multiview clustering has become an active research topic in recent years. However, most existing methods do not consider cluster balance in their results, even though balance is crucial in many real-world scenarios. Graph-based multiview clustering methods also tend to be time-consuming and cannot handle large-scale datasets. To address these issues, this paper proposes a novel graph-based multiview clustering method built on the bipartite graph. Specifically, it uses a label propagation mechanism to update the smaller anchor label matrix rather than the sample label matrix, which significantly reduces the computational cost. A balance constraint in the model promotes balanced clustering results. The model combines information from multiple views through graph fusion, and the joint graph and view weight parameters are obtained through task-driven self-supervised learning. Moreover, the model obtains clustering results directly, without the two-stage processing typically used in spectral clustering. Finally, extensive experiments on toy and real-world datasets validate the superiority of the proposed method in terms of clustering performance, cluster balance, and running time.
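The anchor-based idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the k-nearest-anchor Gaussian graph, the uniform view weights, the fixed anchor labels, and all function names (`bipartite_graph`, `fuse_graphs`, `propagate`) are assumptions made for illustration, and the balance constraint and the learned view weights are omitted. It only shows why working with an anchor label matrix Z of size m x c is cheaper than working with a sample label matrix of size n x c when m is much smaller than n.

```python
import numpy as np


def bipartite_graph(X, anchors, k=3):
    """Row-stochastic sample-to-anchor graph B of shape (n, m).

    Each sample is linked to its k nearest anchors with Gaussian weights.
    This construction is an assumption; the abstract does not specify one.
    """
    # Squared Euclidean distances between samples and anchors, shape (n, m).
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]        # indices of k nearest anchors
    rows = np.arange(X.shape[0])[:, None]
    dk = d2[rows, nearest]                         # (n, k) nearest distances
    scale = dk.mean(axis=1, keepdims=True) + 1e-12 # per-sample bandwidth
    B = np.zeros_like(d2)
    B[rows, nearest] = np.exp(-dk / scale)
    return B / B.sum(axis=1, keepdims=True)


def fuse_graphs(graphs, weights):
    """Weighted sum of per-view bipartite graphs (weights assumed to sum to 1)."""
    return sum(w * B for w, B in zip(weights, graphs))


def propagate(B, Z):
    """Propagate anchor labels to samples: F = B Z, then take the row-wise argmax."""
    return (B @ Z).argmax(axis=1)


# Toy data: two views of the same 200 samples drawn from two clusters.
rng = np.random.default_rng(0)
n, c, k = 200, 2, 3
y = np.repeat([0, 1], n // 2)
views = [
    rng.normal(size=(n, 2)) + 6.0 * y[:, None],
    rng.normal(size=(n, 3)) - 6.0 * y[:, None],
]

# Anchors are the same sample indices in every view, so anchor j means the
# same object across views and the per-view graphs can be added.
anchor_idx = np.concatenate([
    rng.choice(n // 2, 5, replace=False),
    n // 2 + rng.choice(n // 2, 5, replace=False),
])
graphs = [bipartite_graph(X, X[anchor_idx], k=k) for X in views]
B = fuse_graphs(graphs, weights=[0.5, 0.5])

# Hypothetical anchor label matrix Z (m x c). In the paper this is the matrix
# being optimized; here it is simply fixed to one-hot labels.
Z = np.eye(c)[y[anchor_idx]]
labels = propagate(B, Z)
print("anchor label matrix:", Z.shape, "sample labels:", labels.shape)
print("cluster sizes:", np.bincount(labels, minlength=c))
```

With m = 10 anchors and n = 200 samples, the only label matrix being stored is 10 x 2, and sample labels are read off with a single product B Z.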
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.