Title: Applications of dual regularized Laplacian matrix for community detection
Authors: Huan Qing, Jingli Wang
Journal: Advances in Data Analysis and Classification, 18(4), pp. 1001-1043 (JCR Q2, Statistics & Probability; impact factor 1.4)
Published: 2023-10-26
DOI: 10.1007/s11634-023-00565-3
URL: https://link.springer.com/article/10.1007/s11634-023-00565-3
Citations: 0
Abstract
Spectral clustering is widely used for community detection in networks, and a small change to the graph Laplacian matrix can bring a dramatic improvement. In this paper, we propose a dual regularized graph Laplacian matrix and apply it to the classical spectral clustering approach under the degree-corrected stochastic block model. When the number of communities K is known, we use more than K leading eigenvectors, weighted by their corresponding eigenvalues, in the spectral clustering procedure to improve performance. We call the resulting method dual regularized spectral clustering (DRSC). Theoretical analysis shows that, under mild conditions, DRSC yields stable and consistent community detection. We also develop a strategy that combines DRSC with Newman's modularity to estimate the number of communities K. We compare DRSC with several spectral methods and study the behavior of our strategy for estimating K on extensive simulated networks and on real-world networks. Numerical results show that DRSC performs satisfactorily and that our strategy for estimating K is accurate and consistent, even when a network contains only one community.
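The abstract only outlines the procedure, so the following is a minimal sketch under stated assumptions, not the authors' algorithm. It assumes that "dual regularized" means applying a degree-regularized normalization twice, that "leading" means largest in magnitude, and that exactly K + 1 eigenvectors are used. The regularizer defaults, the simple k-means routine, and all function names (`regularized_laplacian`, `weighted_embedding`, `estimate_K`, and so on) are hypothetical choices made for illustration. Only the modularity formula is the standard Newman definition.

```python
import numpy as np

def regularized_laplacian(M, reg):
    """Return D^{-1/2} M D^{-1/2} with D = diag(row sums of M) + reg * I."""
    inv_sqrt = 1.0 / np.sqrt(M.sum(axis=1) + reg)
    return M * inv_sqrt[:, None] * inv_sqrt[None, :]

def weighted_embedding(A, K, tau=None, delta=None):
    """Row-normalized, eigenvalue-weighted embedding from K + 1 eigenvectors."""
    # ASSUMPTION: both regularizers default to a mean row sum (not from the paper).
    tau = A.sum(axis=1).mean() if tau is None else tau
    L_tau = regularized_laplacian(A, tau)
    delta = L_tau.sum(axis=1).mean() if delta is None else delta
    # ASSUMPTION: "dual" is modeled as regularizing and normalizing a second time.
    L_delta = regularized_laplacian(L_tau, delta)
    vals, vecs = np.linalg.eigh(L_delta)
    top = np.argsort(-np.abs(vals))[:K + 1]   # K + 1 leading eigenpairs by magnitude
    X = vecs[:, top] * vals[top]              # weight each eigenvector by its eigenvalue
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, 1e-12)       # row normalization for degree correction

def kmeans(X, K, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, K):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):           # leave empty clusters untouched
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def spectral_communities(A, K):
    return kmeans(weighted_embedding(A, K), K)

def modularity(A, labels):
    """Newman's modularity Q = (1/2m) * sum_ij (A_ij - d_i d_j / 2m) [c_i == c_j]."""
    d = A.sum(axis=1)
    two_m = d.sum()
    same = labels[:, None] == labels[None, :]
    return ((A - np.outer(d, d) / two_m) * same).sum() / two_m

def estimate_K(A, K_max):
    """ASSUMPTION: choose the smallest K in 1..K_max that maximizes modularity."""
    best_K, best_Q = 1, -np.inf
    for K in range(1, K_max + 1):
        Q = modularity(A, spectral_communities(A, K))
        if Q > best_Q:
            best_K, best_Q = K, Q
    return best_K

# Toy network: two 8-node cliques joined by a single edge.
n = 8
A = np.zeros((2 * n, 2 * n))
A[:n, :n] = 1.0
A[n:, n:] = 1.0
np.fill_diagonal(A, 0.0)
A[n - 1, n] = A[n, n - 1] = 1.0

labels = spectral_communities(A, 2)
K_hat = estimate_K(A, 4)
print(labels, K_hat, modularity(A, labels))
```

On this toy network the two-clique partition has modularity 55/114 (about 0.48), while a single community always has modularity 0, which is why a modularity-based rule can in principle also report K = 1. Real networks would need the regularizers and the range of candidate K chosen with more care than the placeholder defaults used here.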
About the journal:
The international journal Advances in Data Analysis and Classification (ADAC) is designed as a forum for high-standard publications on research and applications concerning the extraction of knowable aspects from many types of data. It publishes articles on such topics as structural, quantitative, or statistical approaches for the analysis of data; advances in classification, clustering, and pattern recognition methods; strategies for modeling complex data and mining large data sets; methods for the extraction of knowledge from data; and applications of advanced methods in specific domains of practice. Articles illustrate how new domain-specific knowledge can be made available from data by skillful use of data analysis methods. The journal also publishes survey papers that outline and illuminate the basic ideas and techniques of special approaches.