An Eigengap Ratio Test for Determining the Number of Communities in Network Data
Yujia Wu, Jingfei Zhang, Wei Lan, Chih-Ling Tsai
arXiv - STAT - Methodology, 2024-09-09
DOI: arxiv-2409.05276 (https://doi.org/arxiv-2409.05276)
Citations: 0
Abstract
To characterize the community structure in network data, researchers have
introduced various block-type models, including the stochastic block model,
degree-corrected stochastic block model, mixed membership block model,
degree-corrected mixed membership block model, and others. A critical step in
applying these models effectively is determining the number of communities in
the network. However, to our knowledge, existing methods for estimating the
number of network communities often require model estimation or are unable to
simultaneously account for network sparsity and a divergent number of
communities. In this paper, we propose an eigengap-ratio-based test that
addresses these challenges. The test is straightforward to compute, requires no
parameter tuning, and can be applied to a wide range of block models without
the need to estimate network distribution parameters. Furthermore, it is
effective for both dense and sparse networks with a divergent number of
communities. We show that the proposed test statistic converges to a function
of the type-I Tracy-Widom distributions under the null hypothesis, and that the
test is asymptotically powerful under alternatives. Simulation studies on both
dense and sparse networks demonstrate the efficacy of the proposed method.
Three real-world examples are presented to illustrate the usefulness of the
proposed test.
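
To make the idea of an eigengap ratio concrete, the following is a minimal sketch, not the authors' test statistic: the abstract does not state the exact statistic, its normalization, or the Tracy-Widom-based critical values, so none of those are reproduced here. The sketch assumes a balanced, assortative stochastic block model, and the function names (`simulate_sbm`, `eigengap_ratios`) and all parameter values are hypothetical choices made for illustration. It simulates an adjacency matrix and computes ratios of consecutive gaps between the leading eigenvalues.

```python
import numpy as np


def simulate_sbm(n, k, p_in, p_out, seed=0):
    """Draw a symmetric 0/1 adjacency matrix from a balanced k-block SBM.

    Illustrative helper (not from the paper): nodes are assigned to blocks
    round-robin, edges within a block appear with probability p_in and
    edges across blocks with probability p_out. No self-loops.
    """
    rng = np.random.default_rng(seed)
    labels = np.arange(n) % k
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, p_in, p_out)
    # Sample the strict upper triangle, then mirror it to enforce symmetry.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return adj, labels


def eigengap_ratios(adj, k_max):
    """Return ratios of consecutive gaps between the leading eigenvalues.

    With eigenvalues sorted in decreasing order (lam_1 >= lam_2 >= ...),
    entry k-1 of the result is
        (lam_k - lam_{k+1}) / (lam_{k+1} - lam_{k+2}),  k = 1, ..., k_max.
    This is a generic, unnormalized ratio chosen for illustration only.
    """
    lam = np.linalg.eigvalsh(adj)[::-1][: k_max + 2]  # largest first
    gaps = lam[:-1] - lam[1:]
    return gaps[:-1] / gaps[1:]


# Assumed toy setting: 600 nodes, 3 equally sized communities.
adj, labels = simulate_sbm(n=600, k=3, p_in=0.30, p_out=0.05)
ratios = eigengap_ratios(adj, k_max=6)
for k, r in enumerate(ratios, start=1):
    print(f"k={k}: gap ratio = {r:.2f}")
```

In this toy setting the ratio at k = 3 has a numerator that spans the distance from the last community-driven eigenvalue to the noise bulk, while its denominator is a gap inside the bulk, so that ratio tends to be large. Ratios at other k can also be large when the denominator gap happens to be small (for example, between two nearly equal signal eigenvalues), which is one reason a formal test with a known null distribution is preferable to reading the ratios by eye.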