An Eigengap Ratio Test for Determining the Number of Communities in Network Data

Yujia Wu, Jingfei Zhang, Wei Lan, Chih-Ling Tsai
arXiv - STAT - Methodology, published 2024-09-09. DOI: arxiv-2409.05276 (https://doi.org/arxiv-2409.05276)

Abstract

To characterize the community structure in network data, researchers have introduced various block-type models, including the stochastic block model, the degree-corrected stochastic block model, the mixed membership block model, the degree-corrected mixed membership block model, and others. A critical step in applying these models effectively is determining the number of communities in the network. However, to our knowledge, existing methods for estimating the number of network communities often require model estimation or are unable to simultaneously account for network sparsity and a divergent number of communities. In this paper, we propose an eigengap-ratio-based test that addresses these challenges. The test is straightforward to compute, requires no parameter tuning, and can be applied to a wide range of block models without the need to estimate network distribution parameters. Furthermore, it is effective for both dense and sparse networks with a divergent number of communities. We show that the proposed test statistic converges to a function of the type-I Tracy-Widom distributions under the null hypothesis, and that the test is asymptotically powerful under alternatives. Simulation studies on both dense and sparse networks demonstrate the efficacy of the proposed method. Three real-world examples are presented to illustrate the usefulness of the proposed test.
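The abstract does not define the test statistic, so the following is only a rough illustration of why eigengaps carry information about the number of communities. It is not the authors' test. It assumes a balanced stochastic block model and uses a naive "largest ratio of consecutive eigengaps" heuristic on the adjacency spectrum. The function names (`simulate_sbm`, `eigengap_ratios`, `naive_num_communities`) and all parameter choices are hypothetical, and no Tracy-Widom calibration is attempted.

```python
import numpy as np


def simulate_sbm(n, k, p_in, p_out, seed=0):
    """Draw a symmetric 0/1 adjacency matrix from a balanced SBM (illustrative)."""
    rng = np.random.default_rng(seed)
    labels = np.arange(n) % k                      # balanced labels: an assumption
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)            # edge probability per node pair
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # strict upper triangle, no self-loops
    return (upper | upper.T).astype(float)


def eigengap_ratios(adj, max_k):
    """Ratios of consecutive gaps between the largest eigenvalue magnitudes.

    Entry i is (m[i] - m[i+1]) / (m[i+1] - m[i+2]), where m holds the
    eigenvalue magnitudes of adj sorted in decreasing order.
    """
    eigvals = np.linalg.eigvalsh(adj)              # adj is symmetric
    mags = np.sort(np.abs(eigvals))[::-1][: max_k + 2]
    gaps = mags[:-1] - mags[1:]
    return gaps[:-1] / gaps[1:]


def naive_num_communities(adj, max_k=10):
    """Heuristic guess: the position of the largest eigengap ratio.

    This is a toy selection rule, not the hypothesis test of the paper.
    """
    return int(np.argmax(eigengap_ratios(adj, max_k))) + 1


# Deterministic toy spectrum: three large eigenvalues, then small ones.
toy = np.diag([40.0, -30.0, 20.0, 2.0, -1.0, 0.5])
print(naive_num_communities(toy, max_k=4))  # prints 3

# Random SBM draw; the heuristic's output here depends on the sample.
adj = simulate_sbm(n=300, k=3, p_in=0.3, p_out=0.05, seed=0)
print(eigengap_ratios(adj, max_k=6))
```

In the toy spectrum the gap after the third eigenvalue is much larger than the gap that follows it, so the third ratio dominates. On sampled networks the ratios among the small "noise" eigenvalues can also be large by chance, which is one reason a properly calibrated test, such as the one the paper proposes, is needed rather than a raw argmax.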