Testing Thresholds for High-Dimensional Sparse Random Geometric Graphs

SIAM Journal on Computing | Pub Date: 2024-02-27 | DOI: 10.1137/23m1545203 | IF 1.2 | Zone 3 (Computer Science) | JCR Q3 (Computer Science, Theory & Methods)

Siqi Liu, Sidhanth Mohanty, Tselil Schramm, Elizabeth Yang



Abstract

SIAM Journal on Computing, Ahead of Print.
The random geometric graph model [math] is a distribution over graphs in which the edges capture a latent geometry. To sample [math], we identify each of our [math] vertices with an independently and uniformly sampled vector from the [math]-dimensional unit sphere [math], and we connect pairs of vertices whose vectors are “sufficiently close,” such that the marginal probability of an edge is [math]. Because of the underlying geometry, this model is natural for applications in data science and beyond. We investigate the problem of testing for this latent geometry, or, in other words, distinguishing an Erdős–Rényi graph [math] from a random geometric graph [math]. It is not too difficult to show that if [math] while [math] is held fixed, the two distributions become indistinguishable; we wish to understand how fast [math] must grow as a function of [math] for indistinguishability to occur. When [math] for constant [math], we prove that if [math], the total variation distance between the two distributions is close to 0; this improves upon the best previous bound of Brennan, Bresler, and Nagaraj (2020), which required [math], and further our result is nearly tight, resolving a conjecture of Bubeck, Ding, Eldan, and Rácz (2016) up to logarithmic factors. We also obtain improved upper bounds on the statistical indistinguishability thresholds in [math] for the full range of [math] satisfying [math], improving upon the previous bounds by polynomial factors. Our analysis uses the belief propagation algorithm to characterize the distributions of (subsets of) the random vectors conditioned on producing a particular graph. In this sense, our analysis is connected to the “cavity method” from statistical physics. To analyze this process, we rely on novel sharp estimates for the area of the intersection of a random sphere cap with an arbitrary subset of [math], which we prove using optimal transport maps and entropy-transport inequalities on the unit sphere. We believe these techniques may be of independent interest.
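
The sampling procedure described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' code: the symbols are elided in this copy of the abstract, so the names `n` (number of vertices), `d` (dimension), `p` (edge probability), and `tau` (inner-product threshold) are labels chosen here. "Sufficiently close" is taken to mean an inner product of at least `tau`. The threshold is calibrated by a Monte Carlo quantile estimate rather than an exact formula. The triangle count at the end is a simple clustering statistic used for illustration and is not the belief-propagation analysis of the paper.

```python
import math
import random
from itertools import combinations


def unit_vector(d, rng):
    """Uniform point on the unit sphere in R^d (normalized Gaussian vector)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def estimate_threshold(d, p, rng, trials=20000):
    """Monte Carlo estimate of tau such that P(<x, y> >= tau) is roughly p.

    By rotational symmetry, the inner product of two independent uniform
    unit vectors is distributed like a single coordinate of one of them.
    """
    samples = sorted(unit_vector(d, rng)[0] for _ in range(trials))
    return samples[min(trials - 1, int((1.0 - p) * trials))]


def sample_geometric(n, d, p, rng):
    """Edge set {(i, j), i < j} of a random geometric graph (assumed model)."""
    tau = estimate_threshold(d, p, rng)
    xs = [unit_vector(d, rng) for _ in range(n)]
    return {
        (i, j)
        for i, j in combinations(range(n), 2)
        if sum(a * b for a, b in zip(xs[i], xs[j])) >= tau
    }


def sample_erdos_renyi(n, p, rng):
    """Edge set {(i, j), i < j} with each pair included independently w.p. p."""
    return {(i, j) for i, j in combinations(range(n), 2) if rng.random() < p}


def count_triangles(n, edges):
    """Number of triangles; each triangle i < j < k is counted exactly once."""
    adj = [set() for _ in range(n)]
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    return sum(1 for i, j in edges for k in adj[i] & adj[j] if k > j)


if __name__ == "__main__":
    rng = random.Random(0)
    n, d, p = 200, 3, 0.05  # hypothetical parameters, chosen for a quick run
    geo = sample_geometric(n, d, p, rng)
    er = sample_erdos_renyi(n, p, rng)
    print("geometric   :", len(geo), "edges,", count_triangles(n, geo), "triangles")
    print("Erdos-Renyi :", len(er), "edges,", count_triangles(n, er), "triangles")
```

In low dimension the geometric sample typically shows many more triangles than the Erdős–Rényi sample at the same edge density. Increasing `d` with `n` fixed should make the two counts harder to tell apart, in line with the indistinguishability regime the abstract describes.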




Source journal

SIAM Journal on Computing (Engineering & Technology; Computer Science: Theory & Methods)

CiteScore: 4.60

Self-citation rate: 0.00%

Review time: 6-12 weeks

Journal description: The SIAM Journal on Computing aims to provide coverage of the most significant work going on in the mathematical and formal aspects of computer science and nonnumerical computing. Submissions must be clearly written and make a significant technical contribution. Topics include but are not limited to analysis and design of algorithms, algorithmic game theory, data structures, computational complexity, computational algebra, computational aspects of combinatorics and graph theory, computational biology, computational geometry, computational robotics, the mathematical aspects of programming languages, artificial intelligence, computational learning, databases, information retrieval, cryptography, networks, distributed computing, parallel algorithms, and computer architecture.
