Title: Two-sample testing for random graphs
Author: Xiaoyi Wen
Journal: Statistical Analysis and Data Mining (impact factor 2.1; JCR Q3, Computer Science, Artificial Intelligence)
Published: 2024-06-27
DOI: 10.1002/sam.11703 (https://doi.org/10.1002/sam.11703)
Citation count: 0
Abstract
Two-sample hypothesis testing for random graphs is widely used in fields such as the social sciences, neuroscience, and genetics. We develop a spectral two-sample hypothesis testing method for latent position random graphs. We propose two asymptotically normal test statistics, one for each of two models: the elementary Erdős–Rényi model and the more complex latent position random graph model. For the latter, the test statistic is estimated from the spectral embedding of the adjacency matrix. The proposed method achieves higher power than the conventional method based on mean estimation. To validate the testing procedure, we apply it to empirical biological data to detect structural differences in gene co-expression networks between COVID-19 patients and individuals unaffected by the disease.
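The abstract does not state the paper's test statistics, so the following is only a minimal sketch of the two generic ingredients it names, under stated assumptions. For the Erdős–Rényi case it uses a standard pooled two-proportion z-statistic on edge densities, which is asymptotically normal under the null of equal edge probability; for the latent position case it shows a common form of adjacency spectral embedding (top-d eigenpairs by magnitude, scaled by the square root of the absolute eigenvalues). The function names, graph sizes, and edge probabilities are hypothetical and chosen for illustration; this is not the author's procedure.

```python
import numpy as np


def sample_er(n, p, rng):
    """Sample a symmetric adjacency matrix with zero diagonal from G(n, p)."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)


def er_two_sample_z(A, B):
    """Pooled two-proportion z-statistic on the edge densities of two graphs.

    Illustrative stand-in only; not the statistic proposed in the paper.
    """
    def edge_counts(M):
        n = M.shape[0]
        # Count edges over the strict upper triangle (each pair once).
        return M[np.triu_indices(n, k=1)].sum(), n * (n - 1) // 2

    e1, m1 = edge_counts(A)
    e2, m2 = edge_counts(B)
    pooled = (e1 + e2) / (m1 + m2)
    se = np.sqrt(pooled * (1 - pooled) * (1 / m1 + 1 / m2))
    return (e1 / m1 - e2 / m2) / se


def adjacency_spectral_embedding(A, d):
    """Embed nodes using the d largest-magnitude eigenpairs of A."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))


rng = np.random.default_rng(0)
A = sample_er(200, 0.10, rng)  # hypothetical first sample
B = sample_er(200, 0.15, rng)  # hypothetical second sample, denser
z = er_two_sample_z(A, B)
X = adjacency_spectral_embedding(A, d=2)
```

A large absolute value of `z` relative to standard normal quantiles would suggest different edge probabilities. The rows of `X` are estimated latent positions, which a latent position test could compare across graphs after accounting for the orthogonal non-identifiability of the embedding.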
About the journal:
Statistical Analysis and Data Mining addresses the broad area of data analysis, including statistical approaches, machine learning, data mining, and applications. Topics include statistical and computational approaches for analyzing massive and complex datasets, novel statistical and/or machine learning methods and theory, and state-of-the-art applications with high impact. Of special interest are articles that describe innovative analytical techniques, and discuss their application to real problems, in such a way that they are accessible and beneficial to domain experts across science, engineering, and commerce.
The focus of the journal is on papers which satisfy one or more of the following criteria:
Solve data analysis problems associated with massive, complex datasets.
Develop innovative statistical approaches, machine learning algorithms, or methods integrating ideas across disciplines, e.g., statistics, computer science, electrical engineering, and operations research.
Formulate and solve high-impact real-world problems that challenge existing paradigms via new statistical and/or computational models.
Provide surveys of prominent research topics.