Simultaneous clustering and estimation of networks in multiple graphical models.
Authors: Gen Li, Miaoyan Wang
Journal: Biostatistics (impact factor 1.8; JCR Q3, Mathematical & Computational Biology)
Publication date: 2024-12-31
Publication type: Journal Article
DOI: https://doi.org/10.1093/biostatistics/kxae015
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11826093/pdf/
Citations: 0
Abstract
Gaussian graphical models are widely used to study the dependence structure among variables. When samples are obtained from multiple conditions or populations, joint analysis of multiple graphical models is desirable because it can borrow strength across populations. Nonetheless, existing methods often overlook the varying levels of similarity between populations, leading to unsatisfactory results. Moreover, in many applications, learning the population-level clustering structure is itself of particular interest. In this article, we develop a novel method, called Simultaneous Clustering and Estimation of Networks via Tensor decomposition (SCENT), that simultaneously clusters and estimates graphical models from multiple populations. Precision matrices from different populations are uniquely organized as a three-way tensor array, and a low-rank sparse model is proposed for joint population clustering and network estimation. We develop a penalized likelihood method and an augmented Lagrangian algorithm for model fitting. We also establish the clustering accuracy and norm consistency of the estimated precision matrices. We demonstrate the efficacy of the proposed method with comprehensive simulation studies. The application to the Genotype-Tissue Expression multi-tissue gene expression data provides important insights into tissue clustering and gene coexpression patterns in multiple brain tissues.
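The idea of stacking per-population precision matrices into a three-way array and reading a clustering off its low-rank structure can be illustrated with a small simulation. The sketch below is not the SCENT algorithm: in place of the penalized likelihood and augmented Lagrangian fitting described in the abstract, it assumes a naive inverse sample covariance for each population, a plain SVD of the unfolded tensor, and soft-thresholding for sparsity. All names (`banded_precision`, `soft_threshold_offdiag`, `lam`) and all simulation settings are hypothetical choices made for illustration.

```python
import numpy as np

def banded_precision(p, offset, value=0.4):
    """Unit-diagonal precision matrix with `value` on the +/- `offset` diagonals."""
    omega = np.eye(p)
    idx = np.arange(p - offset)
    omega[idx, idx + offset] = value
    omega[idx + offset, idx] = value
    return omega

def soft_threshold_offdiag(omega, lam):
    """Shrink off-diagonal entries toward zero; leave the diagonal untouched."""
    shrunk = np.sign(omega) * np.maximum(np.abs(omega) - lam, 0.0)
    np.fill_diagonal(shrunk, np.diag(omega))
    return shrunk

rng = np.random.default_rng(0)
p, n = 6, 500                      # variables and samples per population (assumed)
truth = [banded_precision(p, 1), banded_precision(p, 2)]  # two cluster-level networks
true_labels = np.array([0, 0, 0, 1, 1, 1])                # six simulated populations

# One naive precision estimate per population (reasonable here because n >> p).
estimates = []
for k in true_labels:
    cov = np.linalg.inv(truth[k])
    cov = (cov + cov.T) / 2        # remove floating-point asymmetry
    x = rng.multivariate_normal(np.zeros(p), cov, size=n)
    estimates.append(np.linalg.inv(np.cov(x, rowvar=False)))

# Stack into a three-way array of shape (K, p, p), then unfold along the
# population mode so that each row is one vectorized precision matrix.
tensor = np.stack(estimates)
unfolded = tensor.reshape(len(estimates), -1)
centered = unfolded - unfolded.mean(axis=0)

# The leading left singular vector separates the two groups of populations;
# splitting by sign yields cluster labels (defined up to label switching).
u, s, vt = np.linalg.svd(centered, full_matrices=False)
labels = (u[:, 0] > 0).astype(int)

# Borrow strength within each cluster by averaging, then sparsify.
pooled = {
    int(c): soft_threshold_offdiag(tensor[labels == c].mean(axis=0), lam=0.1)
    for c in np.unique(labels)
}

print("estimated labels:", labels)
```

Splitting on the sign of one singular vector only handles two clusters; with more clusters one would typically run k-means on several leading singular vectors instead. Averaging within clusters is the simplest way to pool information and stands in for the joint low-rank sparse model that the abstract describes.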
About the journal:
Among the important scientific developments of the 20th century is the explosive growth in statistical reasoning and methods for application to studies of human health. Examples include developments in likelihood methods for inference, epidemiologic statistics, clinical trials, survival analysis, and statistical genetics. Substantive problems in public health and biomedical research have fueled the development of statistical methods, which in turn have improved our ability to draw valid inferences from data. The objective of Biostatistics is to advance statistical science and its application to problems of human health and disease, with the ultimate goal of advancing the public's health.