{"title":"利用高斯图模型和蒙特卡罗方法从全基因组基因表达数据中学习遗传网络的集成方法。","authors":"Haitao Zhao, Sujay Datta, Zhong-Hui Duan","doi":"10.1177/11779322231152972","DOIUrl":null,"url":null,"abstract":"<p><p>Global genetic networks provide additional information for the analysis of human diseases, beyond the traditional analysis that focuses on single genes or local networks. The Gaussian graphical model (GGM) is widely applied to learn genetic networks because it defines an undirected graph decoding the conditional dependence between genes. Many algorithms based on the GGM have been proposed for learning genetic network structures. Because the number of gene variables is typically far more than the number of samples collected, and a real genetic network is typically sparse, the graphical lasso implementation of GGM becomes a popular tool for inferring the conditional interdependence among genes. However, graphical lasso, although showing good performance in low dimensional data sets, is computationally expensive and inefficient or even unable to work directly on genome-wide gene expression data sets. In this study, the method of Monte Carlo Gaussian graphical model (MCGGM) was proposed to learn global genetic networks of genes. This method uses a Monte Carlo approach to sample subnetworks from genome-wide gene expression data and graphical lasso to learn the structures of the subnetworks. The learned subnetworks are then integrated to approximate a global genetic network. The proposed method was evaluated with a relatively small real data set of RNA-seq expression levels. The results indicate the proposed method shows a strong ability of decoding the interactions with high conditional dependences among genes. The method was then applied to genome-wide data sets of RNA-seq expression levels. The gene interactions with high interdependence from the estimated global networks show that most of the predicted gene-gene interactions have been reported in the literatures playing important roles in different human cancers. Also, the results validate the ability and reliability of the proposed method to identify high conditional dependences among genes in large-scale data sets.</p>","PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights","volume":null,"pages":null},"PeriodicalIF":2.3000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/4e/ca/10.1177_11779322231152972.PMC9972065.pdf","citationCount":"1","resultStr":"{\"title\":\"An Integrated Approach of Learning Genetic Networks From Genome-Wide Gene Expression Data Using Gaussian Graphical Model and Monte Carlo Method.\",\"authors\":\"Haitao Zhao, Sujay Datta, Zhong-Hui Duan\",\"doi\":\"10.1177/11779322231152972\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Global genetic networks provide additional information for the analysis of human diseases, beyond the traditional analysis that focuses on single genes or local networks. The Gaussian graphical model (GGM) is widely applied to learn genetic networks because it defines an undirected graph decoding the conditional dependence between genes. Many algorithms based on the GGM have been proposed for learning genetic network structures. Because the number of gene variables is typically far more than the number of samples collected, and a real genetic network is typically sparse, the graphical lasso implementation of GGM becomes a popular tool for inferring the conditional interdependence among genes. However, graphical lasso, although showing good performance in low dimensional data sets, is computationally expensive and inefficient or even unable to work directly on genome-wide gene expression data sets. In this study, the method of Monte Carlo Gaussian graphical model (MCGGM) was proposed to learn global genetic networks of genes. This method uses a Monte Carlo approach to sample subnetworks from genome-wide gene expression data and graphical lasso to learn the structures of the subnetworks. The learned subnetworks are then integrated to approximate a global genetic network. The proposed method was evaluated with a relatively small real data set of RNA-seq expression levels. The results indicate the proposed method shows a strong ability of decoding the interactions with high conditional dependences among genes. The method was then applied to genome-wide data sets of RNA-seq expression levels. The gene interactions with high interdependence from the estimated global networks show that most of the predicted gene-gene interactions have been reported in the literatures playing important roles in different human cancers. Also, the results validate the ability and reliability of the proposed method to identify high conditional dependences among genes in large-scale data sets.</p>\",\"PeriodicalId\":9065,\"journal\":{\"name\":\"Bioinformatics and Biology Insights\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/4e/ca/10.1177_11779322231152972.PMC9972065.pdf\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Bioinformatics and Biology Insights\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1177/11779322231152972\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bioinformatics and Biology Insights","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1177/11779322231152972","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
An Integrated Approach of Learning Genetic Networks From Genome-Wide Gene Expression Data Using Gaussian Graphical Model and Monte Carlo Method.
Global genetic networks provide additional information for the analysis of human diseases, beyond the traditional analysis that focuses on single genes or local networks. The Gaussian graphical model (GGM) is widely applied to learn genetic networks because it defines an undirected graph decoding the conditional dependence between genes. Many algorithms based on the GGM have been proposed for learning genetic network structures. Because the number of gene variables is typically far more than the number of samples collected, and a real genetic network is typically sparse, the graphical lasso implementation of GGM becomes a popular tool for inferring the conditional interdependence among genes. However, graphical lasso, although showing good performance in low dimensional data sets, is computationally expensive and inefficient or even unable to work directly on genome-wide gene expression data sets. In this study, the method of Monte Carlo Gaussian graphical model (MCGGM) was proposed to learn global genetic networks of genes. This method uses a Monte Carlo approach to sample subnetworks from genome-wide gene expression data and graphical lasso to learn the structures of the subnetworks. The learned subnetworks are then integrated to approximate a global genetic network. The proposed method was evaluated with a relatively small real data set of RNA-seq expression levels. The results indicate the proposed method shows a strong ability of decoding the interactions with high conditional dependences among genes. The method was then applied to genome-wide data sets of RNA-seq expression levels. The gene interactions with high interdependence from the estimated global networks show that most of the predicted gene-gene interactions have been reported in the literatures playing important roles in different human cancers. Also, the results validate the ability and reliability of the proposed method to identify high conditional dependences among genes in large-scale data sets.
期刊介绍:
Bioinformatics and Biology Insights is an open access, peer-reviewed journal that considers articles on bioinformatics methods and their applications which must pertain to biological insights. All papers should be easily amenable to biologists and as such help bridge the gap between theories and applications.