An Integrated Approach of Learning Genetic Networks From Genome-Wide Gene Expression Data Using Gaussian Graphical Model and Monte Carlo Method.

Bioinformatics and Biology Insights (IF 2.3, JCR Q3, Biochemical Research Methods) Pub Date: 2023-01-01 DOI: 10.1177/11779322231152972
Haitao Zhao, Sujay Datta, Zhong-Hui Duan
Citations: 1

Abstract

Global genetic networks provide information for the analysis of human diseases beyond what traditional analyses of single genes or local networks can offer. The Gaussian graphical model (GGM) is widely used to learn genetic networks because it defines an undirected graph that represents the conditional dependence between genes. Many GGM-based algorithms have been proposed for learning genetic network structures. Because the number of gene variables is typically far larger than the number of samples collected, and real genetic networks are typically sparse, the graphical lasso implementation of the GGM has become a popular tool for inferring conditional interdependence among genes. However, although graphical lasso performs well on low-dimensional data sets, it is computationally expensive and inefficient on genome-wide gene expression data sets, and may be unable to run on them directly. In this study, the Monte Carlo Gaussian graphical model (MCGGM) method is proposed for learning global genetic networks. The method uses a Monte Carlo approach to sample subnetworks from genome-wide gene expression data and graphical lasso to learn the structure of each subnetwork. The learned subnetworks are then integrated to approximate a global genetic network. The method was first evaluated on a relatively small real data set of RNA-seq expression levels; the results indicate that it is effective at recovering interactions with high conditional dependence among genes. The method was then applied to genome-wide data sets of RNA-seq expression levels. Most of the predicted gene-gene interactions with high interdependence in the estimated global networks have been reported in the literature as playing important roles in different human cancers. These results also support the ability and reliability of the proposed method to identify high conditional dependences among genes in large-scale data sets.
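The sample-learn-integrate idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how subsets are drawn or how subnetworks are combined, so the uniform random sampling, the averaging of absolute partial correlations, and every name and default below (`mcggm_sketch`, `subset_size`, `n_draws`, `alpha`) are assumptions made for illustration. The sketch uses scikit-learn's `GraphicalLasso` as the per-subnetwork estimator.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso


def mcggm_sketch(X, subset_size=10, n_draws=200, alpha=0.2, seed=0):
    """Average graphical-lasso partial correlations over random gene subsets.

    X is a (samples x genes) expression matrix. All parameter names and
    defaults are illustrative assumptions, not the paper's settings.
    """
    rng = np.random.default_rng(seed)
    n_genes = X.shape[1]
    # Standardise each gene so a single penalty `alpha` is comparable across genes.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    weight_sum = np.zeros((n_genes, n_genes))  # summed |partial correlation|
    pair_count = np.zeros((n_genes, n_genes))  # times each pair was co-sampled

    for _ in range(n_draws):
        # Monte Carlo step: draw a random subset of genes (one "subnetwork").
        idx = rng.choice(n_genes, size=subset_size, replace=False)
        P = GraphicalLasso(alpha=alpha, max_iter=200).fit(Z[:, idx]).precision_
        # Partial correlation from the precision matrix:
        # rho_ij = -P_ij / sqrt(P_ii * P_jj).
        d = np.sqrt(np.diag(P))
        rho = -P / np.outer(d, d)
        np.fill_diagonal(rho, 0.0)
        weight_sum[np.ix_(idx, idx)] += np.abs(rho)
        pair_count[np.ix_(idx, idx)] += 1

    # Integration step: average each edge over the draws that contained it;
    # pairs that were never co-sampled keep a weight of zero.
    return np.divide(weight_sum, pair_count,
                     out=np.zeros_like(weight_sum), where=pair_count > 0)


# Synthetic usage: 60 samples, 30 genes, with gene 1 built to depend on gene 0.
demo_rng = np.random.default_rng(1)
X = demo_rng.normal(size=(60, 30))
X[:, 1] = X[:, 0] + 0.3 * demo_rng.normal(size=60)
W = mcggm_sketch(X)
i, j = np.unravel_index(np.argmax(W), W.shape)
print("strongest estimated edge:", sorted((int(i), int(j))))
```

Each subnetwork fit involves only `subset_size` genes, which is what keeps the per-draw cost low compared with running graphical lasso on all genes at once. One caveat of this simplified version is that a partial correlation estimated within a subset conditions only on the genes in that subset, so the averaged weights approximate, rather than equal, the genome-wide conditional dependences.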

Source journal
Bioinformatics and Biology Insights (Biochemical Research Methods)
CiteScore: 6.80
Self-citation rate: 1.70%
Articles published: 36
Review time: 8 weeks
Journal description: Bioinformatics and Biology Insights is an open access, peer-reviewed journal that considers articles on bioinformatics methods and their applications which must pertain to biological insights. All papers should be easily amenable to biologists and as such help bridge the gap between theories and applications.
Latest articles from this journal
RhizoBindingSites v2.0 Is a Bioinformatic Database of DNA Motifs Potentially Involved in Transcriptional Regulation Deduced From Their Genomic Sites.
Marine-Derived Furanones Targeting Quorum-Sensing Receptors in Pseudomonas aeruginosa: Molecular Insights and Potential Mechanisms of Inhibition.
Comparing Deep Learning Performance for Chronic Lymphocytic Leukaemia Cell Segmentation in Brightfield Microscopy Images.
Identification of Hub of the Hub-Genes From Different Individual Studies for Early Diagnosis, Prognosis, and Therapies of Breast Cancer.
Proteomics Exploration of Brucella melitensis to Design an Innovative Multi-Epitope mRNA Vaccine.