Metabolic Network Construction Using Ensemble Algorithms

S. Kim, Joohyoung Lee, H. Jang, Xiang Zhang
International Conference on Bio-inspired Information and Communications Technologies
Published: 2016-05-24
DOI: 10.4108/EAI.3-12-2015.2262392 (https://doi.org/10.4108/EAI.3-12-2015.2262392)
Citations: 0

Abstract

One of the most important and challenging "knowledge extraction" tasks in bioinformatics is the reverse engineering of gene, protein, and metabolite networks from biological data. Gaussian graphical models (GGMs) have proven to be a powerful formalism for inferring biological networks. Unfortunately, standard GGM selection techniques cannot be used in the "small N, large P" setting, where the number of samples is much smaller than the number of variables. Various methods have been developed to overcome this issue, based on regularized estimation, partial least squares, and limited-order partial correlation graphs. Several studies have compared the performance of network construction algorithms, such as PLSR, SCE, and ES, ICR and PCR, Ridge regression, Lasso, and adaptive Lasso, to determine which method is best for constructing biological networks. Each comparison concluded that every construction method has its own advantages and disadvantages depending on the circumstances, such as the complexity of the network. However, it is almost impossible to know the complexity of a network before estimating it. We therefore develop an ensemble method, based on model averaging, to construct a metabolic network. Our simulation studies show that network construction based on ensemble averaging achieves a higher F1 score than every other method except adaptive Lasso, reflecting its ability to account for uncertainty about network complexity.
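The abstract does not say how the individual networks are averaged, so the following is only a rough sketch of the general idea, not the authors' method. It assumes each ensemble member produces a matrix of partial correlations (the edge weights of a GGM), that the members' absolute edge scores are averaged with equal weight, and that an edge is kept when the averaged score exceeds a cutoff. The function names (`ridge_pcor`, `ensemble_network`, `f1_score`), the ridge penalties, and the threshold are all hypothetical; the ridge-regularized estimates merely stand in for the PLSR, Lasso, and other estimators named above.

```python
import numpy as np

def partial_correlations(precision):
    """Convert a precision (inverse covariance) matrix into partial correlations."""
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def ridge_pcor(X, lam):
    """Partial correlations from a ridge-regularized covariance matrix.

    Adding lam * I keeps the matrix invertible even when N < P.
    (Assumed stand-in for one ensemble member.)
    """
    S = np.cov(X, rowvar=False)
    precision = np.linalg.inv(S + lam * np.eye(S.shape[0]))
    return partial_correlations(precision)

def ensemble_network(X, lams=(0.01, 0.1, 1.0), threshold=0.1):
    """Average |partial correlation| over the members, then threshold (assumed rule)."""
    scores = np.mean([np.abs(ridge_pcor(X, lam)) for lam in lams], axis=0)
    scores = (scores + scores.T) / 2.0  # remove floating-point asymmetry
    adj = scores > threshold
    np.fill_diagonal(adj, False)  # no self-loops
    return adj

def f1_score(true_adj, est_adj):
    """F1 score over undirected edges (upper triangle only)."""
    true_adj = np.asarray(true_adj, dtype=bool)
    est_adj = np.asarray(est_adj, dtype=bool)
    iu = np.triu_indices_from(true_adj, k=1)
    t, e = true_adj[iu], est_adj[iu]
    tp = np.sum(t & e)
    fp = np.sum(~t & e)
    fn = np.sum(t & ~e)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(10, 20))  # "small N, large P": 10 samples, 20 variables
    adj = ensemble_network(X)
    print("edges kept:", int(adj.sum()) // 2)
```

In a simulation study of the kind the abstract describes, `f1_score` would be evaluated against the known adjacency matrix of the simulated network, once for each single method and once for the ensemble.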