Shortest Hyperpaths in Directed Hypergraphs for Reaction Pathway Inference.

Journal of Computational Biology (IF 1.4; Zone 4, Biology; JCR Q4, Biochemical Research Methods). Pub Date: 2023-11-01. Epub Date: 2023-10-31. DOI: 10.1089/cmb.2023.0242
Spencer Krieger, John Kececioglu

Abstract

Signaling and metabolic pathways, which consist of chains of reactions that produce target molecules from source compounds, are cornerstones of cellular biology. Properly modeling the reaction networks that represent such pathways requires directed hypergraphs, where each molecule or compound maps to a vertex, and each reaction maps to a hyperedge directed from its set of input reactants to its set of output products. Inferring the most likely series of reactions that produces a given set of targets from a given set of sources, where for each reaction its reactants are produced by prior reactions in the series, corresponds to finding a shortest hyperpath in a directed hypergraph, which is NP-complete. We give the first exact algorithm for general shortest hyperpaths that can find provably optimal solutions for large, real-world reaction networks. In particular, we derive a novel graph-theoretic characterization of hyperpaths, which we leverage in a new integer linear programming formulation of shortest hyperpaths that for the first time handles cycles, and develop a cutting-plane algorithm that can solve this integer linear program to optimality in practice. Through comprehensive experiments over all of the thousands of instances from the standard Reactome and NCI-PID reaction databases, we demonstrate that our cutting-plane algorithm quickly finds an optimal hyperpath, thereby inferring the most likely pathway, with a median running time of under 10 seconds and a maximum time of less than 30 minutes, even on instances with thousands of reactions. We also explore for the first time how well hyperpaths infer true pathways, and show that shortest hyperpaths accurately recover known pathways, typically with very high precision and recall. Source code implementing our cutting-plane algorithm for shortest hyperpaths is available free for research use in a new tool called Mmunin.
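
To make the modeling in the abstract concrete, the following is a minimal Python sketch of a directed hypergraph in which each reaction is a hyperedge from a set of reactants (tail) to a set of products (head). It only checks whether an ordered series of reactions is feasible in the sense described above (every reactant is a source or was produced by an earlier reaction) and computes which compounds are producible at all. It does not find a shortest hyperpath and is not the cutting-plane algorithm or the Mmunin implementation. The names Hyperedge, is_feasible_series, reachable, and the toy reactions r1, r2, r3 are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperedge:
    """A reaction: consumes every vertex in `tail`, produces every vertex in `head`."""
    name: str
    tail: frozenset  # input reactants
    head: frozenset  # output products
    weight: float = 1.0  # hypothetical cost; unused by the checks below


def is_feasible_series(ordered_edges, sources, targets):
    """Return True if each reaction's reactants are sources or products of
    earlier reactions in the series, and all targets end up produced."""
    available = set(sources)
    for edge in ordered_edges:
        if not edge.tail <= available:
            return False  # some reactant has not been produced yet
        available |= edge.head
    return set(targets) <= available


def reachable(edges, sources):
    """Forward closure: all vertices producible from the sources."""
    available = set(sources)
    changed = True
    while changed:
        changed = False
        for edge in edges:
            if edge.tail <= available and not edge.head <= available:
                available |= edge.head
                changed = True
    return available


# Toy network (hypothetical): r3 closes a cycle back to A but needs E,
# which nothing produces, so it can never fire.
r1 = Hyperedge("r1", frozenset({"A", "B"}), frozenset({"C"}))
r2 = Hyperedge("r2", frozenset({"C"}), frozenset({"D"}))
r3 = Hyperedge("r3", frozenset({"D", "E"}), frozenset({"A"}))

print(is_feasible_series([r1, r2], {"A", "B"}, {"D"}))  # r1 then r2
print(is_feasible_series([r2, r1], {"A", "B"}, {"D"}))  # r2 before its reactant exists
print(sorted(reachable([r1, r2, r3], {"A", "B"})))
```

Both checks run in polynomial time. The hard part, which this sketch deliberately leaves out, is choosing a minimum-weight set of reactions that forms such a series, which is the NP-complete shortest hyperpath problem the abstract refers to.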
