Machine learning for the advancement of genome-scale metabolic modeling
Pritam Kundu, Satyajit Beura, Suman Mondal, Amit Kumar Das, Amit Ghosh
Biotechnology Advances, Volume 74, Article 108400. Published 2024-06-27.
DOI: 10.1016/j.biotechadv.2024.108400
URL: https://www.sciencedirect.com/science/article/pii/S0734975024000946
Journal impact factor: 12.1 (JCR Q1, Biotechnology & Applied Microbiology)
Citations: 0
Abstract
Constraint-based modeling (CBM) has evolved into a core systems biology tool for mapping the interrelations between genotype, phenotype, and external environment. Recent advances in high-throughput experimental approaches and multi-omics strategies have generated a wealth of new and precise information from wide-ranging biological domains. Meanwhile, the continuously growing field of machine learning (ML) and its specialized branch, deep learning (DL), provide essential computational architectures for decoding complex and heterogeneous biological data. In recent years, both multi-omics and ML have helped advance CBM. Condition-specific omics data, such as transcriptomics and proteomics, help contextualize model predictions when analyzing a particular phenotypic signature. At the same time, advanced ML tools have eased model reconstruction and analysis, increasing accuracy and predictive power. However, these multi-disciplinary methodological frameworks have mostly been developed independently, which limits the integration of biological knowledge from different domains. Hence, we review the potential of integrating multi-disciplinary tools and strategies from fields such as synthetic biology, CBM, omics, and ML to explore biochemical phenomena beyond the conventional biological dogma. We also highlight how integrative knowledge from these intersecting domains has improved bioengineering and biomedical applications. We categorize and explain the conventional genome-scale metabolic model (GEM) reconstruction tools and the strategies for improving them through ML paradigms. Further, we briefly discuss the crucial role of ML and DL in restructuring omics data for GEM development. Finally, we elaborate a case-study-based assessment of state-of-the-art methods for improving biomedical and metabolic engineering strategies. This review therefore demonstrates how integrating experimental and in silico strategies can help map the ever-expanding knowledge of biological systems driven by condition-specific cellular information. This multiview approach will elevate the application of ML-based CBM in the biomedical and bioengineering fields for the betterment of society and the environment.
About the journal:
Biotechnology Advances is a comprehensive review journal that covers all aspects of the multidisciplinary field of biotechnology. The journal focuses on biotechnology principles and their applications in various industries, agriculture, medicine, environmental concerns, and regulatory issues. It publishes authoritative articles that highlight current developments and future trends in the field of biotechnology. The journal invites submissions of manuscripts that are relevant and appropriate. It targets a wide audience, including scientists, engineers, students, instructors, researchers, practitioners, managers, governments, and other stakeholders in the field. Additionally, special issues are published based on selected presentations from recent relevant conferences in collaboration with the organizations hosting those conferences.