AI-based automated construction of high-precision Geobacillus thermoglucosidasius enzyme constraint model
Authors: Minghao Zhang, Haijiao Shi, Xiaohong Wang, Yanan Zhu, Zilong Li, Linna Tu, Yu Zheng, Menglei Xia, Weishan Wang, Min Wang
Journal: Metabolic Engineering (Impact Factor 6.8; JCR Q1, Biotechnology & Applied Microbiology)
Published: 2024-10-18
DOI: 10.1016/j.ymben.2024.10.006 (https://doi.org/10.1016/j.ymben.2024.10.006)
Citation count: 0
Abstract
Geobacillus thermoglucosidasius NCIMB 11955 offers several advantages, including high-temperature tolerance, a rapid growth rate, and low contamination risk. Efficient gene editing tools are also available for it, making it one of the most promising next-generation cell factories. However, because it is a non-model microorganism, the lack of metabolic information significantly hampers the construction of high-precision metabolic flux models. Here, we propose a BioIntelliModel (BIM) strategy based on artificial intelligence technology for the automated construction of enzyme-constrained models. (1) BIM uses the Contrastive Learning Enabled Enzyme Annotation (CLEAN) prediction tool to analyse the entire genome sequence of G. thermoglucosidasius NCIMB 11955, uncovering potential functional proteins in non-model strains. (2) The MetaPatchM module of BIM automates the repair of the metabolic network model. (3) The Tianjin University of Science and Technology-kcat (TUST-kcat) module predicts the kcat values of enzymes within the model. (4) The Enzyme-insert procedure constructs an enzyme-constrained model and performs a global scan to address overconstraint issues. Enzymatic data were automatically integrated into the metabolic flux model, creating an enzyme-constrained model, ec_G-ther11955. To validate model accuracy, we used both the p-thermo and ec_G-ther11955 models to predict riboflavin production strategies; the ec_G-ther11955 model demonstrated significantly higher accuracy. To further verify its efficacy, we employed ec_G-ther11955 to guide the rational design of L-valine-producing strains. Using the Optimisation Procedure for Identifying All Genetic Manipulations Leading to Targeted Overproductions (OptForce), Predictive Knockout Targeting (PKT), and Flux Scanning based on Enforced Objective Flux (FSEOF) algorithms, we identified 24 knockout and overexpression targets, achieving an accuracy rate of 87.5%. Ultimately, this led to an increase of 664.04% in L-valine titre.
This study provides a novel strategy for rapidly constructing non-model strain models and demonstrates the tremendous potential of artificial intelligence in metabolic engineering.
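The idea behind an enzyme-constrained model can be illustrated with a minimal sketch. It assumes the commonly used formulation in which carrying a flux v through an enzyme costs enzyme mass v × MW / kcat, and the total enzyme mass is capped by a budget. The three-step linear pathway, all numerical values, and the function names are hypothetical; this is not the BIM pipeline, the TUST-kcat predictor, or the Enzyme-insert procedure described in the abstract.

```python
# Toy enzyme capacity constraint on a linear pathway.
# Every value below is an illustrative assumption, not data from the paper.

def enzyme_cost_per_flux(mw_kda, kcat_per_s):
    """Enzyme mass (g/gDW) needed to carry 1 mmol/gDW/h of flux.

    1 kDa equals 1 g/mmol, so mass = flux * MW / kcat with kcat in 1/h.
    """
    kcat_per_h = kcat_per_s * 3600.0
    return mw_kda / kcat_per_h


def max_pathway_flux(steps, enzyme_budget, uptake_bound):
    """Largest flux a linear pathway can carry under both limits.

    In a linear pathway every step carries the same flux v, so the
    enzyme constraint reduces to v * sum(MW_i / kcat_i) <= budget.
    """
    total_cost = sum(enzyme_cost_per_flux(mw, kcat) for _, mw, kcat in steps)
    return min(uptake_bound, enzyme_budget / total_cost)


def costliest_step(steps):
    """Name of the step that consumes the most enzyme mass per unit flux."""
    return max(steps, key=lambda s: enzyme_cost_per_flux(s[1], s[2]))[0]


# (name, molecular weight in kDa, kcat in 1/s): hypothetical enzymes
steps = [
    ("step1", 40.0, 100.0),
    ("step2", 60.0, 10.0),
    ("step3", 50.0, 50.0),
]

enzyme_budget = 0.01   # g enzyme per gDW allotted to this pathway (assumed)
uptake_bound = 10.0    # mmol/gDW/h substrate uptake limit (assumed)

max_flux = max_pathway_flux(steps, enzyme_budget, uptake_bound)
bottleneck = costliest_step(steps)

print(f"enzyme-limited flux: {max_flux:.3f} mmol/gDW/h")
print(f"costliest step: {bottleneck}")
```

In this toy case the enzyme budget, not the uptake bound, limits the flux, and the slow enzyme (low kcat) dominates the cost. An underestimated kcat would therefore overconstrain the model, which is the kind of problem a scan for overconstraint is meant to catch.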
About the journal:
Metabolic Engineering (MBE) is a journal that focuses on publishing original research papers on the directed modulation of metabolic pathways for metabolite overproduction or the enhancement of cellular properties. It welcomes papers that describe the engineering of native pathways and the synthesis of heterologous pathways to convert microorganisms into microbial cell factories. The journal covers experimental, computational, and modeling approaches for understanding metabolic pathways and manipulating them through genetic, media, or environmental means. Effective exploration of metabolic pathways necessitates the use of molecular biology and biochemistry methods, as well as engineering techniques for modeling and data analysis. MBE serves as a platform for interdisciplinary research in fields such as biochemistry, molecular biology, applied microbiology, cellular physiology, cellular nutrition in health and disease, and biochemical engineering. The journal publishes various types of papers, including original research papers and review papers. It is indexed and abstracted in databases such as Scopus, Embase, EMBiology, Current Contents - Life Sciences and Clinical Medicine, Science Citation Index, PubMed/Medline, CAS and Biotechnology Citation Index.