Deep Learning-Driven Insights into Enzyme-Substrate Interaction Discovery.

Journal of Chemical Information and Modeling (IF 5.3; Zone 2, Chemistry; JCR Q1, Chemistry, Medicinal). Pub Date: 2024-12-25. DOI: 10.1021/acs.jcim.4c01801

Wenjia Qian, Xiaorui Wang, Yuansheng Huang, Yu Kang, Peichen Pan, Chang-Yu Hsieh, Tingjun Hou



Abstract

Enzymes are ubiquitous catalysts with enormous application potential in biomedicine, green chemistry, and biotechnology. However, accurately predicting whether a molecule serves as a substrate for a specific enzyme, especially for novel entities, remains a significant challenge. Compared with traditional experimental methods, computational approaches save considerable time and resources, but they often sacrifice accuracy. To address this, we introduce the molecule-enzyme interaction (MEI) model, a machine learning framework designed to predict, with high accuracy, the probability that a given molecule is a substrate for a specified enzyme. Using a comprehensive data set that covers enzymatic reactions and enzyme sequences, the MEI model combines atomic environment data with amino acid sequence features through an attention mechanism within a hierarchical neural network. Empirical evaluations show that the MEI model outperforms the current state-of-the-art model by at least 6.7% in prediction accuracy and 8.5% in AUROC. The MEI model also generalizes well across data sets of varying quality and size. This adaptability is further evidenced by its application in diverse areas, such as predicting interactions within the CYP450 enzyme family and achieving an accuracy of 90.5% in predicting the enzymatic breakdown of complex plastics in environmental applications. These examples illustrate the model's ability to transfer knowledge from coarsely annotated enzyme databases to smaller, high-precision data sets, robustly modeling both sparse and high-quality databases. We believe that this versatility establishes the MEI model as a foundational tool in enzyme research, with potential to extend beyond its original scope.
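The abstract reports its gains in terms of prediction accuracy and AUROC. As a reminder of what those two metrics measure for a binary substrate/non-substrate classifier, the following is a minimal pure-Python sketch. The function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions and are not taken from the paper or from the MEI model's code.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of enzyme-molecule pairs whose thresholded score matches the label.

    The 0.5 threshold is an assumption made for this illustration.
    """
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def auroc(labels, scores):
    """Probability that a random positive pair outscores a random negative pair.

    Ties contribute 0.5. This pairwise form is O(P * N) and is meant for
    clarity, not for large data sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration: 1 = substrate, 0 = non-substrate.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]

print(f"accuracy = {accuracy(labels, scores):.3f}")  # 4 of 6 pairs correct
print(f"AUROC    = {auroc(labels, scores):.3f}")     # 7 of 9 pos/neg pairs ranked correctly
```

Accuracy depends on the chosen threshold, whereas AUROC depends only on how the scores rank positives against negatives, which is why the two are usually reported together. The abstract does not state whether the 6.7% and 8.5% figures are absolute percentage points or relative improvements.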


Source journal

Journal of Chemical Information and Modeling (category: Chemistry, Multidisciplinary)

CiteScore: 9.80

Self-citation rate: 10.70%

Articles published: 529

Review time: 1.4 months

Journal description: The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery. Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field. As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.