A 4D tensor-enhanced multi-dimensional convolutional neural network for accurate prediction of protein-ligand binding affinity.

IF 3.9; Zone 2, Chemistry; JCR Q2, CHEMISTRY, APPLIED; Molecular Diversity; Pub Date: 2024-12-23; DOI: 10.1007/s11030-024-11044-y
Dingfang Huang, Yu Wang, Yiming Sun, Wenhao Ji, Qing Zhang, Yunya Jiang, Haodi Qiu, Haichun Liu, Tao Lu, Xian Wei, Yadong Chen, Yanmin Zhang
Citations: 0

Abstract

Protein-ligand interactions are the molecular basis of many important cellular activities, such as gene regulation, cell metabolism, and signal transduction. Protein-ligand binding affinity is a crucial measure of the strength of that interaction, and predicting it accurately is essential for discovering new uses for drugs. Although many predictive models based on machine learning and deep learning have been reported, most focus on the one-dimensional sequence and two-dimensional structural characteristics of proteins and ligands, and fail to explore in depth the detailed interactions between protein and ligand atoms in the three-dimensional binding pocket. In this study, we introduced a novel 4D tensor feature to capture key interactions within the binding pocket and developed a three-dimensional convolutional neural network (CNN) model based on this feature. Using ten-fold cross-validation, we identified the optimal parameter combination and pocket size. Additionally, we employed feature engineering to extract features across multiple dimensions, including one-dimensional sequences, two-dimensional structures of the ligand and protein, and three-dimensional interaction features between them. We proposed an efficient protein-ligand binding affinity prediction model, MCDTA (multi-dimensional convolutional drug-target affinity), built on a multi-dimensional convolutional neural network framework. Feature ablation experiments revealed that the 4D tensor feature had the most significant impact on model performance. MCDTA performed exceptionally well on the PDBbind v.2020 dataset, achieving a root-mean-square error (RMSE) of 1.231 and a Pearson correlation coefficient (PCC) of 0.823. In comparative experiments, it outperformed five other mainstream binding affinity prediction models, with an RMSE of 1.349 and a PCC of 0.795. Moreover, MCDTA demonstrated strong generalization ability and practical screening performance across multiple benchmark datasets, highlighting its reliability and accuracy in predicting protein-ligand binding affinity. The code for MCDTA is available at https://github.com/dfhuang-AI/MCDTA.
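
As an illustration only, the sketch below shows one common way a binding pocket can be turned into a 4D tensor (feature channel plus three spatial axes) and how the two reported metrics, RMSE and PCC, are computed. It is not the authors' implementation, which is in their repository. The function names `voxelize_pocket`, `rmse`, and `pearson`, the two-channel protein/ligand layout, and the box size and resolution are all assumptions made for this example.

```python
import math


def voxelize_pocket(atoms, center, box_size=20.0, resolution=1.0,
                    channels=("protein", "ligand")):
    """Count atoms per voxel, returned as tensor[channel][x][y][z].

    atoms: iterable of (x, y, z, channel_name), coordinates in angstroms.
    The channel layout here is hypothetical, not taken from the paper.
    """
    n = int(round(box_size / resolution))
    half = box_size / 2.0
    index = {name: i for i, name in enumerate(channels)}
    tensor = [[[[0.0] * n for _ in range(n)] for _ in range(n)]
              for _ in channels]
    for x, y, z, name in atoms:
        # Shift into box coordinates so the pocket center sits mid-grid.
        ijk = [int(math.floor((c - c0 + half) / resolution))
               for c, c0 in zip((x, y, z), center)]
        if all(0 <= v < n for v in ijk):  # drop atoms outside the box
            i, j, k = ijk
            tensor[index[name]][i][j][k] += 1.0
    return tensor


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted affinities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
                     / len(y_true))


def pearson(y_true, y_pred):
    """Pearson correlation coefficient (PCC) of the two sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)
```

A tensor of this shape is what a 3D convolution consumes: the first axis plays the role of input channels and the kernel slides over the three spatial axes. Lower RMSE and higher PCC both indicate better agreement between predicted and measured affinities.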

Source journal
Molecular Diversity (Chemistry: Multidisciplinary)
CiteScore: 7.30
Self-citation rate: 7.90%
Articles published: 219
Review time: 2.7 months
About the journal: Molecular Diversity is a new publication forum for the rapid publication of refereed papers dedicated to describing the development, application and theory of molecular diversity and combinatorial chemistry in basic and applied research and drug discovery. The journal publishes both short and full papers, perspectives, news and reviews dealing with all aspects of the generation of molecular diversity, application of diversity for screening against alternative targets of all types (biological, biophysical, technological), analysis of results obtained and their application in various scientific disciplines/approaches including: combinatorial chemistry and parallel synthesis; small molecule libraries; microwave synthesis; flow synthesis; fluorous synthesis; diversity oriented synthesis (DOS); nanoreactors; click chemistry; multiplex technologies; fragment- and ligand-based design; structure/function/SAR; computational chemistry and molecular design; chemoinformatics; screening techniques and screening interfaces; analytical and purification methods; robotics, automation and miniaturization; targeted libraries; display libraries; peptides and peptoids; proteins; oligonucleotides; carbohydrates; natural diversity; new methods of library formulation and deconvolution; directed evolution, origin of life and recombination; search techniques, landscapes, random chemistry and more.