Four-body atomic potential for modeling protein-ligand binding affinity: application to enzyme-inhibitor binding energy prediction

Q3 Biochemistry, Genetics and Molecular Biology BMC Structural Biology Pub Date : 2013-11-08 DOI:10.1186/1472-6807-13-S1-S1
Majid Masso
Citations: 4

Abstract

Models that reliably predict binding affinities for protein-ligand complexes play an important role in the field of structure-guided drug design.

Here, we begin by applying Delaunay tessellation, a computational geometry technique, to the atomic coordinates of each of over 1400 diverse macromolecular structures, in order to derive a four-body statistical potential that serves as a topological scoring function. Next, we identify a second, independent set of three hundred protein-ligand complexes with both high-resolution structures and known dissociation constants. Two-thirds of these complexes are randomly selected to train a predictive model of binding affinity as follows: two tessellations are generated for each complex, one for the entire complex and another strictly for the isolated protein without its bound ligand, and a topological score is computed for each tessellation with the four-body potential. Predicted protein-ligand binding affinity is then an empirically derived linear function of the difference between the two topological scores, one that appropriately scales this difference.
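The two-tessellation scoring step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: the brute-force empty-circumsphere test stands in for a real tessellation routine (a Qhull-based one such as `scipy.spatial.Delaunay` would be used in practice), the atom-type labels are arbitrary, and the `potential` table holds made-up values rather than the four-body potential derived in the paper.

```python
from itertools import combinations

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def circumsphere(p0, p1, p2, p3):
    """Center and squared radius of the sphere through four points (None if coplanar)."""
    # Equal distance from the center c to p0 and pk gives 2 (pk - p0) . c = |pk|^2 - |p0|^2.
    rows = [[2.0 * (p[k] - p0[k]) for k in range(3)] for p in (p1, p2, p3)]
    rhs = [sum(x * x for x in p) - sum(x * x for x in p0) for p in (p1, p2, p3)]
    d = det3(rows)
    if abs(d) < 1e-12:
        return None
    center = []
    for col in range(3):  # Cramer's rule, one coordinate at a time
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        center.append(det3(m) / d)
    radius_sq = sum((center[k] - p0[k]) ** 2 for k in range(3))
    return center, radius_sq

def delaunay(points):
    """Brute-force Delaunay tessellation: keep tetrahedra whose circumsphere is empty.

    Only suitable for a handful of points in general position (no five cospherical).
    """
    simplices = []
    for quad in combinations(range(len(points)), 4):
        sphere = circumsphere(*(points[i] for i in quad))
        if sphere is None:
            continue
        center, radius_sq = sphere
        others = (j for j in range(len(points)) if j not in quad)
        if all(sum((points[j][k] - center[k]) ** 2 for k in range(3)) >= radius_sq - 1e-9
               for j in others):
            simplices.append(quad)
    return simplices

def topological_score(points, types, potential):
    """Sum the four-body potential over every tetrahedron of the tessellation."""
    total = 0.0
    for quad in delaunay(points):
        key = tuple(sorted(types[i] for i in quad))  # order-independent atom quadruplet
        total += potential.get(key, 0.0)
    return total

# Toy input: four "protein" atoms and one "ligand" atom (coordinates are illustrative).
protein_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
protein_types = ["C", "N", "O", "S"]
ligand_xyz = [(1.0, 1.0, 1.5)]
ligand_types = ["O"]

# Hypothetical potential values, NOT taken from the paper.
potential = {("C", "N", "O", "S"): 0.5, ("N", "O", "O", "S"): -0.25}

score_complex = topological_score(protein_xyz + ligand_xyz, protein_types + ligand_types, potential)
score_protein = topological_score(protein_xyz, protein_types, potential)
score_difference = score_complex - score_protein
print(score_complex, score_protein, score_difference)
```

The quantity passed on to the linear model is `score_difference`; the sign convention (complex minus protein) is a choice made for this sketch, since the abstract only states that the difference between the two scores is used.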

A comparison between experimental and calculated binding affinity values over the two hundred training complexes reveals a Pearson correlation coefficient of r = 0.79 with a standard error of SE = 1.98 kcal/mol. To validate the method, we similarly generate two tessellations for each of the remaining protein-ligand complexes, compute their topological scores and the difference between the two scores for each complex, and apply the previously derived linear transformation of this topological score difference to predict binding affinities. For these one hundred held-out complexes, we again observe a correlation of r = 0.79 (SE = 1.93 kcal/mol) between known and calculated binding affinities. Applying our model to an independent test set of high-resolution structures for three hundred diverse enzyme-inhibitor complexes, each with an experimentally known inhibition constant, also yields a correlation of r = 0.79 (SE = 2.39 kcal/mol) between experimental and calculated binding energies.
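The train-then-validate procedure in this paragraph amounts to fitting a line on the training complexes and reusing its coefficients unchanged on the held-out ones. The sketch below shows one way to do that with the standard library only. It assumes an ordinary least-squares fit and takes SE to mean the standard error of the estimate (residual root mean square with the fitted parameters subtracted from the degrees of freedom); the abstract does not define either, and all numbers are toy values rather than data from the paper.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def standard_error(observed, predicted, fitted_params=2):
    """Root of the residual sum of squares over (n - fitted_params)."""
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(rss / (len(observed) - fitted_params))

# Toy training data: topological score differences and experimental energies (kcal/mol).
train_diff = [-3.0, -1.5, -0.5, 1.0, 2.5, 4.0]
train_dg = [-11.2, -9.1, -8.4, -6.0, -5.1, -2.9]
slope, intercept = fit_line(train_diff, train_dg)
train_pred = [slope * d + intercept for d in train_diff]
print("train r:", pearson_r(train_dg, train_pred))
print("train SE:", standard_error(train_dg, train_pred))

# Toy held-out data: the coefficients are reused as-is, nothing is refitted.
test_diff = [-2.0, 0.0, 3.0]
test_dg = [-9.8, -7.5, -4.0]
test_pred = [slope * d + intercept for d in test_diff]
print("held-out r:", pearson_r(test_dg, test_pred))
print("held-out SE:", standard_error(test_dg, test_pred, fitted_params=0))
```

Passing `fitted_params=0` for the held-out set reflects that no coefficients were estimated from those complexes; whether the paper makes the same distinction is not stated in the abstract.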

Lastly, we generate predictions with our model on a diverse test set of one hundred protein-ligand complexes previously used to benchmark 15 related methods; our correlation of r = 0.66 between the calculated and experimental binding energies for this dataset exceeds those of the other approaches. Compared with these related prediction methods, our approach stands out for the reliability of the model combined with the speed of prediction, which takes less than one second for an average-sized complex.
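The experimental binding energies above are reported in kcal/mol, whereas the measured quantities named in the abstract are dissociation and inhibition constants. The abstract does not say how one was converted to the other; a common convention, assumed here, is ΔG = RT ln K with K in mol/L relative to a 1 M standard state. The temperature of 298.15 K and the function name are likewise assumptions made for this sketch.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)

def binding_energy(constant_molar, temperature_k=298.15):
    """Binding free energy in kcal/mol from a dissociation or inhibition constant.

    Uses dG = R * T * ln(K); tighter binders (smaller K) give more negative values.
    """
    if constant_molar <= 0:
        raise ValueError("the constant must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(constant_molar)

for label, k in [("1 uM", 1e-6), ("1 nM", 1e-9)]:
    print(label, round(binding_energy(k), 2), "kcal/mol")
```

Under this convention each tenfold change in the constant shifts the energy by roughly 1.4 kcal/mol at room temperature, which gives a sense of scale for the standard errors quoted above.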
