Benchmarking the accuracy of structure-based binding affinity predictors on Spike-ACE2 deep mutational interaction set.

Proteins-Structure Function and Bioinformatics | IF 2.8 | Zone 4 (Biology) | JCR Q2 (Biochemistry & Molecular Biology) | Pub Date: 2024-04-01 | Epub Date: 2023-11-22 | DOI: 10.1002/prot.26645

Burcu Ozden, Eda Şamiloğlu, Atakan Özsan, Mehmet Erguven, Can Yükrük, Mehdi Koşaca, Melis Oktayoğlu, Muratcan Menteş, Nazmiye Arslan, Gökhan Karakülah, Ayşe Berçin Barlas, Büşra Savaş, Ezgi Karaca

Pages: 529-539. Publication type: Journal Article.

Abstract

Since the start of the COVID-19 pandemic, a huge effort has been devoted to understanding the Spike (SARS-CoV-2)-ACE2 recognition mechanism. To this end, two deep mutational scanning studies traced the impact of all possible mutations across the receptor binding domain (RBD) of Spike and the catalytic domain of human ACE2. By concentrating on the interface mutations in these experimental data, we benchmarked six commonly used structure-based binding affinity predictors (FoldX, EvoEF1, MutaBind2, SSIPe, HADDOCK, and UEP). These predictors were selected based on their user-friendliness, accessibility, and speed. Our benchmarking showed that none of the methods could generate a meaningful correlation with the experimental binding data. The best correlation was achieved by FoldX (R = -0.51). When we simplified the prediction problem to a binary classification, that is, whether a mutation enriches or depletes binding, the highest accuracy was again achieved by FoldX, with a 64% success rate. Surprisingly, on this set, simple energetic scoring functions performed significantly better than the ones using extra evolutionary-based terms, as in MutaBind2 and SSIPe. Furthermore, we demonstrated that recent AI approaches, mmCSM-PPI and TopNetTree, yielded performances comparable to those of the force field-based techniques. These observations suggest plenty of room to improve binding affinity predictors in predicting the variant-induced binding profile changes of a host-pathogen system such as Spike-ACE2. To aid such improvements, we provide our benchmarking data at https://github.com/CSB-KaracaLab/RBD-ACE2-MutBench, with the option to visualize our mutant models at https://rbd-ace2-mutbench.github.io/.
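The two metrics quoted in the abstract, a correlation coefficient R and a binary success rate, can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names, the toy numbers, the choice of Pearson correlation, and the sign conventions are all assumptions made for illustration: a negative predicted score is taken to mean improved binding, a positive experimental score is taken to mean enrichment, and zero is used as the cut-off for both.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def binary_accuracy(predicted, experimental):
    """Fraction of mutations whose predicted direction matches experiment.

    Assumed conventions (hypothetical, not taken from the paper):
    predicted < 0 means the mutation is predicted to enrich binding,
    experimental > 0 means the mutation was measured as enriching.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum((p < 0) == (e > 0) for p, e in zip(predicted, experimental))
    return hits / len(predicted)


# Toy values for four hypothetical interface mutations (not real data).
toy_predicted = [-1.0, 0.5, 2.0, -0.3]
toy_experimental = [0.8, -0.4, -1.5, -0.2]

toy_r = pearson_r(toy_predicted, toy_experimental)
toy_accuracy = binary_accuracy(toy_predicted, toy_experimental)  # 0.75
```

Under these assumed conventions a good predictor gives a negative R, since lower predicted scores should pair with higher experimental enrichment. That would be consistent with the negative sign of the reported FoldX value, although the abstract does not state the conventions used.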

Source journal: Proteins-Structure Function and Bioinformatics (Biology: Biochemistry and Molecular Biology)
CiteScore: 5.90
Self-citation rate: 3.40%
Articles published: 172
Review time: 3 months

Journal description: PROTEINS: Structure, Function, and Bioinformatics publishes original reports of significant experimental and analytic research in all areas of protein research: structure, function, computation, genetics, and design. The journal encourages reports that present new experimental or computational approaches for interpreting and understanding data from biophysical chemistry, structural studies of proteins and macromolecular assemblies, alterations of protein structure and function engineered through techniques of molecular biology and genetics, functional analyses under physiologic conditions, as well as the interactions of proteins with receptors, nucleic acids, or other specific ligands or substrates. Research in protein and peptide biochemistry directed toward synthesizing or characterizing molecules that simulate aspects of the activity of proteins, or that act as inhibitors of protein function, is also within the scope of PROTEINS. In addition to full-length reports, short communications (usually not more than 4 printed pages) and prediction reports are welcome. Reviews are typically by invitation; authors are encouraged to submit proposed topics for consideration.