Best practices for constructing, preparing, and evaluating protein-ligand binding affinity benchmarks [Article v0.1].

Living journal of computational molecular science. Pub Date: 2022-01-01. Epub Date: 2022-08-30. DOI: 10.33011/livecoms.4.1.1497

David F Hahn, Christopher I Bayly, Hannah E Bruce Macdonald, John D Chodera, Antonia S J S Mey, David L Mobley, Laura Perez Benito, Christina E M Schindler, Gary Tresadern, Gregory L Warren

Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9662604/pdf/nihms-1700409.pdf


Abstract




Free energy calculations are rapidly becoming indispensable in structure-enabled drug discovery programs. As new methods, force fields, and implementations are developed, assessing their expected accuracy on real-world systems (benchmarking) becomes critical to provide users with an assessment of the accuracy expected when these methods are applied within their domain of applicability, and developers with a way to assess the expected impact of new methodologies. These assessments require construction of a benchmark: a set of well-prepared, high-quality systems with corresponding experimental measurements, designed to ensure the resulting calculations provide a realistic assessment of expected performance when these methods are deployed within their domains of applicability. To date, the community has not yet adopted a common standardized benchmark, and existing benchmark reports suffer from a myriad of issues, including poor data quality, limited statistical power, and statistically deficient analyses, all of which can conspire to produce benchmarks that are poorly predictive of real-world performance. Here, we address these issues by presenting guidelines for (1) curating experimental data to develop meaningful benchmark sets, (2) preparing benchmark inputs according to best practices to facilitate widespread adoption, and (3) analyzing the resulting predictions to enable statistically meaningful comparisons among methods and force fields. We highlight challenges and open questions that remain to be solved in these areas, as well as recommendations for the collection of new datasets that might optimally serve to measure progress as methods become systematically more reliable. Finally, we provide a curated, versioned, open, standardized benchmark set adherent to these standards (PLBenchmarks) and an open-source toolkit for implementing standardized best-practices assessments (arsenic) for the community to use as a standardized assessment tool. While our main focus is free energy methods based on molecular simulations, these guidelines should prove useful for assessment of the rapidly growing field of machine learning methods for affinity prediction as well.
