Exploration on learning molecular docking with deep learning models.

Impact factor: 1.4 · Zone 4 (Biology) · JCR Q4 (Mathematical & Computational Biology) · Journal: Quantitative Biology · Publication date: 2023-10-17 · eCollection date: 2023-09-01 · DOI: 10.15302/J-QB-022-0321

Qin Xie, Wei Ma, Jianhang Zhang, Shiliang Li, Xiaobing Deng, Youjun Xu, Weilin Zhang

Quantitative Biology, pages 320-331. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12807227/pdf/


Abstract

A deep learning-powered virtual screening (VS) approach, combined with two free docking programs, is proposed and evaluated for screening an ultra-large compound library to obtain diverse, potentially active compounds rapidly and efficiently. We found it to be a practical and transferable strategy that significantly reduces computational cost.

Background: Molecular docking-based virtual screening (VS) aims to select ligands with potential pharmacological activity from millions or even billions of molecules. This process can significantly reduce the number of compounds that need to be tested experimentally. However, many molecules have low affinity for a particular protein target, so docking them wastes substantial computational resources.

Methods: We implemented a fast and practical molecular screening approach called DL-DockVS (deep learning dock virtual screening), which uses deep learning models (regression and classification models) to learn the outcomes of pipelined docking programs step by step.
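
The general idea of learning docking outcomes so that fewer molecules need to be docked can be illustrated with a minimal sketch. It is not the DL-DockVS implementation: a one-variable least-squares line stands in for the deep learning regression model, `toy_dock` is a hypothetical stand-in for a docking program, and each molecule is reduced to a single made-up descriptor. All names, the training size, and the keep fraction are assumptions chosen for illustration.

```python
import random


def toy_dock(x, rng):
    # Hypothetical stand-in for an expensive docking run.
    # More negative score = better predicted binding.
    return -2.0 * x + rng.gauss(0.0, 0.1)


def fit_line(xs, ys):
    # Ordinary least squares for one feature; a deliberately simple
    # substitute for the deep learning regressor described in the text.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def prefilter_screen(library, n_train, keep_fraction, seed=0):
    rng = random.Random(seed)
    # 1. Dock a random training subset to obtain labelled examples.
    train_idx = rng.sample(range(len(library)), n_train)
    docked = {i: toy_dock(library[i], rng) for i in train_idx}
    # 2. Fit the surrogate model on the docked subset.
    slope, intercept = fit_line(
        [library[i] for i in train_idx],
        [docked[i] for i in train_idx],
    )
    # 3. Rank the undocked molecules by predicted score (best first).
    rest = [i for i in range(len(library)) if i not in docked]
    rest.sort(key=lambda i: slope * library[i] + intercept)
    # 4. Dock only the top fraction; the rest are weeded out unseen.
    for i in rest[: int(len(rest) * keep_fraction)]:
        docked[i] = toy_dock(library[i], rng)
    return docked


lib_rng = random.Random(1)
library = [lib_rng.random() for _ in range(1000)]
docked = prefilter_screen(library, n_train=100, keep_fraction=0.1)
print(f"docked {len(docked)} of {len(library)} molecules")
```

In this toy setting only the training subset and the top-ranked fraction are ever passed to the docking function, which is where the computational saving would come from; how well the best molecules survive the filter depends entirely on the quality of the surrogate model.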

Results: We showed that this approach could successfully weed out compounds with poor docking scores while retaining compounds with potentially high docking scores against 10 DUD-E protein targets. A self-built dataset of about 1.9 million molecules was used to further validate DL-DockVS, yielding good results in terms of recall rate, enrichment factor of active compounds, and runtime.
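
The abstract does not give the exact formulas behind the recall rate and enrichment factor, so the following sketch uses their common definitions as an assumption: recall is the fraction of reference molecules (for example, the truly top-scoring ones) that survive the filter, and the enrichment factor is the hit rate among selected molecules divided by the hit rate in the whole library. The function names and the toy numbers are hypothetical.

```python
def recall(selected, positives):
    # Fraction of the reference set that is present in the selection.
    positives = set(positives)
    if not positives:
        raise ValueError("positives must not be empty")
    return len(positives & set(selected)) / len(positives)


def enrichment_factor(selected, actives, library_size):
    # (hits / n_selected) / (n_actives / library_size), rearranged so
    # that only one division is performed.
    selected = set(selected)
    actives = set(actives)
    if not selected or not actives:
        raise ValueError("selected and actives must not be empty")
    hits = len(selected & actives)
    return (hits * library_size) / (len(selected) * len(actives))


# Toy example: 1000 molecules, 20 actives, 100 selected, 10 of them active.
actives = range(20)
selected = list(range(10)) + list(range(100, 190))
print(recall(selected, actives))                   # 0.5
print(enrichment_factor(selected, actives, 1000))  # 5.0
```

An enrichment factor of 5 here means the selected subset contains actives at five times the rate expected from random picking.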

Conclusions: We comprehensively evaluated the practicality and effectiveness of DL-DockVS against 10 protein targets. Because it improves runtime while maintaining the success rate, it is a useful and promising approach for screening ultra-large compound libraries in the age of big data. A model trained well on one specific target can also conveniently be used to predict high docking-score molecules in other chemical libraries without repeating the docking computation.




Source journal

Quantitative Biology (Mathematical & Computational Biology)

CiteScore: 5.00

Self-citation rate: 3.20%

Articles published: 264

About the journal: Quantitative Biology is an interdisciplinary journal that focuses on original research using quantitative approaches and technologies to analyze and integrate biological systems, construct and model engineered life systems, and gain a deeper understanding of the life sciences. It aims to provide a platform not only for the analysis but also for the integration and construction of biological systems. It is a quarterly journal that seeks to provide an inter- and multi-disciplinary forum for a broad blend of peer-reviewed academic papers, in order to promote rapid communication and exchange between scientists in the East and the West. The content of Quantitative Biology mainly focuses on two broad and related areas:

· Bioinformatics and computational biology, which deals with information technologies and computational methodologies that can efficiently and accurately manipulate -omics data and transform molecular information into biological knowledge.

· Systems and synthetic biology, which focuses on complex interactions in biological systems and their emergent functional properties, and on the design and construction of new biological functions and systems.

Its goal is to reflect the significant advances made in quantitatively investigating and modeling both natural and engineered life systems at the molecular and higher levels. The journal particularly encourages original papers that link novel theory with cutting-edge experiments, especially in newly emerging and multi-disciplinary areas of research. The journal also welcomes high-quality reviews and perspective articles.