Authors: V. Tran-Nguyen, A. Camproux, Olivier Taboureau
Journal: Advanced Intelligent Systems
Published: 2024-05-12
DOI: 10.1002/aisy.202400238 (https://doi.org/10.1002/aisy.202400238)
ClassyPose: A Machine‐Learning Classification Model for Ligand Pose Selection Applied to Virtual Screening in Drug Discovery
Determining the target‐bound conformation of a drug‐like molecule is a crucial step in drug design, as it affects the outcome of virtual screening (VS), and paves the way for hit‐to‐lead and lead optimization. While most docking programs usually manage to produce at least a near‐native pose for a bioactive molecule inside its binding pocket, their integrated classical scoring functions (SFs) generally fail to prioritize this pose. Many studies have been carried out to tackle this SF problem, offering multiple pose refinement and/or classification methods, albeit with limitations. This study presents a new support vector machine model for pose classification, called “ClassyPose”, which predicts the probability that a receptor‐bound ligand conformation could be near‐native, without any additional pose optimization step. Trained on protein‐ligand extended connectivity features extracted from over 21 600 crystal and docking poses of diverse ligands, this model outperformed other machine‐learning algorithms and three existing SFs in terms of docking power, identifying the native ligand pose as top‐ranked solution for more than 90% of entries in two test sets. It also achieved high specificity (above 0.96), and improved VS performance when used for pose selection. This efficient, user‐friendly tool and all related data are available at https://github.com/vktrannguyen/Classy_Pose.
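The two evaluation quantities named in the abstract, the share of entries whose top-ranked pose is near-native (docking power) and specificity, can be illustrated with a minimal Python sketch. It is not the authors' code: the `Pose` record, the entry names, the probabilities, and the 0.5 decision threshold are all hypothetical values chosen for illustration, and the paper's exact definitions may differ.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    entry: str         # hypothetical identifier of a protein-ligand complex
    prob: float        # predicted probability that the pose is near-native
    near_native: bool  # ground-truth label (labeling rule is assumed, not from the paper)


def top_pose_success_rate(poses):
    """Fraction of entries whose highest-probability pose is near-native."""
    best = {}
    for p in poses:
        # Keep only the highest-scoring pose seen so far for each entry.
        if p.entry not in best or p.prob > best[p.entry].prob:
            best[p.entry] = p
    return sum(p.near_native for p in best.values()) / len(best)


def specificity(poses, threshold=0.5):
    """TN / (TN + FP): share of non-near-native poses predicted as negative."""
    negatives = [p for p in poses if not p.near_native]
    true_neg = sum(p.prob < threshold for p in negatives)
    return true_neg / len(negatives)


# Made-up example data: two complexes with three poses each.
poses = [
    Pose("complex_A", 0.91, True),
    Pose("complex_A", 0.12, False),
    Pose("complex_A", 0.30, False),
    Pose("complex_B", 0.40, True),
    Pose("complex_B", 0.65, False),
    Pose("complex_B", 0.05, False),
]

print(top_pose_success_rate(poses))  # 0.5
print(specificity(poses))            # 0.75
```

In this toy data the top-ranked pose is near-native for one of the two complexes, and three of the four incorrect poses fall below the threshold. Ranking by predicted probability within each entry is what lets a pose classifier replace a docking program's built-in scoring function for pose selection.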