从UK-2A到氟啶虫酰胺：主动学习识别大环天然产物的模拟物

IF 3 3区生物学 Q3 BIOCHEMISTRY & MOLECULAR BIOLOGY Journal of Computer-Aided Molecular Design Pub Date : 2024-04-17 DOI:10.1007/s10822-024-00555-3

Ann E. Cleves, Ajay N. Jain, David A. Demeter, Zachary A. Buchan, Jeremy Wilmot, Erin N. Hancock

{"title":"从UK-2A到氟啶虫酰胺：主动学习识别大环天然产物的模拟物","authors":"Ann E. Cleves, Ajay N. Jain, David A. Demeter, Zachary A. Buchan, Jeremy Wilmot, Erin N. Hancock","doi":"10.1007/s10822-024-00555-3","DOIUrl":null,"url":null,"abstract":"<div><p>Scaffold replacement as part of an optimization process that requires maintenance of potency, desirable biodistribution, metabolic stability, and considerations of synthesis at very large scale is a complex challenge. Here, we consider a set of over 1000 time-stamped compounds, beginning with a macrocyclic natural-product lead and ending with a broad-spectrum crop anti-fungal. We demonstrate the application of the QuanSA 3D-QSAR method employing an active learning procedure that combines two types of molecular selection. The first identifies compounds predicted to be most active of those most likely to be well-covered by the model. The second identifies compounds predicted to be most <i>informative</i> based on exhibiting low predicted activity but showing high 3D similarity to a highly active nearest-neighbor training molecule. Beginning with just 100 compounds, using a deterministic and automatic procedure, five rounds of 20-compound selection and model refinement identifies the binding metabolic form of florylpicoxamid. We show how iterative refinement broadens the domain of applicability of the successive models while also enhancing predictive accuracy. We also demonstrate how a simple method requiring very sparse data can be used to generate relevant ideas for synthetic candidates.</p></div>","PeriodicalId":621,"journal":{"name":"Journal of Computer-Aided Molecular Design","volume":"38 1","pages":""},"PeriodicalIF":3.0000,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10822-024-00555-3.pdf","citationCount":"0","resultStr":"{\"title\":\"From UK-2A to florylpicoxamid: Active learning to identify a mimic of a macrocyclic natural product\",\"authors\":\"Ann E. Cleves, Ajay N. Jain, David A. Demeter, Zachary A. Buchan, Jeremy Wilmot, Erin N. Hancock\",\"doi\":\"10.1007/s10822-024-00555-3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Scaffold replacement as part of an optimization process that requires maintenance of potency, desirable biodistribution, metabolic stability, and considerations of synthesis at very large scale is a complex challenge. Here, we consider a set of over 1000 time-stamped compounds, beginning with a macrocyclic natural-product lead and ending with a broad-spectrum crop anti-fungal. We demonstrate the application of the QuanSA 3D-QSAR method employing an active learning procedure that combines two types of molecular selection. The first identifies compounds predicted to be most active of those most likely to be well-covered by the model. The second identifies compounds predicted to be most <i>informative</i> based on exhibiting low predicted activity but showing high 3D similarity to a highly active nearest-neighbor training molecule. Beginning with just 100 compounds, using a deterministic and automatic procedure, five rounds of 20-compound selection and model refinement identifies the binding metabolic form of florylpicoxamid. We show how iterative refinement broadens the domain of applicability of the successive models while also enhancing predictive accuracy. We also demonstrate how a simple method requiring very sparse data can be used to generate relevant ideas for synthetic candidates.</p></div>\",\"PeriodicalId\":621,\"journal\":{\"name\":\"Journal of Computer-Aided Molecular Design\",\"volume\":\"38 1\",\"pages\":\"\"},\"PeriodicalIF\":3.0000,\"publicationDate\":\"2024-04-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://link.springer.com/content/pdf/10.1007/s10822-024-00555-3.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Computer-Aided Molecular Design\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10822-024-00555-3\",\"RegionNum\":3,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"BIOCHEMISTRY & MOLECULAR BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Computer-Aided Molecular Design","FirstCategoryId":"99","ListUrlMain":"https://link.springer.com/article/10.1007/s10822-024-00555-3","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}

引用次数: 0

摘要

作为优化过程的一部分，支架置换要求维持药效、理想的生物分布、代谢稳定性，并考虑大规模合成，这是一项复杂的挑战。在这里，我们考虑了一组超过 1000 个有时间戳的化合物，从一个大环天然产物先导化合物开始，到一个广谱作物抗真菌药物。我们展示了 QuanSA 3D-QSAR 方法的应用，该方法采用了一种结合两种分子选择类型的主动学习程序。第一种是在最有可能被模型很好覆盖的化合物中识别出最有活性的化合物。第二种方法是根据预测活性较低，但与高活性近邻训练分子的三维相似性较高的情况，确定预测信息量最大的化合物。从仅有的 100 个化合物开始，使用确定性的自动程序，经过五轮 20 个化合物的筛选和模型完善，确定了氟啶虫酰胺的结合代谢形式。我们展示了迭代改进如何拓宽连续模型的适用范围，同时提高预测准确性。我们还展示了如何利用一种需要非常稀少数据的简单方法来产生合成候选化合物的相关想法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

摘要图片

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

From UK-2A to florylpicoxamid: Active learning to identify a mimic of a macrocyclic natural product

Scaffold replacement as part of an optimization process that requires maintenance of potency, desirable biodistribution, metabolic stability, and considerations of synthesis at very large scale is a complex challenge. Here, we consider a set of over 1000 time-stamped compounds, beginning with a macrocyclic natural-product lead and ending with a broad-spectrum crop anti-fungal. We demonstrate the application of the QuanSA 3D-QSAR method employing an active learning procedure that combines two types of molecular selection. The first identifies compounds predicted to be most active of those most likely to be well-covered by the model. The second identifies compounds predicted to be most informative based on exhibiting low predicted activity but showing high 3D similarity to a highly active nearest-neighbor training molecule. Beginning with just 100 compounds, using a deterministic and automatic procedure, five rounds of 20-compound selection and model refinement identifies the binding metabolic form of florylpicoxamid. We show how iterative refinement broadens the domain of applicability of the successive models while also enhancing predictive accuracy. We also demonstrate how a simple method requiring very sparse data can be used to generate relevant ideas for synthetic candidates.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of Computer-Aided Molecular Design 生物-计算机：跨学科应用

CiteScore

8.00

自引率

8.60%

发文量

审稿时长

3 months

期刊介绍： The Journal of Computer-Aided Molecular Design provides a form for disseminating information on both the theory and the application of computer-based methods in the analysis and design of molecules. The scope of the journal encompasses papers which report new and original research and applications in the following areas: - theoretical chemistry; - computational chemistry; - computer and molecular graphics; - molecular modeling; - protein engineering; - drug design; - expert systems; - general structure-property relationships; - molecular dynamics; - chemical database development and usage.