From UK-2A to florylpicoxamid: Active learning to identify a mimic of a macrocyclic natural product

IF: 3.0 | Zone 3, Biology | JCR Q3, Biochemistry & Molecular Biology | Journal of Computer-Aided Molecular Design | Pub Date: 2024-04-17 | DOI: 10.1007/s10822-024-00555-3
Ann E. Cleves, Ajay N. Jain, David A. Demeter, Zachary A. Buchan, Jeremy Wilmot, Erin N. Hancock
Citations: 0

Abstract

Scaffold replacement as part of an optimization process that requires maintenance of potency, desirable biodistribution, metabolic stability, and considerations of synthesis at very large scale is a complex challenge. Here, we consider a set of over 1000 time-stamped compounds, beginning with a macrocyclic natural-product lead and ending with a broad-spectrum crop antifungal. We demonstrate the application of the QuanSA 3D-QSAR method employing an active learning procedure that combines two types of molecular selection. The first identifies compounds predicted to be most active of those most likely to be well covered by the model. The second identifies compounds predicted to be most informative based on exhibiting low predicted activity but showing high 3D similarity to a highly active nearest-neighbor training molecule. Beginning with just 100 compounds, using a deterministic and automatic procedure, five rounds of 20-compound selection and model refinement identify the binding metabolic form of florylpicoxamid. We show how iterative refinement broadens the domain of applicability of the successive models while also enhancing predictive accuracy. We also demonstrate how a simple method requiring very sparse data can be used to generate relevant ideas for synthetic candidates.
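
The two selection types and the round structure described in the abstract can be illustrated with a small sketch. This is not the QuanSA implementation or its API: the `Candidate` fields, the `select_batch` and `run_rounds` functions, the even 10/10 split of each 20-compound round, and every cutoff value are assumptions made here for illustration. The model itself is left abstract as a caller-supplied `score` function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Model outputs for one untested compound (all fields are hypothetical)."""
    name: str
    pred_activity: float  # predicted potency, higher means more active
    coverage: float       # 0..1, how well the model is judged to cover it
    nn_similarity: float  # 0..1, 3D similarity to nearest training molecule
    nn_activity: float    # measured activity of that nearest training molecule


def select_batch(pool, n_active=10, n_informative=10,
                 min_coverage=0.5, active_cutoff=7.0, low_pred_cutoff=6.0):
    """Pick one round of compounds using the two selection types.

    The 10/10 split and all cutoffs are illustrative assumptions.
    """
    # Type 1: highest predicted activity among well-covered compounds.
    covered = [c for c in pool if c.coverage >= min_coverage]
    actives = sorted(covered, key=lambda c: c.pred_activity,
                     reverse=True)[:n_active]
    chosen = {c.name for c in actives}

    # Type 2: predicted weak, yet very similar in 3D to a highly active
    # training neighbor. That disagreement is what makes them informative.
    surprising = [c for c in pool
                  if c.name not in chosen
                  and c.pred_activity <= low_pred_cutoff
                  and c.nn_activity >= active_cutoff]
    informative = sorted(surprising, key=lambda c: c.nn_similarity,
                         reverse=True)[:n_informative]
    return actives, informative


def run_rounds(training, pool, score, measure, rounds=5):
    """Alternate selection and model refinement for a fixed number of rounds.

    training: dict mapping compound name to measured activity
    pool:     set of untested compound names
    score:    hypothetical callable (training, pool) -> list[Candidate],
              standing in for refitting the model and scoring the pool
    measure:  hypothetical callable name -> measured activity (the assay)
    """
    training = dict(training)  # copy so the caller's data is untouched
    pool = set(pool)
    for _ in range(rounds):
        candidates = score(training, pool)
        actives, informative = select_batch(candidates)
        for c in actives + informative:
            training[c.name] = measure(c.name)
            pool.discard(c.name)
    return training
```

Under these assumptions, starting from 100 training compounds and running five rounds that each fill both quotas grows the training set to 100 + 5 × 20 = 200 compounds. Because both selections are plain sorts over model outputs, the procedure is deterministic for a given model, in line with the "deterministic and automatic" description in the abstract.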

Source journal: Journal of Computer-Aided Molecular Design
Category: Biology / Computer Science, Interdisciplinary Applications
CiteScore: 8.00
Self-citation rate: 8.60%
Articles published: 56
Review time: 3 months
About the journal: The Journal of Computer-Aided Molecular Design provides a forum for disseminating information on both the theory and the application of computer-based methods in the analysis and design of molecules. The scope of the journal encompasses papers which report new and original research and applications in the following areas: theoretical chemistry; computational chemistry; computer and molecular graphics; molecular modeling; protein engineering; drug design; expert systems; general structure-property relationships; molecular dynamics; chemical database development and usage.
Latest articles from this journal:
Discovering promising drug candidates for Parkinson’s disease: integrating miRNA and DEG analysis with molecular dynamics and MMPBSA
In silico exploration of natural xanthone derivatives as potential inhibitors of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) replication and cellular entry
Elucidating allosteric signal disruption in PBP2a: impact of N146K/E150K mutations on ceftaroline resistance in methicillin-resistant Staphylococcus aureus
In silico design of dehydrophenylalanine containing peptide activators of glucokinase using pharmacophore modelling, molecular dynamics and machine learning: implications in type 2 diabetes
ConoDL: a deep learning framework for rapid generation and prediction of conotoxins