Machine Learning Assisted Hit Prioritization for High Throughput Screening in Drug Discovery

ACS Central Science (IF 12.7; Zone 1, Chemistry; JCR Q1, Chemistry, Multidisciplinary) Pub Date: 2024-03-15 DOI: 10.1021/acscentsci.3c01517
Davide Boldini, Lukas Friedrich, Daniel Kuhn, and Stephan A. Sieber*
Citations: 0

Abstract

Efficient prioritization of bioactive compounds from high throughput screening campaigns is a fundamental challenge for accelerating drug development efforts. In this study, we present the first data-driven approach to simultaneously detect assay interferents and prioritize true bioactive compounds. By analyzing the learning dynamics during training of a gradient boosting model on noisy high throughput screening data using a novel formulation of sample influence, we are able to distinguish between compounds exhibiting the desired biological response and those producing assay artifacts. Therefore, our method enables false positive and true positive detection without relying on prior screens or assay interference mechanisms, making it applicable to any high throughput screening campaign. We demonstrate that our approach consistently excludes assay interferents with different mechanisms and prioritizes biologically relevant compounds more efficiently than all tested baselines, including a retrospective case study simulating its use in a real drug discovery campaign. Finally, our tool is extremely computationally efficient, requiring less than 30 s per assay on low-resource hardware. As such, our findings show that our method is an ideal addition to existing false positive detection tools and can be used to guide further pharmacological optimization after high throughput screening campaigns.

Minimum variance sampling analysis (MVS-A) is a fast machine-learning approach enabling the identification of both true bioactive compounds and false positives in high throughput screening data.
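
The abstract describes the core idea only at a high level: watch how a gradient boosting model learns from noisy screening labels, and use per-sample learning dynamics to separate true actives from assay artifacts. The sketch below illustrates that general idea on a toy data set. It is not the authors' MVS-A implementation, and the abstract does not spell out their sample-influence formulation; as a stand-in, the example uses an assumed proxy, the mean absolute log-loss gradient of each compound across boosting rounds. The one-descriptor data set, the `fit_stump` and `mean_abs_gradient` helpers, and the hyperparameters are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Return (feature, threshold, left_value, right_value) of the depth-1
    regression tree that best fits the residuals in the least-squares sense."""
    best = None
    for j in range(len(xs[0])):
        values = sorted(set(x[j] for x in xs))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for x, r in zip(xs, residuals) if x[j] <= thr]
            right = [r for x, r in zip(xs, residuals) if x[j] > thr]
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]


def mean_abs_gradient(xs, ys, n_rounds=30, learning_rate=0.3):
    """Boost decision stumps on the log-loss and return, for every sample,
    the mean absolute gradient observed over all boosting rounds.
    Assumed proxy for 'learning dynamics', not the paper's influence score."""
    logits = [0.0] * len(xs)
    totals = [0.0] * len(xs)
    for _ in range(n_rounds):
        # Negative gradient of the log-loss with respect to each logit.
        residuals = [y - sigmoid(s) for y, s in zip(ys, logits)]
        totals = [t + abs(r) for t, r in zip(totals, residuals)]
        j, thr, lval, rval = fit_stump(xs, residuals)
        logits = [s + learning_rate * (lval if x[j] <= thr else rval)
                  for s, x in zip(logits, xs)]
    return [t / n_rounds for t in totals]


# Hypothetical screen: one descriptor per compound, 1 = flagged active.
# Index 3 is labeled active but sits among the inactives, mimicking an
# assay interferent; indices 6-9 form a consistent cluster of actives.
xs = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [2.0], [2.1], [2.2], [2.3]]
ys = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

trace = mean_abs_gradient(xs, ys)
actives = [i for i, y in enumerate(ys) if y == 1]
# Labeled actives the model struggles to fit come first (candidate artifacts).
ranked = sorted(actives, key=lambda i: trace[i], reverse=True)
print("most suspicious labeled active:", ranked[0])
```

In this toy set the isolated active (index 3) ranks first, because the model keeps a large gradient on it while the clustered actives are fit quickly. Whether a proxy of this kind transfers to real screening data is exactly what the paper's influence formulation and benchmarks address, so the sketch should be read as intuition only.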

Source journal
ACS Central Science
Subject area: Chemical Engineering (General Chemical Engineering)
CiteScore: 25.50
Self-citation rate: 0.50%
Articles published: 194
Review time: 10 weeks
Journal description: ACS Central Science publishes significant primary reports on research in chemistry and allied fields where chemical approaches are pivotal. As the first fully open-access journal from the American Chemical Society, it covers compelling and important contributions to the broad chemistry and scientific community. "Central science," a term popularized nearly 40 years ago, emphasizes chemistry's central role in connecting the physical and life sciences, and the fundamental sciences with applied disciplines like medicine and engineering. The journal focuses on articles of exceptional quality that address advances in fundamental chemistry and interdisciplinary research.