Machine learning proteochemometric models for Cereblon glue activity predictions

Francis J. Prael III, Jiayi Cox, Noé Sturm, Peter Kutchukian, William C. Forrester, Gregory Michaud, Jutta Blank, Lingling Shen, Raquel Rodríguez-Pérez
Journal: Artificial intelligence in the life sciences
Publication type: Journal Article
Publication date: 2024-06-11
DOI: 10.1016/j.ailsci.2024.100100
Full text: https://www.sciencedirect.com/science/article/pii/S2667318524000072
Citation count: 0

Abstract

Targeted protein degradation (TPD) is a rapidly developing drug discovery technique with unique efficacy and target scope stemming from its degradation-based activity. Molecular glue degraders are a promising arm of TPD, as evidenced by the FDA-approved therapeutics within this class, the increasing number of degraders in clinical development, and their predisposition to drug-likeness. Cereblon (CRBN) glue degraders mediate target degradation by generating a neomorphic interface between CRBN and a protein of interest. Despite this promise, the complexity of the CRBN-glue-target ternary complex makes the rational design of molecular glue degraders challenging. For other drug modalities, predictive modeling has been established to leverage existing activity data and generate quantitative structure-activity relationships (QSAR). However, the applicability of QSAR strategies to glues remains under-investigated. Herein, machine learning methodologies were developed to predict glue-mediated recruitment of target proteins to CRBN, and they achieved promising performance. The models leveraged data from more than a hundred internal screening campaigns across thousands of CRBN glues. Our results show that the recruitment activity of CRBN glue degraders can be modeled by machine learning, with 89% of models producing an area under the receiver operating characteristic curve (ROC AUC) > 0.8 and 70% of models producing a Matthews correlation coefficient (MCC) > 0.2 for these primary screening data. Importantly, our findings also indicate that combining compound and protein descriptors in so-called proteochemometric models improves performance, with >80% of the models exhibiting higher ROC AUC and MCC values than per-target models based only on compound information. Hence, our investigations suggest that proteochemometric modeling is a successful approach for molecular glue degraders. The proposed machine learning strategies can aid compound prioritization based on recruitment efficacy and target selectivity, and thus have the potential to facilitate the design and discovery of therapeutic CRBN molecular glues.
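
As a rough illustration of the quantities named in the abstract, the following minimal Python sketch shows how a proteochemometric input might be assembled by concatenating compound and protein descriptors, and how ROC AUC and MCC are computed for a binary active/inactive readout. Every function name, descriptor and data value here is a hypothetical placeholder chosen for illustration. The abstract does not specify the descriptors, learning algorithm or classification threshold the authors used, so this should not be read as their implementation.

```python
from math import sqrt


def pcm_features(compound_desc, protein_desc):
    """Build one proteochemometric input by concatenating both descriptor vectors."""
    return list(compound_desc) + list(protein_desc)


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random active outscores a random inactive.

    Ties between an active and an inactive count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one active and one inactive")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when a row or column of the confusion matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def fraction_above(values, threshold):
    """Share of models whose metric exceeds a threshold (e.g. ROC AUC > 0.8)."""
    return sum(v > threshold for v in values) / len(values)


# Toy data, invented for illustration only (not taken from the paper).
compound_bits = [1, 0, 1, 1]   # hypothetical compound fingerprint
protein_desc = [0.2, 0.7]      # hypothetical target protein descriptor
x = pcm_features(compound_bits, protein_desc)

y_true = [1, 1, 0, 0, 0, 1]                       # 1 = recruits target, 0 = inactive
scores = [0.9, 0.6, 0.7, 0.2, 0.1, 0.8]           # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # assumed 0.5 decision threshold

auc = roc_auc(y_true, scores)
mcc_value = mcc(y_true, y_pred)
share = fraction_above([0.85, 0.90, 0.70, 0.95], 0.8)  # hypothetical per-model AUCs

print(f"PCM input vector: {x}")
print(f"ROC AUC = {auc:.3f}, MCC = {mcc_value:.3f}, share of models above 0.8 = {share:.2f}")
```

In this framing, a per-target model would be trained on `compound_bits` alone using data from a single target, whereas a proteochemometric model is trained on the concatenated vector pooled across targets. ROC AUC is computed from the ranking of scores, while MCC depends on the chosen decision threshold, which is one reason the two metrics can differ markedly on imbalanced primary screening data.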

Subject areas of Artificial intelligence in the life sciences: Pharmacology, Biochemistry, Genetics and Molecular Biology (General), Computer Science Applications, Health Informatics, Drug Discovery, Veterinary Science and Veterinary Medicine (General)