moPPIt: De Novo Generation of Motif-Specific Binders with Protein Language Models

Tong Chen, Yinuo Zhang, Pranam Chatterjee
bioRxiv - Synthetic Biology, published 2024-08-01
DOI: 10.1101/2024.07.31.606098 (https://doi.org/10.1101/2024.07.31.606098)

Abstract

The ability to precisely target specific motifs on disease-related proteins, whether conserved epitopes on viral proteins, intrinsically disordered regions within transcription factors, or breakpoint junctions in fusion oncoproteins, is essential for modulating their function while minimizing off-target effects. Current methods struggle to achieve this specificity without reliable structural information. In this work, we introduce a motif-specific PPI targeting algorithm, moPPIt, for de novo generation of motif-specific peptide binders from the target protein sequence alone. At the core of moPPIt is BindEvaluator, a transformer-based model that interpolates protein language model embeddings of two proteins via a series of multi-headed self-attention blocks, with a key focus on local motif features. Trained on over 510,000 annotated PPIs, BindEvaluator accurately predicts target binding sites given protein-protein sequence pairs with a test AUC > 0.94, improving to AUC > 0.96 when fine-tuned on peptide-protein pairs. By combining BindEvaluator with our PepMLM peptide generator and genetic algorithm-based optimization, moPPIt generates peptides that bind specifically to user-defined residues on target proteins. We demonstrate moPPIt's efficacy in computationally designing binders to specific motifs, first on targets with known binding peptides and then extending to structured and disordered targets with no known binders. In total, moPPIt serves as a powerful tool for developing highly specific peptide therapeutics without relying on target structure or structure-dependent latent spaces.
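
The abstract describes a loop in which candidate peptides are scored for binding to user-defined residues and then refined by genetic algorithm-based optimization. The sketch below illustrates only that general pattern and is not the authors' implementation. Everything in it is an assumption made for illustration: `toy_motif_score` is a hypothetical stand-in for BindEvaluator with no biological meaning, the initial population is sampled at random rather than generated by PepMLM, and the function names, mutation rate, and selection scheme are invented here.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def toy_motif_score(peptide, target, motif):
    """Hypothetical placeholder for a learned binding-site predictor.

    Returns the fraction of peptide positions that equal the target residue
    at the corresponding motif index (cycling through the motif). This is a
    toy objective with no biological meaning.
    """
    hits = sum(
        1
        for j, aa in enumerate(peptide)
        if aa == target[motif[j % len(motif)]]
    )
    return hits / len(peptide)


def mutate(peptide, rng, rate=0.1):
    """Resample each residue independently with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa
        for aa in peptide
    )


def crossover(a, b, rng):
    """Single-point crossover of two equal-length peptides (length >= 2)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(target, motif, length=8, pop_size=30, generations=40, seed=0):
    """Evolve peptides toward a higher toy motif score.

    `motif` is a list of 0-based residue indices on `target`.
    Returns the best peptide and the best score seen in each generation.
    """
    rng = random.Random(seed)

    def score(p):
        return toy_motif_score(p, target, motif)

    # Assumption: random initial population (the paper uses a peptide generator).
    population = [
        "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        history.append(score(population[0]))
        # Elitist selection: the top half survives unchanged.
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=score), history


if __name__ == "__main__":
    target_seq = "MKTAYIAKQRQISFVKSHFSRQ"  # made-up example sequence
    best, history = evolve(target_seq, motif=[3, 4, 5, 6])
    print(best, history[-1])
```

Because the top half of each generation is carried over unchanged, the best score never decreases between generations. In a real pipeline the scoring function would be replaced by a trained predictor and the random initialization by a generative model.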