Guiding a language-model based protein design method towards MHC Class-I immune-visibility targets in vaccines and therapeutics

Hans-Christof Gasser, Diego A. Oyarzún, Ajitha Rajan, Javier Antonio Alfaro
Immunoinformatics (Amsterdam, Netherlands), volume 14, Article 100035. Published 2024-05-07. DOI: 10.1016/j.immuno.2024.100035. Article page: https://www.sciencedirect.com/science/article/pii/S2667119024000053

Abstract

Proteins have an arsenal of medical applications that include disrupting protein interactions, acting as potent vaccines, and replacing genetically deficient proteins. While therapeutics must avoid triggering unwanted immune responses, vaccines should support a robust immune reaction targeting a broad range of pathogen variants. Computational methods that modify a protein’s immunogenicity without disrupting its function are therefore needed. While many components of the immune system can be involved in a reaction, we focus on cytotoxic T-lymphocytes (CTLs). These target short peptides presented via the MHC Class I (MHC-I) pathway. To explore the limits of modifying the visibility of those peptides to CTLs within the distribution of naturally occurring sequences, we developed a novel machine learning technique, CAPE-XVAE. It combines a language model with reinforcement learning to modify a protein’s immune-visibility. Our results show that CAPE-XVAE effectively modifies the visibility of the HIV Nef protein to CTLs. We contrast CAPE-XVAE with CAPE-Packer, a physics-based method we also developed. Compared to CAPE-Packer, the machine learning approach suggests sequences that draw upon local sequence similarities in the training set. This is beneficial for vaccine development, where the sequence should be representative of the real viral population. Additionally, the language model approach holds promise for preserving both known and unknown functional constraints, which is essential for the immune-modulation of therapeutic proteins. In contrast, CAPE-Packer emphasizes preserving the protein’s overall fold and can reach greater extremes of immune-visibility, but falls short of capturing the sequence diversity of viral variants available to learn from. Source code: https://github.com/hcgasser/CAPE (Tag: v1.1)
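
The abstract does not define how immune-visibility is computed. As an illustration only, one plausible reading is to count the distinct short peptides in a sequence that a predictor flags as MHC-I presented. The sketch below makes that concrete under loud assumptions: the peptide lengths (8 to 10 residues), the function names, and the toy predicate `toy_is_presented` are all hypothetical and are not taken from the paper or from the CAPE repository, which should be consulted for the authors' actual method.

```python
from typing import Callable, Iterable, Set


def enumerate_peptides(sequence: str, lengths: Iterable[int] = (8, 9, 10)) -> Set[str]:
    """Return every distinct substring of the given lengths.

    The default lengths are an assumption chosen for illustration.
    """
    peptides: Set[str] = set()
    for k in lengths:
        # range() is empty when the sequence is shorter than k
        for start in range(len(sequence) - k + 1):
            peptides.add(sequence[start:start + k])
    return peptides


def visibility(
    sequence: str,
    is_presented: Callable[[str], bool],
    lengths: Iterable[int] = (8, 9, 10),
) -> int:
    """Count distinct peptides that the supplied predicate flags as presented."""
    return sum(1 for p in enumerate_peptides(sequence, lengths) if is_presented(p))


def toy_is_presented(peptide: str) -> bool:
    """Placeholder rule, NOT a real MHC-I predictor.

    It flags peptides ending in L or V purely so the example runs.
    """
    return peptide[-1] in "LV"


if __name__ == "__main__":
    # Hypothetical 10-residue sequence, not a real protein
    print(visibility("ACDEFGHIKL", toy_is_presented))
```

In practice the predicate would wrap a trained presentation predictor for a chosen set of MHC-I alleles. A design method of the kind described above would then try to raise this count (vaccines) or lower it (therapeutics) while keeping the sequence plausible.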
