Guiding a language-model based protein design method towards MHC Class-I immune-visibility targets in vaccines and therapeutics
Hans-Christof Gasser, Diego A. Oyarzún, Ajitha Rajan, Javier Antonio Alfaro
Immunoinformatics (Amsterdam, Netherlands), volume 14, Article 100035. Published 2024-05-07.
DOI: 10.1016/j.immuno.2024.100035
URL: https://www.sciencedirect.com/science/article/pii/S2667119024000053
Abstract
Proteins have an arsenal of medical applications that include disrupting protein interactions, acting as potent vaccines, and replacing genetically deficient proteins. While therapeutics must avoid triggering unwanted immune responses, vaccines should support a robust immune reaction targeting a broad range of pathogen variants. Therefore, computational methods that modify proteins’ immunogenicity without disrupting function are needed. While many components of the immune system can be involved in a reaction, we focus on cytotoxic T-lymphocytes (CTLs). These target short peptides presented via the MHC Class I (MHC-I) pathway. To explore the limits of modifying the visibility of those peptides to CTLs within the distribution of naturally occurring sequences, we developed a novel machine learning technique, CAPE-XVAE. It combines a language model with reinforcement learning to modify a protein’s immune-visibility. Our results show that CAPE-XVAE effectively modifies the visibility of the HIV Nef protein to CTLs. We contrast CAPE-XVAE with CAPE-Packer, a physics-based method we also developed. Compared to CAPE-Packer, the machine learning approach suggests sequences that draw upon local sequence similarities in the training set. This is beneficial for vaccine development, where the sequence should be representative of the real viral population. Additionally, the language model approach holds promise for preserving both known and unknown functional constraints, which is essential for the immune modulation of therapeutic proteins. In contrast, CAPE-Packer emphasizes preserving the protein’s overall fold and can reach greater extremes of immune-visibility, but falls short of capturing the sequence diversity of viral variants available to learn from. Source code: https://github.com/hcgasser/CAPE (Tag: v1.1)