EpicPred: Predicting phenotypes driven by epitope binding TCRs using attention-based multiple instance learning
Jaemin Jeon, Suwan Yu, Sangam Lee, Sang Cheol Kim, Hye-Yeong Jo, Inuk Jung, Kwangsoo Kim
Bioinformatics (Oxford, England), published 2025-02-21. DOI: 10.1093/bioinformatics/btaf080 (https://doi.org/10.1093/bioinformatics/btaf080)
Abstract
Motivation: Correctly identifying epitope-binding TCRs is important both for understanding their underlying biological mechanisms in association with a phenotype and for developing T-cell-mediated immunotherapies accordingly. Although the importance of the CDR3 region of TCRs for epitope recognition is well established, methods for profiling TCR-epitope interactions in association with a particular disease or phenotype remain less studied. We developed EpicPred to identify phenotype-specific TCR-epitope interactions. EpicPred first predicts and removes unlikely TCR-epitope interactions using open-set recognition, which reduces false positives. Subsequently, multiple instance learning is used to identify TCR-epitope interactions specific to a cancer type or to COVID-19 severity levels.
Results: From six public TCR databases, 244,552 TCR sequences and 105 unique epitopes were used to predict epitope-binding TCRs and to filter out non-epitope-binding TCRs with the open-set recognition method. The predicted interactions were then used to predict phenotype groups in two cancer and four COVID-19 TCR-seq datasets at both bulk and single-cell resolution. EpicPred outperformed the competing methods in predicting the phenotypes, achieving an average AUROC of 0.80 ± 0.07.
Availability and implementation: The EpicPred software is available at https://github.com/jaeminjj/EpicPred.
Supplementary information: Supplementary data are available at Bioinformatics online.
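Illustrative sketch: the two-stage idea described in the abstract (reject unlikely TCR-epitope pairs, then pool the remaining TCRs of one sample with attention) can be outlined in a few lines of plain Python. This is not the EpicPred implementation. It assumes a simple maximum-softmax threshold as the open-set rejection rule and the widely used attention pooling form a_k = softmax_k(w · tanh(V h_k)). The function names, the threshold of 0.5, the toy repertoire, and the weights `V` and `w` are all hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def open_set_filter(epitope_logits, threshold=0.5):
    """Assumed rejection rule: keep a TCR only if its best known-epitope
    probability reaches the threshold; otherwise treat it as 'unknown'."""
    probs = softmax(epitope_logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return (best, probs[best]) if probs[best] >= threshold else None

def attention_pool(instances, V, w):
    """Attention-based MIL pooling over one bag (one sample's TCRs).
    Returns the attention-weighted bag embedding and the weights."""
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))
    weights = softmax(scores)
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights

# Toy repertoire: (logits over 3 known epitopes, 2-d TCR embedding).
repertoire = [
    ([4.0, 0.0, 0.0], [1.0, 0.0]),
    ([0.1, 0.0, 0.2], [0.5, 0.5]),  # near-flat scores, rejected by the filter
    ([0.0, 3.0, 0.0], [0.0, 1.0]),
]
kept = [emb for logits, emb in repertoire if open_set_filter(logits) is not None]

V = [[1.0, -1.0], [0.5, 0.5]]  # hypothetical attention parameters
w = [1.0, 0.0]
bag, weights = attention_pool(kept, V, w)
print(len(kept), [round(a, 3) for a in weights])
```

In a trained model, `V`, `w`, and the embeddings would be learned, and the bag embedding would feed a classifier for the phenotype label. The attention weights give a per-TCR indication of which instances drove the sample-level prediction.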