EpicPred: predicting phenotypes driven by epitope-binding TCRs using attention-based multiple instance learning.

Bioinformatics (Oxford, England), IF 5.4. Pub Date: 2025-03-04. DOI: 10.1093/bioinformatics/btaf080

Jaemin Jeon, Suwan Yu, Sangam Lee, Sang Cheol Kim, Hye-Yeong Jo, Inuk Jung, Kwangsoo Kim



Abstract

Motivation: Correctly identifying epitope-binding T-cell receptors (TCRs) is important both for understanding the biological mechanisms that link them to a phenotype and for developing T-cell-mediated immunotherapies. Although the importance of the CDR3 region of TCRs for epitope recognition is well established, methods for profiling TCR-epitope interactions in association with a particular disease or phenotype remain less studied. We developed EpicPred to identify phenotype-specific TCR-epitope interactions. EpicPred first uses open-set recognition (OSR) to predict and remove unlikely TCR-epitope interactions, reducing false positives. It then uses multiple instance learning to identify TCR-epitope interactions specific to a cancer type or to the severity level of COVID-19 patients.
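The two-stage idea described in the Motivation (filter unlikely interactions, then pool the remaining TCRs of a sample with attention) can be illustrated with a minimal NumPy sketch. This is not the EpicPred implementation: the max-softmax threshold stands in for the paper's OSR step, the attention form (softmax over `w · tanh(V h)`) is one common attention-pooling choice, and all names (`open_set_filter`, `attention_mil_pool`, `predict_bag`, `threshold`, `V`, `w`, `clf_w`, `clf_b`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def open_set_filter(epitope_logits, threshold=0.5):
    """Assumed stand-in for OSR: keep a TCR only if its most likely
    known epitope has probability >= threshold; otherwise treat it as
    binding none of the known epitopes and drop it.
    epitope_logits: (n_tcrs, n_epitopes). Returns a boolean mask."""
    probs = softmax(epitope_logits, axis=1)
    return probs.max(axis=1) >= threshold

def attention_mil_pool(instances, V, w):
    """Attention pooling over the instances (TCRs) of one bag (sample).
    instances: (n, d), V: (d, h), w: (h,).
    Returns the (d,) bag embedding and the (n,) attention weights."""
    scores = np.tanh(instances @ V) @ w   # one scalar score per TCR
    attn = softmax(scores, axis=0)        # weights sum to 1 over the bag
    return attn @ instances, attn

def predict_bag(instances, epitope_logits, V, w, clf_w, clf_b, threshold=0.5):
    """Filter a bag, pool what remains, and score the phenotype with a
    logistic layer. Returns (probability, keep_mask, attention)."""
    keep = open_set_filter(epitope_logits, threshold)
    if not keep.any():
        return None, keep, None           # nothing survived the filter
    bag, attn = attention_mil_pool(instances[keep], V, w)
    prob = 1.0 / (1.0 + np.exp(-(bag @ clf_w + clf_b)))
    return float(prob), keep, attn
```

In a sketch like this, the attention weights returned for a sample indicate which retained TCRs contributed most to the phenotype score, which is the usual motivation for attention-based pooling in multiple instance learning.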

Results: From six public TCR databases, 244 552 TCR sequences and 105 unique epitopes were used to predict epitope-binding TCRs and to filter out non-epitope-binding TCRs using the OSR method. The predicted interactions were then used to predict phenotype groups in two cancer and four COVID-19 TCR-seq datasets at both bulk and single-cell resolution. EpicPred outperformed competing methods in predicting the phenotypes, achieving an average AUROC of 0.80 ± 0.07.
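For readers unfamiliar with the metric, the sketch below shows how an AUROC can be computed for one dataset and how a set of per-dataset values can be summarized as "mean ± spread". It assumes the ± in the abstract denotes a standard deviation across datasets, which the abstract does not state; the function names `auroc` and `summarize` are hypothetical and no values from the paper are reproduced.

```python
import numpy as np

def auroc(labels, scores):
    """AUROC as the probability that a random positive sample scores
    higher than a random negative one (ties count as one half)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative")
    diff = pos[:, None] - neg[None, :]    # all positive/negative pairs
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

def summarize(aurocs):
    """Mean and sample standard deviation (ddof=1) of per-dataset AUROCs.
    Using the sample standard deviation is an assumption."""
    a = np.asarray(aurocs, dtype=float)
    return float(a.mean()), float(a.std(ddof=1))
```

Calling `auroc` once per dataset and passing the resulting list to `summarize` would yield a pair of numbers in the same form as the reported average.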

Availability and implementation: The EpicPred software is available at https://github.com/jaeminjj/EpicPred.


Supplementary information: Supplementary data are available at Bioinformatics online.
