A comprehensive benchmarking for evaluating TCR embeddings in modeling TCR-epitope interactions.

IF 7.7 | Zone 2 (Biology) | Q1 BIOCHEMICAL RESEARCH METHODS | Briefings in Bioinformatics | Pub Date: 2024-11-22 | DOI: 10.1093/bib/bbaf030
Xikang Feng, Miaozhe Huo, He Li, Yongze Yang, Yuepeng Jiang, Liang He, Shuai Cheng Li
Briefings in Bioinformatics, volume 26, issue 1. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11781202/pdf/

Abstract

The complexity of T cell receptor (TCR) sequences, particularly within the complementarity-determining region 3 (CDR3), requires efficient embedding methods for applying machine learning to immunology. While various TCR CDR3 embedding strategies have been proposed, the lack of a systematic evaluation has caused confusion in the community. Here, we extracted CDR3 embedding models from 19 existing methods and benchmarked them on four curated datasets by assessing their impact on the performance of downstream TCR tasks, including TCR-epitope binding affinity prediction, epitope-specific TCR identification, TCR clustering, and visualization analysis. We assessed these models using eight downstream classifiers and five downstream clustering methods, measuring performance with a diverse range of metrics for precision, robustness, and usability. Overall, handcrafted embeddings outperformed data-driven ones in modeling TCR-epitope interactions. To build on these comparative findings, we developed an all-in-one TCR CDR3 embedding package comprising all evaluated embedding models. This package will help users select suitable embedding models for their data.
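
As an illustration of the workflow the abstract describes (embed CDR3 sequences, then pass the vectors to a downstream classifier), the following is a minimal sketch only. It is not one of the 19 benchmarked models and not the authors' package: the amino-acid composition embedding, the nearest-centroid classifier, all function names, and the toy CDR3 strings and epitope labels are assumptions made up for this example.

```python
import math
from collections import Counter
from typing import Dict, List

# The 20 standard amino acids, in a fixed order that defines the vector layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_embedding(cdr3: str) -> List[float]:
    """Hypothetical handcrafted embedding: 20-dim amino-acid frequency vector."""
    if not cdr3:
        raise ValueError("CDR3 sequence must not be empty")
    counts = Counter(cdr3.upper())
    return [counts.get(aa, 0) / len(cdr3) for aa in AMINO_ACIDS]


def centroid(vectors: List[List[float]]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def fit_centroids(train: Dict[str, List[str]]) -> Dict[str, List[float]]:
    """Embed every training CDR3 and average the embeddings per label."""
    return {
        label: centroid([composition_embedding(seq) for seq in seqs])
        for label, seqs in train.items()
    }


def predict(centroids: Dict[str, List[float]], cdr3: str) -> str:
    """Assign the label whose centroid is closest in Euclidean distance."""
    query = composition_embedding(cdr3)
    return min(centroids, key=lambda label: math.dist(centroids[label], query))


# Toy data: invented CDR3-like strings and labels, for illustration only.
TOY_TRAIN = {
    "epitope_1": ["CASSLGGGGTEAFF", "CASSGGGGNTEAFF"],
    "epitope_2": ["CASSYSYSYEQYF", "CASSYYSSYEQYF"],
}

if __name__ == "__main__":
    model = fit_centroids(TOY_TRAIN)
    for seq in ["CASSLGGGGNEAFF", "CASSYSSYYEQYF"]:
        print(seq, "->", predict(model, seq))
```

A composition vector ignores residue order, so it is among the simplest possible handcrafted embeddings; a real comparison would swap in other embedding functions behind the same `composition_embedding`-style interface and keep the downstream classifier fixed.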
