NovoBoard: A Comprehensive Framework for Evaluating the False Discovery Rate and Accuracy of De Novo Peptide Sequencing.

Molecular & Cellular Proteomics (IF 6.1; Zone 2, Biology; JCR Q1, Biochemical Research Methods) Pub Date: 2024-09-24 DOI: 10.1016/j.mcpro.2024.100849
Ngoc Hieu Tran, Rui Qiao, Zeping Mao, Shengying Pan, Qing Zhang, Wenting Li, Lei Xin, Ming Li, Baozhen Shan
Citations: 0

Abstract

De novo peptide sequencing is one of the most fundamental research areas in mass spectrometry-based proteomics. Existing methods have often been evaluated using only a couple of simple metrics that do not fully reflect their overall performance. Moreover, there has been no established method to estimate the false discovery rate (FDR) of de novo peptide-spectrum matches (PSMs). Here we propose NovoBoard, a comprehensive framework to evaluate the performance of de novo peptide-sequencing methods. The framework consists of diverse benchmark datasets (covering tryptic, nontryptic, and immunopeptidomics data from different species) and a standard set of accuracy metrics that evaluate the de novo results at the level of fragment ions, amino acids, and peptides. More importantly, a new approach is designed to evaluate de novo peptide-sequencing methods on target-decoy spectra and to estimate and validate their FDRs. Our FDR estimation provides valuable information for assessing the reliability of new peptides identified by de novo sequencing tools, especially when no ground-truth information is available to evaluate their accuracy. The FDR estimation can also be used to evaluate how well de novo peptide-sequencing tools distinguish de novo PSMs from random matches. Our results reveal the strengths and weaknesses of different de novo peptide-sequencing methods and how their performance depends on the specific application and the type of data.
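
To make the target-decoy idea concrete, the following is a minimal sketch of the generic target-decoy FDR estimate, in which the number of decoy matches scoring at or above a threshold is divided by the number of target matches at or above it. The abstract does not describe how NovoBoard constructs its decoy spectra or validates its estimates, so this is not the authors' procedure. The function names `estimate_fdr` and `threshold_at_fdr` are hypothetical, and the sketch assumes that one confidence score per match to a target spectrum and per match to a decoy spectrum is already available.

```python
from typing import List, Optional


def estimate_fdr(target_scores: List[float],
                 decoy_scores: List[float],
                 threshold: float) -> float:
    """Generic target-decoy FDR estimate at a score threshold.

    Assumption (not taken from the paper): decoy matches that pass the
    threshold approximate the number of false target matches that pass it.
    """
    n_target = sum(s >= threshold for s in target_scores)
    n_decoy = sum(s >= threshold for s in decoy_scores)
    if n_target == 0:
        # No accepted target matches, so there is nothing to be wrong about.
        return 0.0
    # Cap at 1.0 because a rate above 100% is not meaningful.
    return min(1.0, n_decoy / n_target)


def threshold_at_fdr(target_scores: List[float],
                     decoy_scores: List[float],
                     max_fdr: float) -> Optional[float]:
    """Lowest target score whose estimated FDR is within max_fdr.

    Returns None if no threshold satisfies the requested FDR.
    """
    best = None
    # Only target scores are candidates, so every candidate threshold
    # accepts at least one target match.
    for t in sorted(set(target_scores), reverse=True):
        if estimate_fdr(target_scores, decoy_scores, t) <= max_fdr:
            best = t
    return best


if __name__ == "__main__":
    # Made-up scores, purely for illustration.
    targets = [0.95, 0.90, 0.80, 0.60, 0.40]
    decoys = [0.70, 0.50, 0.30, 0.20, 0.10]
    print(estimate_fdr(targets, decoys, 0.60))
    print(threshold_at_fdr(targets, decoys, 0.10))
```

In this sketch a tool that separates de novo PSMs from random matches well would show few decoy scores above the chosen threshold, which is one way to read the abstract's point about distinguishing real matches from random ones.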

Source journal
Molecular & Cellular Proteomics (Biology: Biochemical Research Methods)
CiteScore: 11.50
Self-citation rate: 4.30%
Articles published: 131
Review time: 84 days
Journal description: The mission of MCP is to foster the development and applications of proteomics in both basic and translational research. MCP will publish manuscripts that report significant new biological or clinical discoveries underpinned by proteomic observations across all kingdoms of life. Manuscripts must define the biological roles played by the proteins investigated or their mechanisms of action. The journal also emphasizes articles that describe innovative new computational methods and technological advancements that will enable future discoveries. Manuscripts describing such approaches do not have to include a solution to a biological problem, but must demonstrate that the technology works as described, is reproducible, and is appropriate to uncover yet unknown protein/proteome function or properties using relevant model systems or publicly available data.
Scope:
- Fundamental studies in biology, including integrative "omics" studies, that provide mechanistic insights
- Novel experimental and computational technologies
- Proteogenomic data integration and analysis that enable greater understanding of physiology and disease processes
- Pathway and network analyses of signaling that focus on the roles of post-translational modifications
- Studies of proteome dynamics and quality controls, and their roles in disease
- Studies of evolutionary processes affecting proteome dynamics, quality and regulation
- Chemical proteomics, including mechanisms of drug action
- Proteomics of the immune system and antigen presentation/recognition
- Microbiome proteomics, host-microbe and host-pathogen interactions, and their roles in health and disease
- Clinical and translational studies of human diseases
- Metabolomics to understand functional connections between genes, proteins and phenotypes