metaExpertPro: A Computational Workflow for Metaproteomics Spectral Library Construction and Data-Independent Acquisition Mass Spectrometry Data Analysis
Authors: Yingying Sun, Ziyuan Xing, Shuang Liang, Zelei Miao, Lai-Bao Zhuo, Wenhao Jiang, Hui Zhao, Huanhuan Gao, Yuting Xie, Yan Zhou, Liang Yue, Xue Cai, Yu-Ming Chen, Ju-Sheng Zheng, Tiannan Guo
Journal: Molecular & Cellular Proteomics, article 100840 (impact factor 6.1; JCR Q1, Biochemical Research Methods)
DOI: 10.1016/j.mcpro.2024.100840 (https://doi.org/10.1016/j.mcpro.2024.100840)
Published: 2024-10-01 (Epub 2024-09-13)
Analysis of large-scale data-independent acquisition mass spectrometry metaproteomics data remains a computational challenge. Here, we present a computational pipeline called metaExpertPro for metaproteomics data analysis. This pipeline encompasses spectral library generation using data-dependent acquisition MS, protein identification and quantification using data-independent acquisition mass spectrometry, functional and taxonomic annotation, as well as quantitative matrix generation for both microbiota and hosts. By integrating FragPipe and DIA-NN, metaExpertPro offers compatibility with both Orbitrap and timsTOF MS instruments. To evaluate the depth and accuracy of identification and quantification, we conducted extensive assessments using human fecal samples and benchmark tests. Performance tests conducted on human fecal samples indicated that metaExpertPro quantified an average of 45,000 peptides in a 60-min diaPASEF injection. Notably, metaExpertPro outperformed three existing software tools by characterizing a higher number of peptides and proteins. Importantly, metaExpertPro maintained a low factual false discovery rate of approximately 5% for protein groups across four benchmark tests. Applying a filter of five peptides per genus, metaExpertPro achieved relatively high accuracy (F-score = 0.67-0.90) in genus diversity and showed a high correlation (Spearman's r = 0.73-0.82) between the measured and true genus relative abundance in benchmark tests. Additionally, the quantitative results at the protein, taxonomy, and function levels exhibited high reproducibility and consistency across the commonly adopted public human gut microbial protein databases IGC and UHGP. In a metaproteomic analysis of dyslipidemia patients, metaExpertPro revealed characteristic alterations in microbial functions and potential interactions between the microbiota and the host.
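The genus-level benchmark metrics mentioned in the abstract (a minimum-peptide filter per genus, an F-score for genus identification, and a Spearman correlation between measured and true relative abundance) can be illustrated with a minimal sketch. This is not metaExpertPro code: the function names, the peptide-to-genus mapping, and the input layout are all hypothetical, and the abstract does not state exactly how the authors computed these values. The sketch assumes the F-score is the usual harmonic mean of precision and recall over the sets of identified and true genera.

```python
from collections import Counter


def genera_passing_filter(peptide_to_genus, min_peptides=5):
    """Return genera supported by at least `min_peptides` distinct peptides.

    `peptide_to_genus` is a hypothetical dict mapping peptide sequence -> genus.
    """
    counts = Counter(peptide_to_genus.values())
    return {genus for genus, n in counts.items() if n >= min_peptides}


def f_score(identified, truth):
    """F1 score (harmonic mean of precision and recall) between two genus sets."""
    true_positives = len(identified & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(identified)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


def _average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    peptides = {}
    for genus, n in [("GenusA", 6), ("GenusB", 5), ("GenusC", 2), ("GenusD", 5)]:
        for i in range(n):
            peptides[f"{genus}_pep{i}"] = genus
    identified = genera_passing_filter(peptides, min_peptides=5)
    truth = {"GenusA", "GenusB", "GenusE"}
    print(sorted(identified), f_score(identified, truth))
    print(spearman([0.5, 0.3, 0.2], [0.6, 0.1, 0.3]))
```

In this toy case the filter discards the genus supported by only two peptides. Raising `min_peptides` generally trades recall for precision, which is presumably the motivation for filtering at the genus level.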
About the journal:
The mission of MCP is to foster the development and applications of proteomics in both basic and translational research. MCP will publish manuscripts that report significant new biological or clinical discoveries underpinned by proteomic observations across all kingdoms of life. Manuscripts must define the biological roles played by the proteins investigated or their mechanisms of action.
The journal also emphasizes articles that describe innovative new computational methods and technological advancements that will enable future discoveries. Manuscripts describing such approaches do not have to include a solution to a biological problem, but must demonstrate that the technology works as described, is reproducible and is appropriate to uncover yet unknown protein/proteome function or properties using relevant model systems or publicly available data.
Scope:
-Fundamental studies in biology, including integrative "omics" studies, that provide mechanistic insights
-Novel experimental and computational technologies
-Proteogenomic data integration and analysis that enable greater understanding of physiology and disease processes
-Pathway and network analyses of signaling that focus on the roles of post-translational modifications
-Studies of proteome dynamics and quality controls, and their roles in disease
-Studies of evolutionary processes affecting proteome dynamics, quality and regulation
-Chemical proteomics, including mechanisms of drug action
-Proteomics of the immune system and antigen presentation/recognition
-Microbiome proteomics, host-microbe and host-pathogen interactions, and their roles in health and disease
-Clinical and translational studies of human diseases
-Metabolomics to understand functional connections between genes, proteins and phenotypes