Aurora Callahan, Xien Yu Chua, Alijah A Griffith, Tobias Hildebrandt, Guoping Fu, Mengzhou Hu, Renren Wen, Arthur R Salomon
Sequencing the tyrosine phosphoproteome using MS-based proteomics is challenging due to the low abundance of tyrosine phosphorylation in cells, a challenge compounded in scarce samples like primary cells or clinical samples. The broad-spectrum optimisation of selective triggering (BOOST) method was recently developed to increase phosphotyrosine sequencing in low protein input samples by leveraging tandem mass tags (TMT), phosphotyrosine enrichment, and a phosphotyrosine-loaded carrier channel. Here, we demonstrate the viability of BOOST in T cell receptor (TCR)-stimulated primary murine T cells by benchmarking the accuracy and precision of the BOOST method and discerning significant alterations in the phosphoproteome associated with receptor stimulation. Using 1 mg of protein input (about 20 million cells) and BOOST, we identify and precisely quantify more than 2000 unique pY sites, compared to about 300 unique pY sites in non-BOOST control samples. We show that although replicate variation increases when using the BOOST method, BOOST does not jeopardise quantitative precision or the ability to determine statistical significance for peptides measured in triplicate. Many previously uncharacterised pY sites on important T cell signalling proteins are quantified using BOOST, and we identify new TCR-responsive pY sites observable only with BOOST. Finally, we determine that the phase-spectrum deconvolution method on Orbitrap instruments can impair pY quantitation in BOOST experiments.
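Replicate variation of the kind discussed above is often summarised per peptide as a coefficient of variation (CV) across the replicate measurements. The sketch below is only an illustration of that calculation: the abstract does not state which precision metric the authors used, and the intensity values and the function name `coefficient_of_variation` are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(intensities):
    """Return the percent CV (sample SD / mean * 100) of replicate intensities."""
    m = mean(intensities)
    if m == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    return stdev(intensities) / m * 100.0


# Hypothetical reporter-ion intensities for one pY peptide, measured in triplicate.
# These numbers are made up for illustration and are not taken from the study.
boost_replicates = [1050.0, 980.0, 1120.0]
control_replicates = [310.0, 300.0, 305.0]

print(f"BOOST CV:   {coefficient_of_variation(boost_replicates):.1f}%")
print(f"control CV: {coefficient_of_variation(control_replicates):.1f}%")
```

A higher CV in one condition indicates larger replicate-to-replicate spread, which is a separate question from whether a stimulation effect remains statistically detectable in triplicate.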
{"title":"Deep phosphotyrosine characterisation of primary murine T cells using broad spectrum optimisation of selective triggering.","authors":"Aurora Callahan, Xien Yu Chua, Alijah A Griffith, Tobias Hildebrandt, Guoping Fu, Mengzhou Hu, Renren Wen, Arthur R Salomon","doi":"10.1002/pmic.202400106","DOIUrl":"https://doi.org/10.1002/pmic.202400106","url":null,"abstract":"<p><p>Sequencing the tyrosine phosphoproteome using MS-based proteomics is challenging due to the low abundance of tyrosine phosphorylation in cells, a challenge compounded in scarce samples like primary cells or clinical samples. The broad-spectrum optimisation of selective triggering (BOOST) method was recently developed to increase phosphotyrosine sequencing in low protein input samples by leveraging tandem mass tags (TMT), phosphotyrosine enrichment, and a phosphotyrosine-loaded carrier channel. Here, we demonstrate the viability of BOOST in T cell receptor (TCR)-stimulated primary murine T cells by benchmarking the accuracy and precision of the BOOST method and discerning significant alterations in the phosphoproteome associated with receptor stimulation. Using 1 mg of protein input (about 20 million cells) and BOOST, we identify and precisely quantify more than 2000 unique pY sites compared to about 300 unique pY sites in non-BOOST control samples. We show that although replicate variation increases when using the BOOST method, BOOST does not jeopardise quantitative precision or the ability to determine statistical significance for peptides measured in triplicate. Many pY previously uncharacterised sites on important T cell signalling proteins are quantified using BOOST, and we identify new TCR responsive pY sites observable only with BOOST. 
Finally, we determine that the phase-spectrum deconvolution method on Orbitrap instruments can impair pY quantitation in BOOST experiments.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141873751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Single-cell proteomics (SCP) aims to characterize the proteome of individual cells, providing insights into complex biological systems. It reveals subtle differences in distinct cellular populations that bulk proteome analysis may overlook, which is essential for understanding disease mechanisms and developing targeted therapies. Mass spectrometry (MS) methods in SCP allow the identification and quantification of thousands of proteins from individual cells. Two major challenges in SCP are the limited material in single-cell samples necessitating highly sensitive analytical techniques and the efficient processing of samples, as each biological sample requires thousands of single cell measurements. This review discusses MS advancements to mitigate these challenges using data-dependent acquisition (DDA) and data-independent acquisition (DIA). Additionally, we examine the use of short liquid chromatography gradients and sample multiplexing methods that increase the sample throughput and scalability of SCP experiments. We believe these methods will pave the way for improving our understanding of cellular heterogeneity and its implications for systems biology.
{"title":"Data acquisition approaches for single cell proteomics.","authors":"Gautam Ghosh, Ariana E Shannon, Brian C Searle","doi":"10.1002/pmic.202400022","DOIUrl":"https://doi.org/10.1002/pmic.202400022","url":null,"abstract":"<p><p>Single-cell proteomics (SCP) aims to characterize the proteome of individual cells, providing insights into complex biological systems. It reveals subtle differences in distinct cellular populations that bulk proteome analysis may overlook, which is essential for understanding disease mechanisms and developing targeted therapies. Mass spectrometry (MS) methods in SCP allow the identification and quantification of thousands of proteins from individual cells. Two major challenges in SCP are the limited material in single-cell samples necessitating highly sensitive analytical techniques and the efficient processing of samples, as each biological sample requires thousands of single cell measurements. This review discusses MS advancements to mitigate these challenges using data-dependent acquisition (DDA) and data-independent acquisition (DIA). Additionally, we examine the use of short liquid chromatography gradients and sample multiplexing methods that increase the sample throughput and scalability of SCP experiments. We believe these methods will pave the way for improving our understanding of cellular heterogeneity and its implications for systems biology.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141873750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hongkai Xu, Jiangguo Zhang, Fang Wang, Yiyang Chen, Hao Chen, Yang Feng, Guixue Hou, Jin Zi, Meiping Zhang, Jinfeng Zhou, Le Deng, Liang Lin, Xiaoyin Zhang, Siqi Liu
Intestinal lavage fluid (IVF), which contains the mucosa-associated microbiota, was used instead of fecal samples to study the gut microbiota with different omics approaches. Using 63 IVF samples collected from healthy individuals and patients with hepatitis B virus-liver disease (HBV-LD), we asked whether omics features could be extracted to distinguish these samples. The IVF-related microbiota derived from the omics data was classified into two enterotype sets, but the genomics-based enterotypes overlapped poorly with the proteomics-based ones in the distribution of either microbiota or IVFs. These enterotypes lacked molecular features that specifically recognize healthy or HBV-LD samples. Machine learning was applied to the omics data to find appropriate models for discriminating healthy and HBV-LD IVFs based on selected genes or proteins. Although a single omics dataset is workable for such discrimination, integration of the two datasets enhances discrimination efficiency. The protein features that appear most frequently in the models were further compared between healthy and HBV-LD samples based on their abundance, yielding three potential protein biomarkers. This study highlights that integrating metaomics data benefits the construction of a molecular discriminator of healthy and HBV-LD samples, and shows that IVF samples are valuable for microbiome studies in a small cohort.
{"title":"Integration of metagenomics and metaproteomics in the intestinal lavage fluids benefits construction of discriminative model and discovery of biomarkers for HBV liver diseases","authors":"Hongkai Xu, Jiangguo Zhang, Fang Wang, Yiyang Chen, Hao Chen, Yang Feng, Guixue Hou, Jin Zi, Meiping Zhang, Jinfeng Zhou, Le Deng, Liang Lin, Xiaoyin Zhang, Siqi Liu","doi":"10.1002/pmic.202400002","DOIUrl":"10.1002/pmic.202400002","url":null,"abstract":"<p>Intestinal lavage fluid (IVF) containing the mucosa-associated microbiota instead of fecal samples was used to study the gut microbiota using different omics approaches. Focusing on the 63 IVF samples collected from healthy and hepatitis B virus-liver disease (HBV-LD), a question is prompted whether omics features could be extracted to distinguish these samples. The IVF-related microbiota derived from the omics data was classified into two enterotype sets, whereas the genomics-based enterotypes were poorly overlapped with the proteomics-based one in either distribution of microbiota or of IVFs. There is lack of molecular features in these enterotypes to specifically recognize healthy or HBV-LD. Running machine learning against the omics data sought the appropriate models to discriminate the healthy and HBV-LD IVFs based on selected genes or proteins. Although a single omics dataset is basically workable in such discrimination, integration of the two datasets enhances discrimination efficiency. The protein features with higher frequencies in the models are further compared between healthy and HBV-LD based on their abundance, bringing about three potential protein biomarkers. 
This study highlights that integration of metaomics data is beneficial for a molecular discriminator of healthy and HBV-LD, and reveals the IVF samples are valuable for microbiome in a small cohort.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141750689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiří Pospíšil, Alice Sax, Martin Hubálek, Libor Krásný, Jiří Vohradský
In this study, we present a high-resolution dataset and bioinformatic analysis of the proteome of Bacillus subtilis 168 trp+ (BSB1) during germination and spore outgrowth. Samples were collected at 14 different time points (ranging from 0 to 130 min) in three biological replicates after spore inoculation into germination medium. A total of 2191 proteins were identified and categorized based on their expression kinetics. We observed four distinct clusters that were analyzed for functional categories and KEGG pathway annotations. The examination of newly synthesized proteins between successive time points revealed significant changes, particularly within the first 50 min. The dataset provides an information base that can be used for modeling purposes and to inspire the design of new experiments.
{"title":"Whole proteome analysis of germinating and outgrowing Bacillus subtilis 168","authors":"Jiří Pospíšil, Alice Sax, Martin Hubálek, Libor Krásný, Jiří Vohradský","doi":"10.1002/pmic.202400031","DOIUrl":"10.1002/pmic.202400031","url":null,"abstract":"<p>In this study, we present a high-resolution dataset and bioinformatic analysis of the proteome of <i>Bacillus subtilis</i> 168 trp+ (BSB1) during germination and spore outgrowth. Samples were collected at 14 different time points (ranging from 0 to 130 min) in three biological replicates after spore inoculation into germination medium. A total of 2191 proteins were identified and categorized based on their expression kinetics. We observed four distinct clusters that were analyzed for functional categories and KEGG pathways annotations. The examination of newly synthesized proteins between successive time points revealed significant changes, particularly within the first 50 min. The dataset provides an information base that can be used for modeling purposes and inspire the design of new experiments.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/pmic.202400031","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141750690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fei Fang, Guangyao Gao, Qianyi Wang, Qianjie Wang, Liangliang Sun
Mass spectrometry (MS)-based top-down proteomics (TDP) analysis of histone proteoforms provides critical information about combinatorial post-translational modifications (PTMs), which is vital for pursuing a better understanding of epigenetic regulation of gene expression. It requires high-resolution separations of histone proteoforms before MS and tandem MS (MS/MS) analysis. In this work, for the first time, we combined SDS-PAGE-based protein fractionation (passively eluting proteins from polyacrylamide gels as intact species for mass spectrometry, PEPPI-MS) with capillary zone electrophoresis (CZE)-MS/MS for high-resolution characterization of histone proteoforms. We systematically studied the histone proteoform extraction from SDS-PAGE gel and follow-up cleanup as well as CZE-MS/MS, to determine an optimal procedure. The optimal procedure showed reproducible and high-resolution separation and characterization of histone proteoforms. SDS-PAGE separated histone proteins (H1, H2, H3, and H4) based on their molecular weight and CZE provided additional separations of proteoforms of each histone protein based on their electrophoretic mobility, which was affected by PTMs, for example, acetylation and phosphorylation. Using the technique, we identified over 200 histone proteoforms from a commercial calf thymus histone sample with good reproducibility. The orthogonal and high-resolution separations of SDS-PAGE and CZE made our technique attractive for the delineation of histone proteoforms extracted from complex biological systems.
{"title":"Combining SDS-PAGE to capillary zone electrophoresis-tandem mass spectrometry for high-resolution top-down proteomics analysis of intact histone proteoforms","authors":"Fei Fang, Guangyao Gao, Qianyi Wang, Qianjie Wang, Liangliang Sun","doi":"10.1002/pmic.202300650","DOIUrl":"10.1002/pmic.202300650","url":null,"abstract":"<p>Mass spectrometry (MS)-based top-down proteomics (TDP) analysis of histone proteoforms provides critical information about combinatorial post-translational modifications (PTMs), which is vital for pursuing a better understanding of epigenetic regulation of gene expression. It requires high-resolution separations of histone proteoforms before MS and tandem MS (MS/MS) analysis. In this work, for the first time, we combined SDS-PAGE-based protein fractionation (passively eluting proteins from polyacrylamide gels as intact species for mass spectrometry, PEPPI-MS) with capillary zone electrophoresis (CZE)-MS/MS for high-resolution characterization of histone proteoforms. We systematically studied the histone proteoform extraction from SDS-PAGE gel and follow-up cleanup as well as CZE-MS/MS, to determine an optimal procedure. The optimal procedure showed reproducible and high-resolution separation and characterization of histone proteoforms. SDS-PAGE separated histone proteins (H1, H2, H3, and H4) based on their molecular weight and CZE provided additional separations of proteoforms of each histone protein based on their electrophoretic mobility, which was affected by PTMs, for example, acetylation and phosphorylation. Using the technique, we identified over 200 histone proteoforms from a commercial calf thymus histone sample with good reproducibility. 
The orthogonal and high-resolution separations of SDS-PAGE and CZE made our technique attractive for the delineation of histone proteoforms extracted from complex biological systems.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/pmic.202300650","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141632126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aaron O Bailey, Kenneth R Durbin, Matthew T Robey, Lee K Palmer, William K Russell
Liquid chromatography-mass spectrometry (LC-MS) intact mass analysis and LC-MS/MS peptide mapping are decisional assays for developing biological drugs and other commercial protein products. Certain PTM types, such as truncation and oxidation, increase the difficulty of precise proteoform characterization owing to inherent limitations in peptide and intact protein analyses. Top-down MS (TDMS) can resolve this ambiguity via fragmentation of specific proteoforms. We leveraged the strengths of flow-programmed (fp) denaturing online buffer exchange (dOBE) chromatography, including robust automation, relatively high ESI sensitivity, and long MS/MS window time, to support a TDMS platform for industrial protein characterization. We tested data-dependent (DDA) and targeted strategies using 14 different MS/MS scan types featuring combinations of collisional- and electron-based fragmentation as well as proton transfer charge reduction. This large, focused dataset was processed using a new software platform, named TDAcquireX, that improves proteoform characterization through TDMS data aggregation. A DDA-based workflow provided objective identification of αLac truncation proteoforms with a two-termini clipping search. A targeted TDMS workflow facilitated the characterization of αLac oxidation positional isomers. This strategy relied on using sliding window-based fragment ion deconvolution to generate composite proteoform spectral match (cPrSM) results amenable to fragment noise filtering, which is a fundamental enhancement relevant to TDMS applications generally.
{"title":"Filling the gaps in peptide maps with a platform assay for top-down characterization of purified protein samples.","authors":"Aaron O Bailey, Kenneth R Durbin, Matthew T Robey, Lee K Palmer, William K Russell","doi":"10.1002/pmic.202400036","DOIUrl":"10.1002/pmic.202400036","url":null,"abstract":"<p><p>Liquid chromatography-mass spectrometry (LC-MS) intact mass analysis and LC-MS/MS peptide mapping are decisional assays for developing biological drugs and other commercial protein products. Certain PTM types, such as truncation and oxidation, increase the difficulty of precise proteoform characterization owing to inherent limitations in peptide and intact protein analyses. Top-down MS (TDMS) can resolve this ambiguity via fragmentation of specific proteoforms. We leveraged the strengths of flow-programmed (fp) denaturing online buffer exchange (dOBE) chromatography, including robust automation, relatively high ESI sensitivity, and long MS/MS window time, to support a TDMS platform for industrial protein characterization. We tested data-dependent (DDA) and targeted strategies using 14 different MS/MS scan types featuring combinations of collisional- and electron-based fragmentation as well as proton transfer charge reduction. This large, focused dataset was processed using a new software platform, named TDAcquireX, that improves proteoform characterization through TDMS data aggregation. A DDA-based workflow provided objective identification of αLac truncation proteoforms with a two-termini clipping search. A targeted TDMS workflow facilitated the characterization of αLac oxidation positional isomers. 
This strategy relied on using sliding window-based fragment ion deconvolution to generate composite proteoform spectral match (cPrSM) results amenable to fragment noise filtering, which is a fundamental enhancement relevant to TDMS applications generally.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141615392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting protein function from protein sequence, structure, interaction, and other relevant information is important for generating hypotheses for biological experiments and studying biological systems, and therefore has been a major challenge in protein bioinformatics. Numerous computational methods have been developed over the last two decades to gradually advance protein function prediction. In recent years in particular, leveraging the revolutionary advances in artificial intelligence (AI), more and more deep learning methods have been developed to improve protein function prediction at a faster pace. Here, we provide an in-depth review of the recent developments of deep learning methods for protein function prediction. We summarize the significant advances in the field, identify several remaining major challenges to be tackled, and suggest some potential directions to explore. The data sources and evaluation metrics widely used in protein function prediction are also discussed to assist the machine learning, AI, and bioinformatics communities in developing more cutting-edge methods to advance protein function prediction.
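The abstract does not list the evaluation metrics it covers. As one illustration, a protein-centric maximum F-measure (Fmax) is commonly reported in CAFA-style benchmarks, and the sketch below shows how such a score can be computed. It is a simplified version under stated assumptions: the function name `fmax`, the input layout, and the example term identifiers are hypothetical, and no propagation of terms through an ontology hierarchy is performed.

```python
def fmax(predictions, truths, thresholds=None):
    """Protein-centric maximum F-measure over a grid of score thresholds.

    predictions: {protein_id: {term: score in [0, 1]}}
    truths:      {protein_id: set of true terms}
    Precision is averaged over proteins with at least one prediction at the
    threshold; recall is averaged over all proteins in `truths`.
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in truths.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, score in scored.items() if score >= t}
            true_positives = len(predicted & true_terms)
            if predicted:
                precisions.append(true_positives / len(predicted))
            recalls.append(true_positives / len(true_terms) if true_terms else 0.0)
        if not precisions:
            continue  # no protein has a prediction at this threshold
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Hypothetical toy input: one protein, two scored terms, one true term.
toy_predictions = {"P1": {"TERM:A": 0.9, "TERM:B": 0.2}}
toy_truths = {"P1": {"TERM:A"}}
print(fmax(toy_predictions, toy_truths))
```

Sweeping the threshold and keeping the best F-measure lets methods with differently calibrated scores be compared on the same scale.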
{"title":"Deep learning methods for protein function prediction.","authors":"Frimpong Boadu, Ahhyun Lee, Jianlin Cheng","doi":"10.1002/pmic.202300471","DOIUrl":"https://doi.org/10.1002/pmic.202300471","url":null,"abstract":"<p><p>Predicting protein function from protein sequence, structure, interaction, and other relevant information is important for generating hypotheses for biological experiments and studying biological systems, and therefore has been a major challenge in protein bioinformatics. Numerous computational methods had been developed to advance protein function prediction gradually in the last two decades. Particularly, in the recent years, leveraging the revolutionary advances in artificial intelligence (AI), more and more deep learning methods have been developed to improve protein function prediction at a faster pace. Here, we provide an in-depth review of the recent developments of deep learning methods for protein function prediction. We summarize the significant advances in the field, identify several remaining major challenges to be tackled, and suggest some potential directions to explore. The data sources and evaluation metrics widely used in protein function prediction are also discussed to assist the machine learning, AI, and bioinformatics communities to develop more cutting-edge methods to advance protein function prediction.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141597982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Colin W. Combe, Lars Kolbowski, Lutz Fischer, Ville Koskinen, Joshua Klein, Alexander Leitner, Andrew R. Jones, Juan Antonio Vizcaíno, Juri Rappsilber
The mzIdentML data format, originally developed by the Proteomics Standards Initiative in 2011, is the open XML data standard for peptide and protein identification results coming from mass spectrometry. We present mzIdentML version 1.3.0, which introduces new functionality and support for additional use cases. First of all, a new mechanism for encoding identifications based on multiple spectra has been introduced. Furthermore, the main mzIdentML specification document can now be supplemented by extension documents which provide further guidance for encoding specific use cases for different proteomics subfields. One extension document has been added, covering additional use cases for the encoding of crosslinked peptide identifications. The ability to add extension documents facilitates keeping the mzIdentML standard up to date with advances in the proteomics field, without having to change the main specification document. The crosslinking extension document provides further explanation of the crosslinking use cases already supported in mzIdentML version 1.2.0, and provides support for encoding additional scenarios that are critical to reflect developments in the crosslinking field and facilitate its integration in structural biology. These are: (i) support for cleavable crosslinkers, (ii) support for internally linked peptides, (iii) support for noncovalently associated peptides, and (iv) improved support for encoding scores and the corresponding thresholds.
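Because mzIdentML is an XML format, identification results can be read with a general-purpose XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a hand-written toy fragment. The element and attribute names (`SpectrumIdentificationResult`, `SpectrumIdentificationItem`, `passThreshold`, and so on) follow earlier mzIdentML releases as best understood here and should be checked against the official schema; the namespace URI in the fragment is an assumption, and the new multi-spectrum and crosslinking encodings of version 1.3.0 are not shown.

```python
import xml.etree.ElementTree as ET

# Toy fragment written for illustration only; it is not a complete or
# schema-valid mzIdentML file, and the namespace URI is an assumption.
TOY_MZID = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.2">
  <DataCollection><AnalysisData><SpectrumIdentificationList id="SIL_1">
    <SpectrumIdentificationResult id="SIR_1" spectrumID="index=0" spectraData_ref="SD_1">
      <SpectrumIdentificationItem id="SII_1" peptide_ref="pep_A" chargeState="3"
          experimentalMassToCharge="512.27" calculatedMassToCharge="512.26"
          rank="1" passThreshold="true"/>
      <SpectrumIdentificationItem id="SII_2" peptide_ref="pep_B" chargeState="3"
          experimentalMassToCharge="512.27" calculatedMassToCharge="512.31"
          rank="2" passThreshold="false"/>
    </SpectrumIdentificationResult>
  </SpectrumIdentificationList></AnalysisData></DataCollection>
</MzIdentML>"""


def local_name(tag):
    """Strip the '{namespace}' prefix so the code does not depend on the version URI."""
    return tag.rsplit("}", 1)[-1]


def passing_identifications(xml_text):
    """Return (spectrumID, peptide_ref, rank) for items flagged passThreshold='true'."""
    root = ET.fromstring(xml_text)
    hits = []
    for result in root.iter():
        if local_name(result.tag) != "SpectrumIdentificationResult":
            continue
        for item in result:
            if (local_name(item.tag) == "SpectrumIdentificationItem"
                    and item.get("passThreshold") == "true"):
                hits.append((result.get("spectrumID"),
                             item.get("peptide_ref"),
                             int(item.get("rank"))))
    return hits


for spectrum_id, peptide_ref, rank in passing_identifications(TOY_MZID):
    print(spectrum_id, peptide_ref, rank)
```

Matching on local element names rather than a fixed namespace keeps the sketch usable across format versions, at the cost of not validating which version a file claims to follow.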
{"title":"mzIdentML 1.3.0 – Essential progress on the support of crosslinking and other identifications based on multiple spectra","authors":"Colin W. Combe, Lars Kolbowski, Lutz Fischer, Ville Koskinen, Joshua Klein, Alexander Leitner, Andrew R. Jones, Juan Antonio Vizcaíno, Juri Rappsilber","doi":"10.1002/pmic.202300385","DOIUrl":"10.1002/pmic.202300385","url":null,"abstract":"<p>The mzIdentML data format, originally developed by the Proteomics Standards Initiative in 2011, is the open XML data standard for peptide and protein identification results coming from mass spectrometry. We present mzIdentML version 1.3.0, which introduces new functionality and support for additional use cases. First of all, a new mechanism for encoding identifications based on multiple spectra has been introduced. Furthermore, the main mzIdentML specification document can now be supplemented by extension documents which provide further guidance for encoding specific use cases for different proteomics subfields. One extension document has been added, covering additional use cases for the encoding of crosslinked peptide identifications. The ability to add extension documents facilitates keeping the mzIdentML standard up to date with advances in the proteomics field, without having to change the main specification document. The crosslinking extension document provides further explanation of the crosslinking use cases already supported in mzIdentML version 1.2.0, and provides support for encoding additional scenarios that are critical to reflect developments in the crosslinking field and facilitate its integration in structural biology. 
These are: (i) support for cleavable crosslinkers, (ii) support for internally linked peptides, (iii) support for noncovalently associated peptides, and (iv) improved support for encoding scores and the corresponding thresholds.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/pmic.202300385","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141597983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}