Leveraging Pretrained Deep Protein Language Model to Predict Peptide Collision Cross Section
Authors: Ayano Nakai-Kasai, Kosuke Ogata, Yasushi Ishihama, Toshiyuki Tanaka
bioRxiv - Bioinformatics, published 2024-09-14
DOI: https://doi.org/10.1101/2024.09.11.612388
Abstract
The collision cross section (CCS) of peptide ions provides an important separation dimension in liquid chromatography/tandem mass spectrometry-based proteomics that incorporates ion mobility spectrometry (IMS), and its accurate prediction is the basis for advanced proteomics workflows. This paper presents new experimental data and a new prediction model for challenging CCS prediction tasks, including longer peptides, which tend to have higher charge states. The proposed model is built on a pretrained deep protein language model. Whereas the conventional prediction model must be trained from scratch, the proposed model uses the pretrained model as a feature extractor and therefore needs less training time. Experiments on the new data show that the proposed model drastically reduces training time while matching or exceeding the prediction performance of the conventional method. Our approach opens the possibility of predicting various peptide properties in proteomic liquid chromatography/tandem mass spectrometry experiments in a greener manner.
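The feature-extractor idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model. Everything here is an assumption made for illustration: a fixed random embedding table (`FROZEN_EMBEDDINGS`) stands in for the pretrained protein language model, the features are mean-pooled embeddings plus charge and length, the trainable part is a closed-form ridge regression head, and the targets are synthetic numbers rather than measured CCS values.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
EMBED_DIM = 16

# ASSUMPTION: stand-in for a pretrained protein language model. A real workflow
# would load pretrained weights; here a fixed random table plays that role.
FROZEN_EMBEDDINGS = np.random.default_rng(0).normal(size=(len(AMINO_ACIDS), EMBED_DIM))
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def extract_features(sequence, charge):
    """Mean-pool the frozen residue embeddings, then append charge, length, and a bias term."""
    idx = [AA_INDEX[aa] for aa in sequence]
    pooled = FROZEN_EMBEDDINGS[idx].mean(axis=0)
    return np.concatenate([pooled, [charge, len(sequence), 1.0]])


def fit_ridge_head(features, targets, l2=1e-3):
    """Closed-form ridge regression. These weights are the only trained parameters."""
    dim = features.shape[1]
    gram = features.T @ features + l2 * np.eye(dim)
    return np.linalg.solve(gram, features.T @ targets)


def predict(features, weights):
    return features @ weights


def make_toy_dataset(n, rng):
    """Random peptides with SYNTHETIC targets (hypothetical, not a physical CCS model)."""
    rows = []
    for _ in range(n):
        length = int(rng.integers(7, 40))
        sequence = "".join(rng.choice(list(AMINO_ACIDS), size=length))
        charge = int(rng.integers(2, 5))
        rows.append(extract_features(sequence, charge))
    X = np.stack(rows)
    charge_col, length_col = X[:, -3], X[:, -2]
    y = 300.0 + 40.0 * charge_col + 8.0 * length_col + rng.normal(scale=1.0, size=n)
    return X, y


rng = np.random.default_rng(1)
X_train, y_train = make_toy_dataset(200, rng)
X_test, y_test = make_toy_dataset(50, rng)

# The extractor is never updated; only the small regression head is fitted.
weights = fit_ridge_head(X_train, y_train)
test_mae = float(np.mean(np.abs(predict(X_test, weights) - y_test)))
```

Because the extractor stays frozen, fitting reduces to solving one small linear system, which is the kind of training-time saving the abstract attributes to reusing a pretrained model. The regression head actually used in the paper may differ.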