Ionmob: a Python package for prediction of peptide collisional cross-section values.

IF 4.4 | Zone 3 (Biology) | JCR Q1, Biochemical Research Methods | Bioinformatics | Pub Date: 2023-09-02 | DOI: 10.1093/bioinformatics/btad486
David Teschner, David Gomez-Zepeda, Arthur Declercq, Mateusz K Łącki, Seymen Avci, Konstantin Bob, Ute Distler, Thomas Michna, Lennart Martens, Stefan Tenzer, Andreas Hildebrandt

Abstract

Motivation: Including ion mobility separation (IMS) in mass spectrometry-based proteomics experiments improves coverage and throughput. Many IMS devices make it possible to link the experimentally derived mobility of an ion to its collisional cross-section (CCS), a highly reproducible physicochemical property that depends on the ion's mass, charge and conformation in the gas phase. Thus, known peptide ion mobilities can be used to tailor acquisition methods or to refine database search results. The large space of potential peptide sequences, further enlarged by post-translational modifications of amino acids, motivates an in silico predictor for peptide CCS. Recent studies explored the general performance of various machine-learning techniques; however, workflow engineering was of secondary importance in that work. To be applicable in practice, such a tool should be generic, data driven, and easy to adapt to individual workflows for experimental design and data processing.
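The link between measured mobility and CCS mentioned above is commonly expressed through the Mason-Schamp equation. The following is a minimal sketch of that conversion, not code taken from ionmob. The function names, the nitrogen drift gas, and the default temperature of 305 K are assumptions made for illustration, and the ion mass is approximated as m/z times charge.

```python
import math

# Physical constants in SI units (CODATA values).
ELEMENTARY_CHARGE = 1.602176634e-19  # C
BOLTZMANN = 1.380649e-23             # J/K
DALTON = 1.66053906660e-27           # kg
LOSCHMIDT = 2.686780111e25           # m^-3, gas number density at 273.15 K, 101.325 kPa

# Assumed conditions, NOT stated in the abstract.
N2_MASS_DA = 28.0134      # approximate average mass of the N2 drift gas
DEFAULT_TEMP_K = 305.0    # assumed drift gas temperature


def mason_schamp_prefactor() -> float:
    """Constant c in CCS[A^2] = c * z / (sqrt(mu[Da] * T[K]) * K0[cm^2/(V s)])."""
    si = (3.0 * ELEMENTARY_CHARGE / (16.0 * LOSCHMIDT)) * math.sqrt(
        2.0 * math.pi / (BOLTZMANN * DALTON)
    )
    # Convert K0 from m^2/(V s) to cm^2/(V s) (factor 1e4)
    # and the resulting area from m^2 to square angstrom (factor 1e20).
    return si * 1e4 * 1e20


def inverse_mobility_to_ccs(
    inv_k0: float,
    mz: float,
    charge: int,
    temperature_k: float = DEFAULT_TEMP_K,
    gas_mass_da: float = N2_MASS_DA,
) -> float:
    """Convert inverse reduced mobility 1/K0 (V s/cm^2) to CCS (square angstrom)."""
    ion_mass = mz * charge  # approximation: includes the charge-carrying protons
    reduced_mass = ion_mass * gas_mass_da / (ion_mass + gas_mass_da)
    return (
        mason_schamp_prefactor() * charge * inv_k0
        / math.sqrt(reduced_mass * temperature_k)
    )


if __name__ == "__main__":
    # Hypothetical doubly charged peptide ion at m/z 500 with 1/K0 = 0.9 V s/cm^2.
    print(round(inverse_mobility_to_ccs(0.9, 500.0, 2), 1))
```

Because the charge and the reduced mass both enter the formula, two ions with the same 1/K0 but different charge states map to clearly different CCS values, which is one reason predictors are usually given the charge state alongside the sequence.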

Results: We created ionmob, a Python-based framework for data preparation, training, and prediction of peptide collisional cross-section values. It is easily customizable and includes a set of pretrained, ready-to-use models as well as preprocessing routines for training and inference. Using a set of ≈21 000 unique phosphorylated peptides and ≈17 000 MHC ligand sequence and charge state pairs, we expand the space of peptides that can be integrated into CCS prediction. Lastly, we investigate whether in silico predicted CCS values can increase confidence in identified peptides when used in re-scoring, and demonstrate that predicted CCS values complement existing predictors for that task.
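One plausible way predicted CCS values can enter a re-scoring step is as error features that compare the observed CCS of a peptide-spectrum match with the predicted one. The sketch below illustrates that idea only; it is not the authors' re-scoring procedure, and the `PeptideMatch` class, the `ccs_features` function, and the feature names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PeptideMatch:
    """Hypothetical container for one peptide-spectrum match."""
    sequence: str
    charge: int
    observed_ccs: float   # square angstrom, derived from the measurement
    predicted_ccs: float  # square angstrom, from any CCS predictor


def ccs_features(match: PeptideMatch) -> Dict[str, float]:
    """Derive simple CCS agreement features for a re-scoring model."""
    delta = match.observed_ccs - match.predicted_ccs
    return {
        "ccs_error": delta,                                 # signed deviation
        "abs_ccs_error": abs(delta),                        # magnitude only
        "rel_ccs_error": abs(delta) / match.predicted_ccs,  # scale-free
    }


if __name__ == "__main__":
    # Illustrative values, not measured data.
    match = PeptideMatch("PEPTIDEK", 2, observed_ccs=404.0, predicted_ccs=400.0)
    print(ccs_features(match))
```

Features of this kind would be appended to the score vector that a re-scoring tool already uses, so that a large disagreement between observed and predicted CCS can lower confidence in a match.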

Availability and implementation: The Python package is available on GitHub at https://github.com/theGreatHerrLebert/ionmob.

