LineageFilter: Improved Proteotyping of Complex Samples Using Metaproteomics and Machine Learning
Hamid Hachemi, Jean Armengaud, Lucia Grenga*, and Olivier Pible
DOI: 10.1021/acs.jproteome.4c00184 (https://pubs.acs.org/doi/10.1021/acs.jproteome.4c00184)
Published: 2024-10-19
Abstract
Metaproteomics is a powerful tool for characterizing how microbiota function by analyzing their protein content with tandem mass spectrometry. Given the complexity of these samples, accurately assessing their taxonomic composition from peptide sequences alone, without prior information, remains a challenge. Here, we present LineageFilter, a new Python-based AI tool for refined proteotyping of complex samples that applies machine learning to interpreted metaproteomics data. Given a tentative list of taxa, their abundances, and the scores associated with their identified peptides, LineageFilter computes a comprehensive set of features for each identified taxon at all taxonomic ranks. Its machine-learning model then assesses the likelihood of each taxon’s presence based on these features, enabling improved proteotyping and sample-specific database construction.
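
The workflow described in the abstract (candidate taxa with abundances and peptide scores go in, per-taxon features are computed, and a model scores the likelihood of presence) can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `TaxonEvidence` record, the four features, the logistic scoring function, and the weights are hypothetical and are not taken from LineageFilter, whose actual feature set and model are not specified here.

```python
import math
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaxonEvidence:
    """Hypothetical input record for one candidate taxon (not LineageFilter's schema)."""
    name: str
    rank: str                    # e.g. "genus" or "species"
    abundance: float             # relative abundance estimate in [0, 1]
    peptide_scores: list[float]  # scores of the peptides assigned to this taxon


def compute_features(taxon: TaxonEvidence) -> dict[str, float]:
    """Derive a few illustrative features from the evidence for one taxon."""
    scores = taxon.peptide_scores
    return {
        "n_peptides": float(len(scores)),
        "mean_score": mean(scores) if scores else 0.0,
        "max_score": max(scores, default=0.0),
        "abundance": taxon.abundance,
    }


# Made-up weights standing in for a trained model (assumption, not fitted values).
WEIGHTS = {"n_peptides": 0.3, "mean_score": 0.05, "max_score": 0.02, "abundance": 4.0}
BIAS = -5.0


def presence_probability(features: dict[str, float]) -> float:
    """Logistic score in (0, 1); higher means the taxon is more likely present."""
    z = BIAS + sum(WEIGHTS[key] * value for key, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def filter_taxa(candidates, threshold=0.5):
    """Keep (name, probability) for candidates that reach the threshold."""
    kept = []
    for taxon in candidates:
        p = presence_probability(compute_features(taxon))
        if p >= threshold:
            kept.append((taxon.name, p))
    return kept
```

In this toy setting, a taxon supported by many high-scoring peptides and a substantial abundance passes the filter, while one supported by a single low-scoring peptide does not. The retained names could then seed a sample-specific database, which is the use the abstract mentions.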