Peter Lasch, Wolfgang Beyer, Alejandra Bosch, Rainer Borriss, Michal Drevinek, Susann Dupke, Monika Ehling-Schulz, Xuewen Gao, Roland Grunow, Daniela Jacob, Silke R Klee, Armand Paauw, Jörg Rau, Andy Schneider, Holger C Scholz, Maren Stämmler, Le Thi Thanh Tam, Herbert Tomaso, Guido Werner, Joerg Doellinger
{"title":"A MALDI-ToF mass spectrometry database for identification and classification of highly pathogenic bacteria.","authors":"Peter Lasch, Wolfgang Beyer, Alejandra Bosch, Rainer Borriss, Michal Drevinek, Susann Dupke, Monika Ehling-Schulz, Xuewen Gao, Roland Grunow, Daniela Jacob, Silke R Klee, Armand Paauw, Jörg Rau, Andy Schneider, Holger C Scholz, Maren Stämmler, Le Thi Thanh Tam, Herbert Tomaso, Guido Werner, Joerg Doellinger","doi":"10.1038/s41597-025-04504-z","DOIUrl":null,"url":null,"abstract":"<p><p>Today, MALDI-ToF MS is an established technique to characterize and identify pathogenic bacteria. The technique is increasingly applied by clinical microbiological laboratories that use commercially available complete solutions, including spectra databases covering clinically relevant bacteria. Such databases are validated for clinical, or research applications, but are often less comprehensive concerning highly pathogenic bacteria (HPB). To improve MALDI-ToF MS diagnostics of HPB we initiated a program to develop protocols for reliable and MALDI-compatible microbial inactivation and to acquire mass spectra thereof many years ago. As a result of this project, databases covering HPB, closely related bacteria, and bacteria of clinical relevance have been made publicly available on platforms such as ZENODO. This publication in detail describes the most recent version of this database. The dataset contains a total of 11,055 spectra from altogether 1,601 microbial strains and 264 species and is primarily intended to improve the diagnosis of HPB. We hope that our MALDI-ToF MS data may also be a valuable resource for developing machine learning-based bacterial identification and classification methods.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":"12 1","pages":"187"},"PeriodicalIF":5.8000,"publicationDate":"2025-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11785946/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scientific Data","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1038/s41597-025-04504-z","RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
引用次数: 0
Abstract
Today, MALDI-ToF MS is an established technique to characterize and identify pathogenic bacteria. The technique is increasingly applied by clinical microbiological laboratories that use commercially available complete solutions, including spectra databases covering clinically relevant bacteria. Such databases are validated for clinical, or research applications, but are often less comprehensive concerning highly pathogenic bacteria (HPB). To improve MALDI-ToF MS diagnostics of HPB we initiated a program to develop protocols for reliable and MALDI-compatible microbial inactivation and to acquire mass spectra thereof many years ago. As a result of this project, databases covering HPB, closely related bacteria, and bacteria of clinical relevance have been made publicly available on platforms such as ZENODO. This publication in detail describes the most recent version of this database. The dataset contains a total of 11,055 spectra from altogether 1,601 microbial strains and 264 species and is primarily intended to improve the diagnosis of HPB. We hope that our MALDI-ToF MS data may also be a valuable resource for developing machine learning-based bacterial identification and classification methods.
期刊介绍:
Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data.
The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than testing hypotheses or presenting new interpretations, methods, or in-depth analyses.