Toward Automated Preprocessing of Untargeted LC-MS-Based Metabolomics Feature Lists from Human Biofluids

Analytical Chemistry (IF 6.7; CAS Zone 1, Chemistry; JCR Q1, Chemistry, Analytical) Pub Date: 2025-01-06 DOI: 10.1021/acs.analchem.4c03124

Amy Hughes, Pablo Vangeenderhuysen, Marilyn De Graeve, Beata Pomian, Tim S. Nawrot, Jeroen Raes, Simon J. S. Cameron, Lynn Vanhaecke


Abstract

Maximizing the extraction of true, high-quality, nonredundant features from biofluids analyzed via LC-MS systems is challenging. Here, the R packages IPO and AutoTuner were used to optimize XCMS parameter settings for the retrieval of metabolite or lipid features in both ionization modes from either faecal or urine samples from two cohorts (n = 621). The resulting feature lists were compared with those obtained using manually selected parameter values. Feature lists were compared in three categories: 1) feature quality, assessed by removing false positives; 2) tentative metabolite identification using the Human Metabolome Database (HMDB); and 3) feature utility, for example the proportion of features within intensity threshold bins. A PCA-based approach to feature filtering using QC samples and variable loadings was also explored under this last category. Overall, automated selection of parameter values yielded more features for all data sets (1.3- to 3.7-fold), and this difference propagated through the comparative exercises. For example, a greater proportion of features (on average 51% vs 45%) had a coefficient of variation (CV) < 30%. Additionally, there was a significant increase (7.6–10.4%) in the number of faecal metabolites that could be tentatively annotated, and more features were present in higher intensity threshold bins. Considering the overlap across all three categories, a greater number of features were also retained. Automated approaches that guide the selection of optimal parameter values for preprocessing are important for reducing the time invested in this step while taking advantage of the wealth of data that LC-MS systems provide.
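The CV < 30% criterion mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which is R-based (XCMS, IPO, AutoTuner). It assumes the CV is computed per feature across repeated QC injections, which the abstract does not state explicitly. The function names, the feature IDs, and the toy intensities are all hypothetical.

```python
from statistics import mean, stdev


def qc_cv_percent(intensities):
    """Coefficient of variation (%) of one feature across QC injections.

    Uses the sample standard deviation. A feature with zero mean
    intensity gets an infinite CV so that it never passes the filter.
    """
    m = mean(intensities)
    if m == 0:
        return float("inf")
    return 100.0 * stdev(intensities) / m


def filter_by_cv(feature_table, max_cv=30.0):
    """Keep only features whose QC CV is below max_cv percent."""
    return {
        feature_id: values
        for feature_id, values in feature_table.items()
        if qc_cv_percent(values) < max_cv
    }


def proportion_retained(feature_table, max_cv=30.0):
    """Fraction of features that pass the CV filter."""
    return len(filter_by_cv(feature_table, max_cv)) / len(feature_table)


# Hypothetical toy data: feature ID -> intensities in four QC injections.
qc_table = {
    "F001": [1000.0, 1050.0, 980.0, 1020.0],   # stable signal
    "F002": [500.0, 900.0, 200.0, 650.0],      # highly variable signal
    "F003": [2000.0, 2100.0, 1900.0, 2050.0],  # stable signal
    "F004": [0.0, 0.0, 0.0, 0.0],              # never detected
}

if __name__ == "__main__":
    kept = filter_by_cv(qc_table)
    print(sorted(kept), proportion_retained(qc_table))
```

Comparing `proportion_retained` between two feature lists, one from manually chosen and one from automatically optimized parameters, mirrors the kind of comparison summarized above as 51% vs 45%, under the assumptions stated here.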

