Cailum M. K. Stienstra, Emir Nazdrajić, W. Scott Hopkins
Analytical Chemistry (journal article), published 2025-02-19. DOI: 10.1021/acs.analchem.4c05859 (https://doi.org/10.1021/acs.analchem.4c05859)
From Reverse Phase Chromatography to HILIC: Graph Transformers Power Method-Independent Machine Learning of Retention Times
Liquid chromatography (LC) is a cornerstone of analytical separations, but comparing retention times (RTs) across different LC methods is challenging because experimental parameters such as column type and solvent gradient vary from method to method. Nevertheless, RTs are powerful metrics in tandem mass spectrometry (MS²) that can reduce false positive rates for metabolite annotation, differentiate isobaric species, and improve peptide identification. Here, we present Graphormer-RT, a novel graph transformer that performs the first single-model, method-independent prediction of RTs. We use the RepoRT data set, which contains 142,688 reverse phase (RP) RTs (from 191 methods) and 4,373 HILIC RTs (from 49 methods). Our best RP model (trained and tested on 191 methods) achieved a test set mean absolute error (MAE) of 29.3 ± 0.6 s, comparable to the performance of the state-of-the-art model, which was trained on only a single LC method. Our best-performing HILIC model achieved a test MAE of 42.4 ± 2.9 s. We expect that Graphormer-RT can be used as an LC “foundation model”, where transfer learning can reduce the amount of training data needed for highly accurate “specialist” models applied to method-specific RP and HILIC tasks. These frameworks could enable the machine optimization of automated LC workflows, improved filtering of candidate structures using predicted RTs, and the in silico annotation of unknown analytes in LC-MS² measurements.
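As a minimal sketch of two ideas named in the abstract, the Python below computes an MAE in seconds between measured and predicted RTs and filters candidate structures by their predicted RT. It is not the authors' code: the function names, the tolerance value, and all numbers are hypothetical, chosen only for illustration.

```python
from statistics import mean


def mean_absolute_error(measured, predicted):
    """Mean absolute error between measured and predicted RTs, in seconds."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean([abs(m - p) for m, p in zip(measured, predicted)])


def filter_candidates(candidates, measured_rt, tolerance_s):
    """Keep candidates whose predicted RT is within tolerance_s of the measured RT.

    `candidates` is a list of (name, predicted_rt_in_seconds) pairs.
    """
    return [
        name
        for name, predicted_rt in candidates
        if abs(predicted_rt - measured_rt) <= tolerance_s
    ]


# Hypothetical values for illustration only (not taken from the paper).
measured = [120.0, 305.0, 610.0]
predicted = [130.0, 290.0, 615.0]
mae = mean_absolute_error(measured, predicted)  # (10 + 15 + 5) / 3

# Hypothetical candidate structures for one peak measured at 305 s.
candidates = [("candidate_A", 298.0), ("candidate_B", 450.0)]
kept = filter_candidates(candidates, measured_rt=305.0, tolerance_s=30.0)
```

In practice, a tolerance would presumably be chosen relative to the model's test MAE for the LC method in question; the 30 s used here is an arbitrary placeholder.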
About the journal:
Analytical Chemistry, a peer-reviewed research journal, focuses on disseminating new and original knowledge across all branches of analytical chemistry. Fundamental articles may explore general principles of chemical measurement science and need not directly address existing or potential analytical methodology. They can be entirely theoretical or report experimental results. Contributions may cover various phases of analytical operations, including sampling, bioanalysis, electrochemistry, mass spectrometry, microscale and nanoscale systems, environmental analysis, separations, spectroscopy, chemical reactions and selectivity, instrumentation, imaging, surface analysis, and data processing. Papers discussing known analytical methods should present a significant, original application of the method, a notable improvement, or results on an important analyte.