Alberto Marchetto, Monica Tirapelle, Luca Mazzei, Eva Sorensen, Maximilian O. Besenhard
In Silico High-Performance Liquid Chromatography Method Development via Machine Learning
Analytical Chemistry (JCR Q1, Chemistry, Analytical; impact factor 6.7). Journal article, published 2025-03-28. DOI: 10.1021/acs.analchem.4c03466 (https://doi.org/10.1021/acs.analchem.4c03466)
Citations: 0
Abstract
High-performance liquid chromatography (HPLC) remains the gold standard for analyzing and purifying molecular components in solutions. However, developing HPLC methods is material- and time-consuming, so computer-aided shortcuts are highly desirable. In line with the digitalization of process development and the growth of HPLC databases, we propose a data-driven methodology to predict molecule retention factors as a function of mobile phase composition without the need for any new experiments, relying solely on molecular descriptors (MDs) obtained via simplified molecular input line entry system (SMILES) string representations of molecules. This new approach combines (a) quantitative structure–property relationships (QSPR), which use MDs to predict the solute-dependent parameters in (b) linear solvation energy relationships (LSER) and (c) linear solvent strength (LSS) theory. We demonstrate the potential of this computational methodology using experimental retention factors of small molecules made available by the research community; the MDs for these molecules were obtained from SMILES strings derived from their structural formulas. The method can be adopted directly to predict elution times of molecular components; in combination with first-principles-based mechanistic transport models, it can also be employed to optimize HPLC methods in silico. Both options can significantly reduce the experimental load and accelerate HPLC method development, lowering the time and cost of the drug manufacturing cycle and reducing the time to market. Given the growing number and quality of HPLC databases, the predictive power of this methodology will only increase in the coming years.
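To illustrate the final step of such a pipeline, the sketch below evaluates the standard LSS relation, log10 k = log10 k_w - S·φ, and converts the retention factor to an isocratic retention time via t_R = t_0(1 + k). It is a minimal sketch and not the authors' implementation: the function names and the numerical values of `LOG_KW`, `S_SLOPE`, and `T0_MIN` are hypothetical placeholders. In the approach described in the abstract, the solute-dependent parameters would come from the QSPR models rather than being typed in by hand.

```python
def lss_retention_factor(log_kw: float, s_slope: float, phi: float) -> float:
    """Retention factor k from linear solvent strength (LSS) theory.

    log10 k = log10 k_w - S * phi, where phi is the volume fraction of
    organic modifier in the mobile phase (0 to 1).
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must be a volume fraction between 0 and 1")
    return 10.0 ** (log_kw - s_slope * phi)


def retention_time(k: float, t0: float) -> float:
    """Isocratic retention time from the definition k = (t_R - t_0) / t_0."""
    return t0 * (1.0 + k)


# Hypothetical solute parameters and column dead time (assumptions for
# illustration only; not taken from the paper).
LOG_KW = 2.0   # log10 of the retention factor extrapolated to pure water
S_SLOPE = 4.0  # solvent strength parameter S
T0_MIN = 1.0   # column dead time in minutes

if __name__ == "__main__":
    for phi in (0.25, 0.50, 0.75):
        k = lss_retention_factor(LOG_KW, S_SLOPE, phi)
        t_r = retention_time(k, T0_MIN)
        print(f"phi = {phi:.2f}  k = {k:.3g}  t_R = {t_r:.3g} min")
```

Scanning φ in this way shows how a predicted (log k_w, S) pair translates into retention as the mobile phase composition changes, which is the quantity a method developer would otherwise measure experimentally.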
About the journal:
Analytical Chemistry, a peer-reviewed research journal, focuses on disseminating new and original knowledge across all branches of analytical chemistry. Fundamental articles may explore general principles of chemical measurement science and need not directly address existing or potential analytical methodology. They can be entirely theoretical or report experimental results. Contributions may cover various phases of analytical operations, including sampling, bioanalysis, electrochemistry, mass spectrometry, microscale and nanoscale systems, environmental analysis, separations, spectroscopy, chemical reactions and selectivity, instrumentation, imaging, surface analysis, and data processing. Papers discussing known analytical methods should present a significant, original application of the method, a notable improvement, or results on an important analyte.