Ready-to-use Models Built Using a Diverse Set of 266 Aroma Compounds for the Estimation of Gas Chromatographic Retention Indices for the 50%-Cyanopropylphenyl-50%-Dimethylpolysiloxane Stationary Phase
Authors: Anastasia Yu. Sholokhova, Dmitriy D. Matyushin
Journal: Journal of Separation Science, vol. 47, issue 21 (published 2024-11-04)
DOI: 10.1002/jssc.70016
URL: https://onlinelibrary.wiley.com/doi/10.1002/jssc.70016
Abstract
Retention index prediction from molecular structure is rarely used in practice because of low accuracy, the need for paid software to calculate molecular descriptors (MD), and the narrow applicability domain of many models. In recent years, relatively accurate and versatile deep learning (DL)-based models have emerged, and they are now used in practice as an additional criterion in gas chromatography-mass spectrometry identification. The DB-225ms stationary phase (usually described as 50%-cyanopropylphenyl-50%-dimethylpolysiloxane in available sources) is widely used, but no ready-to-use retention index estimation models are available for it. This study presents such models. The models are linear and use two kinds of MD: simple constitutional descriptors and retention indices predicted by DL for the DB-WAX and DB-624 stationary phases (we show that using these predicted indices is what makes satisfactory accuracy achievable). On a completely unseen hold-out test set, the models achieve a root mean square error of 73.2, a mean absolute error of 45.7, and a median absolute error of 22.0. The models were trained on a retention data set of 266 volatile compounds. All calculations can be performed with the convenient open-source software CHERESHNYA. The final equations are implemented as a spreadsheet and a code snippet and are available online: https://doi.org/10.6084/m9.figshare.26800789.
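As a rough illustration of the model form the abstract describes, the Python sketch below evaluates a linear equation over DL-predicted retention indices and one constitutional descriptor, then computes the three reported error metrics. The coefficient values, the descriptor name `n_oxygen`, and the function names are hypothetical placeholders, not the published equations; the actual coefficients are in the spreadsheet and code snippet at the figshare DOI.

```python
import math
from statistics import mean, median

# HYPOTHETICAL coefficients, for illustration only. The published values
# are in the authors' spreadsheet and code snippet, not reproduced here.
COEFS = {"ri_wax": 0.5, "ri_624": 0.5, "n_oxygen": 8.0}
INTERCEPT = 40.0


def estimate_ri(features):
    """Linear model: intercept plus the weighted sum of the descriptors.

    `features` maps each descriptor name in COEFS to its value, e.g. the
    DL-predicted retention indices for DB-WAX and DB-624 and a simple
    constitutional count (an assumed example descriptor).
    """
    return INTERCEPT + sum(COEFS[name] * features[name] for name in COEFS)


def error_metrics(observed, predicted):
    """Return (RMSE, mean absolute error, median absolute error)."""
    errors = [p - o for o, p in zip(observed, predicted)]
    abs_errors = [abs(e) for e in errors]
    rmse = math.sqrt(mean(e * e for e in errors))
    return rmse, mean(abs_errors), median(abs_errors)


# Example with made-up inputs (not data from the study).
example = {"ri_wax": 1600.0, "ri_624": 1000.0, "n_oxygen": 1}
estimated = estimate_ri(example)

rmse, mae, mdae = error_metrics([1000, 1100, 1200], [1010, 1080, 1200])
```

Keeping the coefficients in a dictionary keyed by descriptor name lets the same function evaluate any linear equation of this kind once the real coefficients are substituted.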
About the journal:
The Journal of Separation Science (JSS) is the most comprehensive source in separation science, since it covers all areas of chromatographic and electrophoretic separation methods in theory and practice, both in the analytical and in the preparative mode, solid phase extraction, sample preparation, and related techniques. Manuscripts on methodological or instrumental developments, including detection aspects, in particular mass spectrometry, as well as on innovative applications will also be published. Manuscripts on hyphenation, automation, and miniaturization are particularly welcome. Pre- and post-separation facets of a total analysis may be covered as well as the underlying logic of the development or application of a method.