Prediction of Activation Energies of Organic Molecules With at Most Seven Non-Hydrogen Atoms Using Quantum-Chemically Assisted ML
K. G. Kalamatianos, Olga N. Flenga
Journal of Computational Chemistry, published 2025-03-20
DOI: 10.1002/jcc.70083 (https://doi.org/10.1002/jcc.70083)
Abstract
In this study, a hybrid machine learning (ML) approach is presented for accurately predicting activation energies (Ea) of gas-phase elementary reactions involving organic compounds with up to seven non-hydrogen atoms. Given the importance of activation energies in reaction studies and modeling, composite ML models were created that integrate molecular descriptors with semi-empirical and single-point energy density functional theory (DFT) calculations. The dataset of 300 randomly selected elementary gas-phase reactions was assembled from accurate DFT (ωB97X-D3/def2-TZVP) activation energies taken from a database, alongside semi-empirical computations. Accurate predictions required including both physical organic and geometric/empirical descriptors in the training procedure. The two best ML models predicted Ea efficiently, achieving a mean absolute error (MAE) of 1.314 kcal mol⁻¹ and R² of 0.992 (Model 3) and an MAE of 1.949 kcal mol⁻¹ and R² of 0.979 (Model 2) in validation tests. Notably, this performance approaches the "chemical accuracy" threshold of 1 kcal mol⁻¹. The robustness of Model 3 was tested across the reaction types present in the dataset, demonstrating its ability to predict activation energies reliably, which is critical for the study and optimization of chemical processes.
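For readers unfamiliar with the two validation metrics quoted in the abstract, the following minimal sketch shows how MAE and R² are conventionally computed from reference and predicted activation energies. The function names and the numerical values are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between reference and predicted values."""
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical activation energies in kcal/mol (NOT taken from the paper).
reference = [10.0, 20.0, 30.0, 40.0]   # e.g. DFT reference values
predicted = [11.5, 19.0, 30.5, 39.0]   # e.g. ML model predictions

mae = mean_absolute_error(reference, predicted)
r2 = r_squared(reference, predicted)

print(f"MAE = {mae:.3f} kcal/mol, R^2 = {r2:.3f}")
```

An MAE at or below 1 kcal mol⁻¹ would correspond to the "chemical accuracy" threshold mentioned in the abstract.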
About the journal:
This distinguished journal publishes articles concerned with all aspects of computational chemistry: analytical, biological, inorganic, organic, physical, and materials. The Journal of Computational Chemistry presents original research, contemporary developments in theory and methodology, and state-of-the-art applications. Computational areas that are featured in the journal include ab initio and semiempirical quantum mechanics, density functional theory, molecular mechanics, molecular dynamics, statistical mechanics, cheminformatics, biomolecular structure prediction, molecular design, and bioinformatics.