A boiling point prediction method based on machine learning for potential insulating gases
Wei Liu, Junwei Zha, Mengxuan Ling, Dan Li, Kaidong Shen, Longjiu Cheng
Chemical Physics, volume 588, Article 112447 (published 2024-09-13)
DOI: 10.1016/j.chemphys.2024.112447
https://www.sciencedirect.com/science/article/pii/S0301010424002763
Abstract
The boiling point is a crucial indicator for assessing the suitability of insulating gases, and its theoretical prediction has long attracted attention from the scientific community. In this study, a boiling point database of potential insulating gases composed of six elements (C, H, O, N, F, S) was constructed. A Gradient Boosting Regression model with RDKit descriptors (RDKit-GBR) achieved superior predictive ability on the test set, with a coefficient of determination of 0.97, a mean absolute error of 17.74 °C, and a root-mean-squared error of 27.83 °C. SHapley Additive exPlanations (SHAP) analysis showed that the “Ipc” feature in RDKit, which represents the spatial relationship and interaction between pairs of atoms within molecules, plays a central role in predicting the boiling points of insulating gases. The applicability of the RDKit-GBR method was further validated across several elemental combinations. Finally, compared with previously reported models, the hexa-element model achieves excellent accuracy.
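For readers unfamiliar with the three test-set metrics quoted in the abstract, the following is a minimal sketch of how a coefficient of determination, a mean absolute error, and a root-mean-squared error are conventionally computed. The function name `regression_metrics` and the toy temperatures are assumptions made for illustration; they are not the authors' code or data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Hypothetical toy boiling points in °C, NOT data from the paper.
observed = [-50.0, 0.0, 50.0, 100.0]
predicted = [-40.0, -10.0, 60.0, 90.0]

r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.3f}  MAE={mae:.2f} °C  RMSE={rmse:.2f} °C")
```

The root-mean-squared error is never smaller than the mean absolute error, and the gap widens when a few predictions miss by a large margin. This is consistent with the reported values, where the RMSE (27.83 °C) exceeds the MAE (17.74 °C).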
About the journal:
Chemical Physics publishes experimental and theoretical papers on all aspects of chemical physics. In this journal, experiments are related to theory, and in turn theoretical papers are related to present or future experiments. Subjects covered include: spectroscopy and molecular structure, interacting systems, relaxation phenomena, biological systems, materials, fundamental problems in molecular reactivity, molecular quantum theory and statistical mechanics. Computational chemistry studies of routine character are not appropriate for this journal.