Improving Vapor Pressure Prediction Through Integration of Multiple Molecular Representations: A Super Learner Approach

Journal of Chemometrics, Volume 39, Issue 2 | Pub Date: 2025-02-10 | DOI: 10.1002/cem.70003 | IF 2.1 | Zone 4 (Chemistry)
Ji Hyun Nam, Seul Lee, Seongil Jo, Jaeoh Kim, Jooyeon Lee, Jahyun Koo, Byounghwak Lee, Keunhong Jeong, Donghyeon Yu

Abstract

Accurate prediction of vapor pressure is essential in chemical engineering, environmental science, and pharmaceutical development, where it governs the volatility and stability of compounds. Traditional methods often fall short for complex and novel molecular structures. This study introduces a machine learning approach that integrates graph neural networks (GNNs) and Chem-BERT models to improve prediction accuracy. Using the largest dataset to date, we derived comprehensive chemical descriptors and fingerprints. We evaluated 19 predictive models, including ridge regression, random forest, support vector regression, and feed-forward neural networks, trained on diverse features such as PaDEL and Morgan fingerprints, chemical descriptors, and Chem-BERT embeddings. Central to our methodology is the super learner architecture, which combines all 19 models to enhance accuracy. The super learner achieved a root mean squared error (RMSE) of 0.8200, outperforming the individual models and previously reported results. These results highlight the effectiveness of integrating GNNs and Chem-BERT for capturing detailed molecular information and set a new benchmark for vapor pressure prediction. The study underscores the value of advanced machine learning techniques and comprehensive datasets, offering a robust tool for researchers and paving the way for future advances in chemical property prediction.
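
The super learner idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn and SciPy, uses synthetic data in place of molecular descriptors, and stacks only three base learners (ridge regression, random forest, support vector regression) rather than 19. The function names `fit_super_learner` and `predict_super_learner`, the hyperparameters, and the choice of non-negative least squares for the combination weights are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import nnls
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_predict, train_test_split
from sklearn.svm import SVR


def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def fit_super_learner(base_models, X, y, n_splits=5, seed=0):
    """Fit base models and learn non-negative combination weights.

    Hypothetical helper: weights come from non-negative least squares
    on out-of-fold predictions, then are normalized to sum to one.
    """
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Column j holds the out-of-fold predictions of base model j.
    Z = np.column_stack([cross_val_predict(m, X, y, cv=cv) for m in base_models])
    weights, _ = nnls(Z, y)
    if weights.sum() > 0:
        weights = weights / weights.sum()
    else:
        # Fall back to a plain average if every weight is zero.
        weights = np.full(len(base_models), 1.0 / len(base_models))
    # Refit every base model on the full training set.
    fitted = [m.fit(X, y) for m in base_models]
    return fitted, weights


def predict_super_learner(fitted, weights, X):
    """Weighted average of the base-model predictions."""
    return np.column_stack([m.predict(X) for m in fitted]) @ weights


# Synthetic stand-in for descriptor features and a continuous target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + 0.1 * rng.normal(size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

base_models = [
    Ridge(alpha=1.0),
    RandomForestRegressor(n_estimators=50, random_state=0),
    SVR(C=10.0),
]
fitted, weights = fit_super_learner(base_models, X_train, y_train)
y_pred = predict_super_learner(fitted, weights, X_test)

print("weights:", np.round(weights, 3))
print("super learner RMSE:", round(rmse(y_test, y_pred), 4))
```

Learning the weights from out-of-fold predictions, rather than from in-sample fits, keeps the ensemble from simply favoring whichever base model overfits the training data most. The numbers printed here come from synthetic data and are not comparable to the RMSE of 0.8200 reported in the abstract.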

