An Improved Machine Learning Model for Pure Component Property Estimation

IF 10.1 | Zone 1, Engineering Technology | Q1 ENGINEERING, MULTIDISCIPLINARY | Engineering | Pub Date: 2024-08-01 | DOI: 10.1016/j.eng.2023.08.024
Xinyu Cao, Ming Gong, Anjan Tula, Xi Chen, Rafiqul Gani, Venkat Venkatasubramanian
Citations: 0

Abstract

Information on the physicochemical properties of chemical species is an important prerequisite when performing tasks such as process design and product design. However, the lack of extensive data and high experimental costs hinder the development of prediction techniques for these properties. Moreover, accuracy and predictive capabilities still limit the scope and applicability of most property estimation methods. This paper proposes a new Gaussian process-based modeling framework that aims to manage a discrete and high-dimensional input space related to molecular structure representation with the group-contribution approach. A warping function is used to map discrete input into a continuous domain in order to adjust the correlation between different compounds. Prior selection techniques, including prior elicitation and prior predictive checking, are also applied during the building procedure to provide the model with more information from previous research findings. The framework is assessed using datasets of varying sizes for 20 pure component properties. For 18 out of the 20 pure component properties, the new models are found to give improved accuracy and predictive power in comparison with other published models, with and without machine learning.
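
The modeling idea described above can be illustrated with a minimal sketch: a Gaussian process regressor whose inputs are group-contribution count vectors, passed through a warping function before the kernel is evaluated. Everything specific here is an assumption for illustration and not the authors' implementation: the `log1p`-style warping with a per-group scale `alpha`, the squared-exponential kernel, the hyperparameter values, the three groups, and the synthetic property values.

```python
import numpy as np

def warp(counts, alpha):
    """Map non-negative integer group counts to a continuous domain.
    Hypothetical warping: log(1 + alpha * n), applied per group."""
    return np.log1p(alpha * np.asarray(counts, dtype=float))

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_predict(x_train, y_train, x_test, alpha, noise=1e-2):
    """Posterior mean and variance of a constant-mean GP on warped inputs."""
    w_train, w_test = warp(x_train, alpha), warp(x_test, alpha)
    y_mean = y_train.mean()
    k = rbf_kernel(w_train, w_train) + noise * np.eye(len(w_train))
    k_star = rbf_kernel(w_test, w_train)
    chol = np.linalg.cholesky(k)
    # Solve (K + noise*I) weights = y - mean via the Cholesky factor.
    weights = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    mean = y_mean + k_star @ weights
    v = np.linalg.solve(chol, k_star.T)
    var = rbf_kernel(w_test, w_test).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

# Columns: counts of three illustrative groups (CH3, CH2, OH).
x_train = np.array([[2, 0, 0], [2, 1, 0], [2, 2, 0],
                    [1, 0, 1], [1, 1, 1], [1, 2, 1]])
# Synthetic targets from an arbitrary linear rule; NOT measured property data.
y_train = x_train @ np.array([10.0, 20.0, 50.0])

alpha = np.array([1.0, 0.5, 2.0])         # assumed per-group warping scales
x_test = np.array([[2, 3, 0], [2, 1, 0]])  # second row is a training compound
mean, var = gp_predict(x_train, y_train, x_test, alpha)
print(mean, var)
```

In this sketch the warping changes how far apart two compounds appear to the kernel, which is one way to adjust their correlation. In a full model the warping parameters and kernel hyperparameters would be learned or given priors rather than fixed by hand as they are here.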

Source journal: Engineering
Engineering Environmental Science-Environmental Engineering
Self-citation rate: 1.60%
Articles published: 335
Review time: 35 days
Journal description: Engineering, an international open-access journal initiated by the Chinese Academy of Engineering (CAE) in 2015, serves as a distinguished platform for disseminating cutting-edge advancements in engineering R&D, sharing major research outputs, and highlighting key achievements worldwide. The journal's objectives encompass reporting progress in engineering science, fostering discussions on hot topics, addressing areas of interest, challenges, and prospects in engineering development, while considering human and environmental well-being and ethics in engineering. It aims to inspire breakthroughs and innovations with profound economic and social significance, propelling them to advanced international standards and transforming them into a new productive force. Ultimately, this endeavor seeks to bring about positive changes globally, benefit humanity, and shape a new future.