Combined deep-learning optimization predictive models for determining carbon dioxide solubility in ionic liquids

IF 10.4 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Journal of Industrial Information Integration | Pub Date: 2024-09-01 | Epub Date: 2024-07-10 | DOI: 10.1016/j.jii.2024.100662

Shadfar Davoodi, Hung Vo Thanh, David A. Wood, Mohammad Mehrad, Mohammad Reza Hajsaeedi, Valeriy S. Rukavishnikov

Journal of Industrial Information Integration, Volume 41, Article 100662 (journal article, Epub, not open access). Source: https://www.sciencedirect.com/science/article/pii/S2452414X24001067

Citations: 0

Abstract

This study explores the development of predictive models for carbon dioxide (CO₂) solubility in ionic liquids, based on a compiled dataset of 10,116 experimentally measured data points with four input variables: pressure (P), temperature (T), cation type, and anion type. The deep-learning (DL) predictive models evaluated are standalone and hybrid versions of convolutional neural network (CNN) and long short-term memory (LSTM) algorithms, with the hybrids combining them with the cuckoo optimization algorithm (COA) and gradient-based optimization (GBO). The laboratory-measured data were separated into training and test subsets, and each subset was normalized separately to improve the performance of the DL algorithms. A Mahalanobis distance-based quantile method was used to identify outliers in the training data, and the identified points were removed from the training dataset. The control parameters of the DL algorithms were optimized using COA to enhance their efficiency, and the algorithms were hybridized with the optimization algorithms to further improve their performance. The resulting models were analyzed to assess their accuracy, degree of overfitting, and the importance of the input features. The study found that using 80% of the data for training and 20% for testing results in more accurate and generalizable models. Applying the outlier detection method to the training data eliminated 307 data points as outliers. Among the CO₂-solubility predictive models developed, the CNN-COA model had the lowest RMSE and highest R², indicating high generalizability to data unseen by the trained model. The analysis revealed that using optimization algorithms increased the CO₂-solubility prediction performance of the DL algorithms and reduced overfitting. T and cation type were the most and least important input features, respectively. The effect of simultaneous changes in cation and anion type on CO₂-solubility predictions displayed no systematic pattern. For increases in T, CO₂ solubility typically decreased, whereas for increases in P, CO₂ solubility always increased but at variable rates. The results of this study can be used to develop accurate and generalizable CO₂-solubility predictive models for various applications.
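
To illustrate the outlier-screening and evaluation steps mentioned in the abstract, here is a minimal Python sketch. It is not the authors' implementation. It assumes NumPy is available, that cation and anion types have already been encoded as numbers, and that the cutoff is an empirical quantile of the squared Mahalanobis distances. The function names and the 0.97 quantile are hypothetical placeholders, because the abstract does not state the threshold that was used.

```python
import numpy as np


def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row of X from the sample mean."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # The pseudo-inverse tolerates a singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    # Row-wise quadratic form: d2[i] = centered[i] @ cov_inv @ centered[i]
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)


def drop_outliers(X, y, quantile=0.97):
    """Keep rows whose squared distance is at or below the chosen quantile.

    `quantile` is an assumed placeholder, not a value taken from the paper.
    Returns the filtered X, the filtered y, and the boolean keep-mask.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d2 = mahalanobis_sq(X)
    keep = d2 <= np.quantile(d2, quantile)
    return X[keep], y[keep], keep


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

In this sketch the filter would be applied to the training subset only, matching the abstract's statement that outliers were identified in and removed from the training data, while `rmse` and `r2` would be computed on both subsets to compare training and test performance.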

Source journal

Journal of Industrial Information Integration (Decision Sciences: Information Systems and Management)

CiteScore: 22.30

Self-citation rate: 13.40%

Articles published: 100

Journal description: The Journal of Industrial Information Integration focuses on the industry's transition towards industrial integration and informatization, covering not only hardware and software but also information integration. It serves as an interdisciplinary forum where researchers, practitioners, and policy makers can promote advances in industrial information integration and address its challenges, issues, and solutions. The journal welcomes papers on foundational, technical, and practical aspects of industrial information integration, emphasizing the complex and cross-disciplinary topics that arise in industrial integration. Techniques from mathematical science, computer science, computer engineering, electrical and electronic engineering, manufacturing engineering, and engineering management are crucial in this context.