Machine learning based modeling for estimation of drug solubility in supercritical fluid by adjusting important parameters

Chemometrics and Intelligent Laboratory Systems; IF 3.7; Zone 2 (Chemistry); JCR Q2, Automation & Control Systems; Pub Date: 2024-10-03; DOI: 10.1016/j.chemolab.2024.105241

Yaoyang Liu, Morug Salih Mahdi, Usama Kadem Radi, Ali Jihad, Ali Hamid AbdulHussein, Irshad Ahmad, Nasrin Mansuri, Mostafa Adnan Abdalrahman, Ahmed Alkhayyat, Ahmed Faisal

Volume 254, Article 105241. Publisher page: https://www.sciencedirect.com/science/article/pii/S0169743924001813

Citations: 0

Abstract


We used machine learning models to predict the solubility of the drug capecitabine in supercritical carbon dioxide, a green solvent. The aim is to assess the drug's suitability for processing into nanodrugs with enhanced bioavailability. In the data set, pressure (P) and temperature (T) are the inputs, and solubility (Y) is the sole output. Decision Tree (DT) and Multilayer Perceptron (MLP) serve as the core models. Used alone and untuned, such conventional algorithms may not give accurate, generalizable results. Ensemble methods such as boosting can improve performance, and both the single models and the ensembles built on them have hyperparameters whose tuning affects the final models. Meta-heuristic algorithms are a popular choice for this tuning. We therefore used a new hybrid framework that couples the base models with the AdaBoost algorithm (as the ensemble method) and with the PO and CS algorithms (as optimizers), yielding four different models. The MLP model boosted with AdaBoost and tuned with the PO algorithm gave the best fit of all the models, with an RMSE of 1.71, an MSE of 2.92, and an MAE of 1.42.
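The three error figures quoted in the abstract are standard regression metrics, and RMSE is by definition the square root of MSE. The short Python sketch below shows how they are computed and checks that the reported RMSE and MSE agree with each other. The function name `regression_errors` and the sample solubility values are hypothetical and purely illustrative; they are not taken from the paper, and the abstract does not state the units of the reported errors.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (mse, rmse, mae) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / len(residuals)
    mae = sum(abs(r) for r in residuals) / len(residuals)
    return mse, math.sqrt(mse), mae


# Hypothetical measured and predicted solubilities (arbitrary units).
# These are illustration values only, not data from the study.
measured = [10.0, 12.0, 15.0, 20.0]
predicted = [11.0, 11.0, 17.0, 18.0]

mse, rmse, mae = regression_errors(measured, predicted)
print(f"MSE={mse:.2f} RMSE={rmse:.2f} MAE={mae:.2f}")

# Consistency check on the figures quoted in the abstract:
# the squared RMSE should match the MSE up to rounding.
reported_rmse, reported_mse = 1.71, 2.92
print(abs(reported_rmse ** 2 - reported_mse) < 0.01)
```

Because RMSE squares each residual before averaging, it penalizes large errors more heavily than MAE and can never be smaller than MAE on the same data, which is consistent with the reported 1.71 versus 1.42.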


Source journal

Chemometrics and Intelligent Laboratory Systems (Engineering & Technology: Analytical Chemistry)

CiteScore: 7.50

Self-citation rate: 7.70%

Articles published: 169

Review time: 3.4 months

About the journal: Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on the development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines. Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data. The journal deals with the following topics:

1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, System Biology, -Omics, etc.).

2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration, etc.). Routine applications of established chemometrical techniques will not be considered.

3) Development of new software that provides novel tools or truly advances the use of chemometrical methods.

4) Well characterized data sets to test performance for the new methods and software.

The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for manuscripts.
