Computational intelligence investigations on evaluation of salicylic acid solubility in various solvents at different temperatures.

Scientific Reports, vol. 15(1), 7142. Pub Date: 2025-02-28. DOI: 10.1038/s41598-025-90704-x
Impact factor 3.9; Zone 2 (comprehensive journals); JCR Q1 (Multidisciplinary Sciences)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11871127/pdf/
Adel Alhowyan, Wael A Mahdi, Ahmad J Obaidullah

Abstract

This study applies tree-based machine learning algorithms to predict the solubility of salicylic acid in 13 solvents. Four models were employed: Cubist regression, gradient boosting (GB), extreme gradient boosting (XGB), and extra trees (ET), each correlating drug solubility with pressure, temperature, and solvent composition. The dataset was standardized with the Standard Scaler so that each feature has a mean of zero and a standard deviation of one, and outliers were then detected using Cook's distance. Hyperparameter optimization with the Differential Evolution (DE) method improved model performance. The models were evaluated by Monte Carlo cross-validation using the R² score, Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE). With an R² value of 0.996, the Extra Trees model showed the highest accuracy and consistency, outperforming the other models. The results highlight the robustness of ensemble methods in capturing intricate data patterns and their effectiveness in regression tasks for pharmaceutical manufacturing applications.
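
The workflow named in the abstract (standardization, Cook's distance screening, Differential Evolution tuning, Monte Carlo cross-validation, and R²/RMSE/MAE scoring of an Extra Trees model) can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn and SciPy, uses synthetic stand-in data rather than the paper's dataset, and the 4/n Cook's distance cutoff, the feature ranges, the hyperparameter bounds, and all function names are hypothetical choices made only for illustration.

```python
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import ShuffleSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_data(n=200, seed=0):
    """Synthetic stand-in data (NOT the paper's solubility measurements)."""
    rng = np.random.default_rng(seed)
    temperature = rng.uniform(280.0, 330.0, n)  # hypothetical range, in K
    composition = rng.uniform(0.0, 1.0, n)      # hypothetical solvent descriptor
    X = np.column_stack([temperature, composition])
    y = 0.01 * (temperature - 280.0) + 0.5 * composition**2
    return X, y + rng.normal(0.0, 0.01, n)


def cooks_distance(X, y):
    """Cook's distance of every sample under an ordinary least-squares fit."""
    A = np.column_stack([np.ones(len(X)), X])   # design matrix with intercept
    p = A.shape[1]
    H = A @ np.linalg.pinv(A.T @ A) @ A.T       # hat (projection) matrix
    h = np.diag(H)                              # leverages
    resid = y - H @ y
    mse = resid @ resid / (len(y) - p)
    return resid**2 / (p * mse) * h / (1.0 - h) ** 2


def build_model(n_estimators, max_depth):
    """Scaler + Extra Trees; DE proposes floats, so round them to integers."""
    return make_pipeline(
        StandardScaler(),
        ExtraTreesRegressor(
            n_estimators=int(round(n_estimators)),
            max_depth=int(round(max_depth)),
            random_state=0,
        ),
    )


def monte_carlo_cv(model, X, y, n_splits=10, test_size=0.2, seed=0):
    """Repeated random train/test splits; returns mean R2, RMSE and MAE."""
    splitter = ShuffleSplit(n_splits=n_splits, test_size=test_size, random_state=seed)
    scores = []
    for train_idx, test_idx in splitter.split(X):
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        scores.append((
            r2_score(y[test_idx], pred),
            np.sqrt(mean_squared_error(y[test_idx], pred)),
            mean_absolute_error(y[test_idx], pred),
        ))
    r2, rmse, mae = np.mean(scores, axis=0)
    return {"r2": r2, "rmse": rmse, "mae": mae}


def tune_with_de(X, y):
    """Minimize cross-validated RMSE over (n_estimators, max_depth) with DE."""
    def objective(params):
        return monte_carlo_cv(build_model(*params), X, y, n_splits=3)["rmse"]

    bounds = [(10, 40), (2, 12)]                # hypothetical search ranges
    result = differential_evolution(
        objective, bounds, maxiter=3, popsize=4, seed=0, polish=False
    )
    return result.x


X, y = make_synthetic_data()
keep = cooks_distance(X, y) < 4.0 / len(y)      # rule-of-thumb cutoff (assumption)
X, y = X[keep], y[keep]
best_n, best_depth = tune_with_de(X, y)
metrics = monte_carlo_cv(build_model(best_n, best_depth), X, y)
print(metrics)
```

In this sketch the scaler sits inside the pipeline so that it is fitted on each training split only, which keeps test data out of the preprocessing step. The abstract does not say how the authors handled this. Tree ensembles are largely insensitive to feature scaling, so the step is included mainly to mirror the described workflow.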
