Development and experimental validation of a machine learning model for the prediction of new antimalarials

BMC Chemistry; IF 4.3; Zone 2 (Chemistry); JCR Q2 (Chemistry, Multidisciplinary); Pub Date: 2025-01-30; DOI: 10.1186/s13065-025-01395-4
Mukul Kore, Dimple Acharya, Lakshya Sharma, Shruthi Sridhar Vembar, Sandeep Sundriyal

Abstract

A large set of antimalarial molecules (N ~ 15k) was retrieved from ChEMBL to build a robust random forest (RF) model for predicting antiplasmodial activity. Rather than relying on high-throughput screening (HTS) data, model development used molecules tested at multiple doses against the blood stages of Plasmodium falciparum. A workflow was developed on the open-access, code-free KNIME platform to train the model on 80% of the data (N ~ 12k). Hyperparameter values were optimized for the highest predictive accuracy with nine different molecular fingerprints (MFPs), among which Avalon MFPs gave the best results; the resulting model is referred to as RF-1. On the remaining 20% test set, RF-1 achieved 91.7% accuracy, 93.5% precision, 88.4% sensitivity, and an area under the receiver operating characteristic curve (AUROC) of 97.3%. The predictive performance of RF-1 was comparable to that of the malaria inhibitor prediction platform (MAIP), a recently reported consensus model based on a large proprietary dataset. However, the hits that RF-1 and MAIP retrieved from a commercial library did not overlap, suggesting that the two models are complementary. Finally, RF-1 was used to screen small molecules under clinical investigation for repurposing candidates. Six molecules were purchased, of which two human kinase inhibitors showed single-digit micromolar antiplasmodial activity. One of the hits (compound 1) was a potent inhibitor of β-hematin formation, suggesting that inhibition of parasite hemozoin (Hz) synthesis contributes to its parasiticidal effect. The training and test sets are provided as supplementary information so that others can reproduce this work.
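
The accuracy, precision, and sensitivity quoted in the abstract are standard binary-classification metrics, and the comparison of RF-1 and MAIP hits amounts to a set overlap. The sketch below illustrates both in plain Python. It is not the authors' KNIME workflow: the function names, the toy labels, and the use of the Jaccard index for overlap are all assumptions made for illustration, and the counts are not taken from the paper. AUROC is omitted because it needs ranked prediction scores rather than hard labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with 'active' as the positive class."""
    tp: int  # actives predicted active
    fp: int  # inactives predicted active
    tn: int  # inactives predicted inactive
    fn: int  # actives predicted inactive


def count_outcomes(y_true, y_pred):
    """Tally confusion-matrix counts from parallel sequences of 0/1 labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def classification_metrics(c):
    """Return accuracy, precision, and sensitivity (recall) as fractions.

    Assumes at least one predicted active and one true active,
    so that no denominator is zero.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": c.tp / (c.tp + c.fp),
        "sensitivity": c.tp / (c.tp + c.fn),
    }


def hit_overlap(hits_a, hits_b):
    """Jaccard index of two collections of hit identifiers (0.0 = no shared hits)."""
    a, b = set(hits_a), set(hits_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Toy labels only (1 = active, 0 = inactive); not data from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(classification_metrics(count_outcomes(y_true, y_pred)))

    # Hypothetical compound IDs flagged by two different models.
    print(hit_overlap({"cmpd-01", "cmpd-02"}, {"cmpd-07", "cmpd-09"}))
```

An overlap of zero between two hit lists, as reported for RF-1 and MAIP on the commercial library, means the models prioritize disjoint sets of compounds, which is why the abstract describes them as complementary.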
