Integrating traditional QSAR and read-across-based regression models for predicting potential anti-leishmanial azole compounds

Molecular Diversity · IF 3.9 · Zone 2, Chemistry · JCR Q2, Chemistry, Applied · Pub Date: 2024-12-10 · DOI: 10.1007/s11030-024-11070-w
Rajat Nandi, Anupama Sharma, Ananya Priya, Diwakar Kumar
Citations: 0

Abstract

Integrating traditional QSAR and read-across-based regression models for predicting potential anti-leishmanial azole compounds.

Leishmaniasis, a neglected tropical disease caused by various Leishmania species, poses a significant global health challenge, especially in resource-limited regions. Visceral leishmaniasis (VL) stands out among its severe manifestations, and current drug therapies have limitations, necessitating the exploration of new, cost-effective treatments. This study used a comprehensive computational workflow integrating traditional 2D-QSAR, q-RASAR, and molecular docking to identify novel anti-leishmanial compounds, with a focus on glycyl-tRNA synthetase (LdGlyRS) as a promising drug target. A feature selection process combining Genetic Function Approximation (GFA)-Lasso with Multiple Linear Regression (MLR) was used to characterize 99 azole compounds across ten structural classes. The baseline MLR model (MOD1), containing seven simple and interpretable 2D features, exhibited robust predictive capabilities, achieving a training R² of 0.82 and a test R² of 0.87. To further enhance prediction accuracy, three qualified single models (two MLR and one q-RASAR) were used to construct three consensus models (CMs), with CM2 (test MAE = 0.127) demonstrating significantly higher prediction accuracy for test compounds than MOD1. Subsequently, Support Vector Regression (SVR) yielded a training R² of 0.88 and a test R² of 0.86, and Boosting yielded a training R² of 0.92 and a test R² of 0.82. Molecular docking highlighted interactions of potent azoles within the QSAR dataset with critical residues in the LdGlyRS active site (Arg226 and Glu350), emphasizing their inhibitory potential.
Furthermore, the pIC50 values of an accurate external set of 2000 azole compounds from the ZINC20 database were predicted simultaneously by the CM2, SVR, and Boosting models, and the compounds were docked against LdGlyRS. This identified Bazedoxifene, Talmetacin, Pyrvinium, and Enzastaurin as leading FDA candidates, whereas three novel compounds with the database codes ZINC000001153734, ZINC000011934652, and ZINC000009942262 displayed stable docked interactions and favourable ADMET assessments. Molecular Dynamics (MD) simulations of 100 ns were then conducted to further validate the findings, offering insight into the stability and dynamic behaviour of the ligand-protein complexes. The integrated approach of this study underscores the efficacy of 2D-QSAR modelling and identifies LdGlyRS as a promising leishmaniasis target, offering a robust strategy for discovering and optimizing anti-leishmanial compounds to address the critical need for improved treatments.
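
The abstract reports model quality as R² and MAE and compares single models against consensus models. The sketch below illustrates how those quantities are commonly computed. It is an illustration only: the abstract does not say how the consensus models were built, so plain arithmetic averaging of the single-model predictions is an assumption here, and the function names (`r_squared`, `mean_absolute_error`, `consensus_prediction`) and all pIC50 numbers are hypothetical and not taken from the study.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def consensus_prediction(predictions_per_model):
    """Assumed consensus rule: average the predictions compound by compound."""
    return [mean(values) for values in zip(*predictions_per_model)]


# Hypothetical pIC50 values for five test compounds (made-up numbers).
observed = [5.0, 6.0, 7.0, 5.5, 6.5]
single_models = {
    "model_a": [5.2, 5.8, 7.3, 5.4, 6.1],
    "model_b": [4.8, 6.3, 6.8, 5.9, 6.6],
    "model_c": [5.3, 6.1, 6.9, 5.2, 6.8],
}

consensus = consensus_prediction(list(single_models.values()))

# Report test-set R² and MAE for each single model and for the consensus.
for name, predicted in {**single_models, "consensus": consensus}.items():
    r2 = r_squared(observed, predicted)
    mae = mean_absolute_error(observed, predicted)
    print(f"{name}: R2 = {r2:.3f}, MAE = {mae:.3f}")
```

In this toy data the single-model errors were chosen to partly cancel, so the averaged prediction has a lower MAE than any single model. That shows the mechanism by which a consensus can help; it is not evidence about the study's own models.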

Source journal: Molecular Diversity (Chemistry: Chemistry, Multidisciplinary)
CiteScore: 7.30
Self-citation rate: 7.90%
Articles published: 219
Review time: 2.7 months
About the journal: Molecular Diversity is a new publication forum for the rapid publication of refereed papers dedicated to describing the development, application and theory of molecular diversity and combinatorial chemistry in basic and applied research and drug discovery. The journal publishes both short and full papers, perspectives, news and reviews dealing with all aspects of the generation of molecular diversity, application of diversity for screening against alternative targets of all types (biological, biophysical, technological), analysis of results obtained and their application in various scientific disciplines/approaches including: combinatorial chemistry and parallel synthesis; small molecule libraries; microwave synthesis; flow synthesis; fluorous synthesis; diversity oriented synthesis (DOS); nanoreactors; click chemistry; multiplex technologies; fragment- and ligand-based design; structure/function/SAR; computational chemistry and molecular design; chemoinformatics; screening techniques and screening interfaces; analytical and purification methods; robotics, automation and miniaturization; targeted libraries; display libraries; peptides and peptoids; proteins; oligonucleotides; carbohydrates; natural diversity; new methods of library formulation and deconvolution; directed evolution, origin of life and recombination; search techniques, landscapes, random chemistry and more.