Building a ML-based QSAR model for predicting the bioactivity of therapeutically active drug class with imidazole scaffold

Komal Singh, Irina Ghosh, Venkatesan Jayaprakash, Sudeepan Jayapalan
European Journal of Medicinal Chemistry Reports, Volume 11, Article 100148. Published 2024-03-23.
DOI: 10.1016/j.ejmcr.2024.100148
https://www.sciencedirect.com/science/article/pii/S2772417424000207
Citations: 0

Building a ML-based QSAR model for predicting the bioactivity of therapeutically active drug class with imidazole scaffold

Human immunodeficiency virus (HIV), a retrovirus, causes AIDS, a chronic disease of the immune system. By weakening the immune system, HIV impairs the body's ability to fight disease and infection. Reverse transcriptase (RT) is an enzyme essential for HIV replication. RT inhibitors (RTIs) are a class of antiretroviral drugs that target HIV's RT enzyme, blocking its ability to convert viral RNA into DNA. Imidazole has been found to inhibit the RT-1 enzyme: it binds to the enzyme's active site and prevents it from performing its usual activity. As a result, viral replication is inhibited, which can help slow the course of HIV and other retroviral diseases. Computational tools allow researchers to simulate and analyze a drug's behaviour in a virtual environment, providing valuable insights into its pharmacological properties, efficacy, and safety. QSAR modelling uses machine learning methods to build predictive models from datasets of chemical compounds and their accompanying biological activity. Here, a comparative analysis of the performance of four algorithms on the imidazole scaffold is reported. Support Vector Regression (SVR), Random Forest Regression (RFR), Decision Tree Regression (DTR) and Hist Gradient Boosting Regression (HGBR) gave promising results, with R² values of 0.905, 0.993, 0.688 and 0.921, respectively, on the training set and 0.843, 0.977, 0.567 and 0.880 on the test set. The best-performing model, RFR, was validated using the developed RFR code on randomly selected compounds and showed an error percentage of only about 0.151%. The R² values indicate that the RFR and HGBR models fit the variables better than the other models, making them potential models for predicting the activity of novel anti-viral compounds.
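
The abstract compares models by their R² values and by an error percentage on selected compounds. As a rough illustration of how such figures can be computed, the sketch below implements the standard coefficient of determination, R² = 1 - SS_res / SS_tot, together with an absolute relative error in percent. The abstract does not define its error percentage, so that definition is an assumption, and the activity values are hypothetical placeholders rather than data from the paper.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared gap between observed and predicted.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observed values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


def percent_error(actual: float, predicted: float) -> float:
    """Absolute relative error in percent (assumed definition, not from the paper)."""
    if actual == 0:
        raise ValueError("percent error is undefined for an actual value of zero")
    return abs(predicted - actual) / abs(actual) * 100.0


# Hypothetical observed and predicted activities; NOT data from the paper.
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.1, 5.9, 7.2, 7.8]

r2 = r2_score(observed, predicted)
errors = [percent_error(o, p) for o, p in zip(observed, predicted)]
print(f"R^2 = {r2:.3f}")
print("percent errors:", [round(e, 2) for e in errors])
```

Computing R² separately on the training and test predictions, as the abstract reports, gives a quick read on overfitting: a large drop from training to test suggests the model has memorised the training compounds. In practice a library routine such as scikit-learn's `sklearn.metrics.r2_score` computes the same quantity.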
