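Machine Learning-Based Quantitative Structure-Activity Relationship and ADMET Prediction Models for ERα Activity of Anti-Breast Cancer Drug Candidates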

Zong-Ren Xu
DOI: 10.1051/wujns/2023283257 (https://doi.org/10.1051/wujns/2023283257)
Journal: Wuhan University Journal of Natural Sciences
Published: 2023-06-01
Publication type: Journal Article
JCR: Q3 (Multidisciplinary)
Citations: 0

Abstract

Breast cancer is currently one of the most common malignancies worldwide and has a high fatality rate. In this study, a quantitative structure-activity relationship (QSAR) model of compound biological activity and a prediction model for ADMET (Absorption, Distribution, Metabolism, Excretion, Toxicity) properties were built from estrogen receptor alpha (ERα) antagonist data collected from compound samples. First, grey relational analysis (GRA) was combined with the random forest (RF) algorithm to identify the 20 molecular descriptors with the greatest influence on biological activity, and Spearman correlation analysis was then used to reduce these to 16 independent variables. Second, QSAR models of the compounds were developed using a BP neural network (BPNN), a genetic-algorithm-optimized BP neural network (GA-BPNN), and support vector regression (SVR). The BPNN, SVR, and logistic regression (LR) models were then used to classify and predict the ADMET properties of the compounds, and the predictive performance of each model was compared and assessed. Based on this comparison, the SVR model was selected for quantitative QSAR prediction. For the classification of ADMET properties, the SVR model was selected for the Caco-2 and hERG (human Ether-a-go-go Related Gene) properties, the LR model for the cytochrome P450 enzyme 3A4 subtype (CYP3A4) and Micronucleus (MN) properties, and the BPNN model for the Human Oral Bioavailability (HOB) property. Finally, information entropy theory was used to validate the rationality of the variable screening, and a sensitivity analysis indicated that the constructed models have high accuracy and stability and can serve as a reference for screening potentially active compounds and for drug discovery.
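The descriptor-screening step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it ranks descriptors by grey relational grade only (the random forest importance that the paper combines with GRA is omitted), and the function names, the resolution coefficient `rho=0.5`, and the Spearman cutoff `max_abs_rho=0.9` are assumptions chosen for illustration. Ties in the rank transform are not averaged, which is adequate for continuous descriptors but only approximate otherwise.

```python
import numpy as np

def min_max(a):
    """Scale each column to [0, 1]; constant columns map to 0."""
    a = np.asarray(a, dtype=float)
    lo = a.min(axis=0)
    span = a.max(axis=0) - lo
    return (a - lo) / np.where(span == 0, 1.0, span)

def grey_relational_grades(X, y, rho=0.5):
    """Grey relational grade of each descriptor column against the activity y."""
    Xn = min_max(X)
    yn = min_max(np.asarray(y, dtype=float).reshape(-1, 1))
    delta = np.abs(Xn - yn)                  # shape (n_samples, n_features)
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0:                           # every descriptor equals y after scaling
        return np.ones(delta.shape[1])
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)
    return coeff.mean(axis=0)

def spearman_matrix(X):
    """Spearman correlation between columns (rank transform, then Pearson)."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)
    return np.atleast_2d(np.corrcoef(ranks, rowvar=False))

def select_descriptors(X, y, top_k=20, max_abs_rho=0.9):
    """Keep the top_k descriptors by grade, then drop redundant ones greedily."""
    X = np.asarray(X, dtype=float)
    grades = grey_relational_grades(X, y)
    ranked = np.argsort(grades)[::-1][:top_k]      # best grade first
    corr = np.abs(spearman_matrix(X[:, ranked]))
    kept = []
    for pos in range(len(ranked)):
        # keep a descriptor only if it is not strongly correlated
        # with any descriptor that has already been kept
        if all(corr[pos, q] <= max_abs_rho for q in kept):
            kept.append(pos)
    return [int(ranked[p]) for p in kept]

if __name__ == "__main__":
    # Synthetic stand-in data; the paper's descriptor table is not reproduced here.
    rng = np.random.default_rng(42)
    X_demo = rng.normal(size=(100, 8))
    y_demo = X_demo[:, 0] - 0.5 * X_demo[:, 3] + 0.1 * rng.normal(size=100)
    print(select_descriptors(X_demo, y_demo, top_k=5))
```

The greedy pass keeps the higher-graded member of any strongly correlated pair, which is one common way to move from a ranked shortlist to a smaller set of near-independent variables. The cutoff that takes the paper from 20 to 16 descriptors is not stated in the abstract, so the value used here should be treated as a placeholder.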
Source journal
Wuhan University Journal of Natural Sciences (Multidisciplinary)
CiteScore: 0.40
Self-citation rate: 0.00%
Articles published: 2485
About the journal: Wuhan University Journal of Natural Sciences aims to promote rapid communication and exchange between the world and Wuhan University, as well as other Chinese universities and academic institutions. It mainly reflects the latest advances in many disciplines of scientific research at Chinese universities and academic institutions. The journal also publishes papers presented at conferences in China and abroad. Its multidisciplinary nature is apparent in the wide range of articles from leading Chinese scholars. The journal also aims to introduce Chinese academic achievements to the world community by demonstrating the significance of Chinese scientific investigations.
Latest articles in this journal
Comprehensive Analysis of the Role of Forkhead Box J3 (FOXJ3) in Human Cancers
Three New Classes of Subsystem Codes
A Note of the Interpolating Sequence in Qp∩H∞
Learning Label Correlations for Multi-Label Online Passive Aggressive Classification Algorithm
Uniform Asymptotics for Finite-Time Ruin Probabilities of Risk Models with Non-Stationary Arrivals and Strongly Subexponential Claim Sizes