Developing a SARS-CoV-2 main protease binding prediction random forest model for drug repurposing for COVID-19 treatment.

IF 2.8 Zone 4 (Medicine) Q2 MEDICINE, RESEARCH & EXPERIMENTAL Experimental Biology and Medicine Pub Date: 2023-11-01 Epub Date: 2023-11-24 DOI: 10.1177/15353702231209413
Jie Liu, Liang Xu, Wenjing Guo, Zoe Li, Md Kamrul Hasan Khan, Weigong Ge, Tucker A Patterson, Huixiao Hong
Citations: 0

Abstract

Developing a SARS-CoV-2 main protease binding prediction random forest model for drug repurposing for COVID-19 treatment.

The coronavirus disease 2019 (COVID-19) global pandemic resulted in millions of people becoming infected with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus and close to seven million deaths worldwide. It is essential to further explore and design effective COVID-19 treatment drugs that target the main protease of SARS-CoV-2, a major target for COVID-19 drugs. In this study, machine learning was applied for predicting the SARS-CoV-2 main protease binding of Food and Drug Administration (FDA)-approved drugs to assist in the identification of potential repurposing candidates for COVID-19 treatment. Ligands bound to the SARS-CoV-2 main protease in the Protein Data Bank and compounds experimentally tested in SARS-CoV-2 main protease binding assays in the literature were curated. These chemicals were divided into training (516 chemicals) and testing (360 chemicals) data sets. To identify SARS-CoV-2 main protease binders as potential candidates for repurposing to treat COVID-19, 1188 FDA-approved drugs from the Liver Toxicity Knowledge Base were obtained. A random forest algorithm was used for constructing predictive models based on molecular descriptors calculated using Mold2 software. Model performance was evaluated using 100 iterations of fivefold cross-validations which resulted in 78.8% balanced accuracy. The random forest model that was constructed from the whole training dataset was used to predict SARS-CoV-2 main protease binding on the testing set and the FDA-approved drugs. Model applicability domain and prediction confidence on drugs predicted as the main protease binders discovered 10 FDA-approved drugs as potential candidates for repurposing to treat COVID-19. Our results demonstrate that machine learning is an efficient method for drug repurposing and, thus, may accelerate drug development targeting SARS-CoV-2.
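
The evaluation protocol named in the abstract (repeated fivefold cross-validation scored by balanced accuracy) can be sketched roughly as follows. This is a minimal standard-library illustration under stated assumptions, not the authors' code: the data are synthetic, every function name is hypothetical, the folds are stratified by class (the abstract does not say whether they were), and a simple nearest-centroid rule stands in for the random forest trained on Mold2 descriptors.

```python
import random
from statistics import mean


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = binder, 0 = non-binder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def stratified_folds(labels, k, rng):
    """Shuffle indices within each class, then deal them round-robin into k folds."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def nearest_centroid(X_train, y_train, X_test):
    """Placeholder learner (NOT a random forest): pick the class with the closer centroid."""
    centroids = {}
    for cls in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == cls]
        centroids[cls] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda c: sq_dist(x, centroids[c])) for x in X_test]


def repeated_cv(X, y, fit_predict, k=5, iterations=100, seed=0):
    """Average balanced accuracy over `iterations` independent k-fold splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(iterations):
        y_pred = [None] * len(y)
        for fold in stratified_folds(y, k, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            preds = fit_predict(
                [X[i] for i in train], [y[i] for i in train], [X[i] for i in fold]
            )
            for i, p in zip(fold, preds):
                y_pred[i] = p
        # Every sample is predicted exactly once per iteration, while held out.
        scores.append(balanced_accuracy(y, y_pred))
    return mean(scores)


# Synthetic stand-in for a descriptor matrix: 60 "binders", 40 "non-binders", 4 features.
data_rng = random.Random(42)
y = [1] * 60 + [0] * 40
X = [[data_rng.gauss(1.0 if label else -1.0, 1.0) for _ in range(4)] for label in y]

score = repeated_cv(X, y, nearest_centroid, k=5, iterations=10)
print(f"mean balanced accuracy: {score:.3f}")
```

Balanced accuracy averages sensitivity and specificity, so it is not inflated when one class dominates the data set. Swapping `nearest_centroid` for a real random forest implementation would only require a function with the same `(X_train, y_train, X_test)` signature.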

Source journal
Experimental Biology and Medicine (Medicine: Research & Experimental)
CiteScore: 6.00
Self-citation rate: 0.00%
Articles published: 157
Review time: 1 month
About the journal: Experimental Biology and Medicine (EBM) is a global, peer-reviewed journal dedicated to the publication of multidisciplinary and interdisciplinary research in the biomedical sciences. EBM provides both research and review articles as well as meeting symposia and brief communications. Articles in EBM represent cutting edge research at the overlapping junctions of the biological, physical and engineering sciences that impact upon the health and welfare of the world's population. Topics covered in EBM include: Anatomy/Pathology; Biochemistry and Molecular Biology; Bioimaging; Biomedical Engineering; Bionanoscience; Cell and Developmental Biology; Endocrinology and Nutrition; Environmental Health/Biomarkers/Precision Medicine; Genomics, Proteomics, and Bioinformatics; Immunology/Microbiology/Virology; Mechanisms of Aging; Neuroscience; Pharmacology and Toxicology; Physiology; Stem Cell Biology; Structural Biology; Systems Biology and Microphysiological Systems; and Translational Research.
Latest articles from this journal
Experimental Biology and Medicine: a global journal with rigorous publication standards.
Collagen II enrichment through scAAV6-RNAi-mediated inhibition of matrix-metalloproteinases 3 and 13 in degenerative nucleus-pulposus cells degenerative disc disease and biological treatment strategies.
Ultrasound-assisted laser therapy for selective removal of melanoma cells.
Modulation of arterial intima stiffness by disturbed blood flow.
LM11A-31, a modulator of p75 neurotrophin receptor, suppresses HIV-1 replication and inflammatory response in macrophages