A Systematic Implementation of Machine Learning Algorithms for Multifaceted Antimicrobial Screening of Lead Compounds

ECA 2022 Pub Date : 2022-06-16 DOI:10.3390/eca2022-12751
Justin Shen, Davesh Valagolam

Abstract

This study employed machine learning algorithms to identify lead compounds that inhibit the antibiotic targets DNA gyrase and dihydrofolate reductase in Escherichia coli, and identified new, multifaceted antimicrobial compounds. This study used three separate datasets: 1) 326 Escherichia coli DNA gyrase inhibitors and 132 non-inhibitors, 2) 346 Escherichia coli dihydrofolate reductase inhibitors and 176 non-inhibitors, and 3) 18,387 non-specific drug-like chemicals. All datasets were then processed using ECFP-4 fingerprints and split into train, test, and validation datasets according to a 70-15-15 train-test-validation split. We explored the potential of six different classification algorithms, all optimized with Bayesian optimization. Our results indicate that the Gradient Boosting Classifier (GBC) performed the best at identifying a compound's efficacy towards DNA gyrase, with an accuracy, precision, recall, F1-score, and AUC of 0.91, 0.92, 0.86, 0.88, and 0.933, respectively. The Random Forest Classifier (RFC) performed optimally for identifying a compound's effectiveness towards dihydrofolate reductase, with an accuracy, precision, recall, F1-score, and AUC of 0.86, 0.83, 0.85, 0.84, and 0.944, respectively. As a result, the GBC and RFC were used to search for compounds that inhibited both DNA gyrase and dihydrofolate reductase. Out of 18,387 compounds, we identified 5 novel compounds that have a predicted probability greater than 95% to inhibit both DNA gyrase and dihydrofolate reductase, suggesting a high antimicrobial potential. The models evaluated in this study, particularly the GBC and RFC models, hold tremendous promise in computationally screening large libraries of compounds for antimicrobial potential.
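The data handling and screening steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the function names (`split_70_15_15`, `classification_metrics`, `screen_dual_inhibitors`) and the example probabilities are hypothetical, the abstract does not say whether the split was random or stratified (a seeded random shuffle is assumed here), and the fingerprinting and model-training steps are omitted. Only the 70-15-15 proportions, the reported metric types, and the 95% dual-target threshold come from the abstract.

```python
import random


def split_70_15_15(items, seed=0):
    """Shuffle items and split them into train/test/validation (70/15/15).

    Assumption: a plain seeded random shuffle; the abstract does not
    state how the split was actually drawn.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(0.70 * len(shuffled))
    n_test = int(0.15 * len(shuffled))
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder, roughly 15%
    return train, test, validation


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = inhibitor)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def screen_dual_inhibitors(gyrase_probs, dhfr_probs, threshold=0.95):
    """Return compound IDs whose predicted probability exceeds the
    threshold for BOTH targets (DNA gyrase and dihydrofolate reductase)."""
    return sorted(
        cid for cid, p in gyrase_probs.items()
        if p > threshold and dhfr_probs.get(cid, 0.0) > threshold
    )


if __name__ == "__main__":
    # 326 inhibitors + 132 non-inhibitors = 458 DNA gyrase compounds.
    train, test, validation = split_70_15_15(range(458))
    print(len(train), len(test), len(validation))

    # Hypothetical model outputs for three library compounds.
    gyrase = {"cmpd_a": 0.97, "cmpd_b": 0.99, "cmpd_c": 0.50}
    dhfr = {"cmpd_a": 0.96, "cmpd_b": 0.90, "cmpd_c": 0.99}
    print(screen_dual_inhibitors(gyrase, dhfr))
```

In this sketch a compound is retained only when both classifiers clear the threshold, which mirrors the abstract's criterion for the 5 reported hits; a compound that scores highly against just one target is dropped.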