Relevance of Machine Learning to Predict the Inhibitory Activity of Small Thiazole Chemicals on Estrogen Receptor.

Current Computer-Aided Drug Design, 19(1): 37-50. IF 1.5; Zone 4 (Medicine); JCR Q4 (CHEMISTRY, MEDICINAL). Pub Date: 2023-01-01. DOI: 10.2174/1573409919666221121141646

Jayaprakash Venkatesan, Thangavelu Saravanan, Karuppaiyan Ravindran, Thangavelu Prabha, Selvaraj Jubie, Jayapalan Sudeepan, M V N L Chaitanya, Thangavel Sivakumar


Citations: 0

Abstract

Relevance of Machine Learning to Predict the Inhibitory Activity of Small Thiazole Chemicals on Estrogen Receptor.

Background: Drug discovery relies on hybrid technologies to identify new chemical substances. One such strategy is QSAR modeling with an artificial intelligence system, which predicts in silico how chemical alterations can affect biological activity.

Aim: This study applies a machine learning approach, implemented as a new open-source Python data analysis script, to the discovery of anticancer leads by building a QSAR model from 53 thiazole derivatives.

Methods: A Python script was run on 53 small thiazole compounds in the Google Colaboratory interface. A total of 82 CDK molecular descriptors were downloaded from the "chemdes" web server and used in the study. After training the model, we checked its performance via cross-validation of the external test set.
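The Methods step can be sketched roughly as follows. This is a minimal illustration, not the authors' script: the descriptor matrix is random placeholder data with the stated shape (53 compounds, 82 descriptors), and the 80/20 split, the 5-fold cross-validation, the scaling step, and the use of five principal components (suggested by the PC1 to PC5 terms in the Results) are all assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Placeholder data: 53 compounds x 82 descriptors (random, NOT the paper's values).
X = rng.normal(size=(53, 82))
# Synthetic activity values so that the example has something to fit.
y = 4.8 + 0.1 * X[:, 0] + rng.normal(scale=0.05, size=53)

# Assumed 80/20 split into a training set and an external test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scale descriptors, reduce them to five principal components, then fit a linear model.
model = make_pipeline(StandardScaler(), PCA(n_components=5), LinearRegression())

# Assumed 5-fold cross-validation on the training set, scored by R^2.
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")

# Refit on the full training set and score on the held-out compounds.
model.fit(X_train, y_train)
test_r2 = model.score(X_test, y_test)

print("CV R^2 per fold:", np.round(cv_scores, 3))
print("External test R^2:", round(test_r2, 3))
```

Wrapping the scaling and PCA steps in the pipeline means they are refit inside each cross-validation fold, so the held-out compounds do not influence the transformation.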

Results: The ordinary least squares (OLS) regression for the generated QSAR model gave R² = 0.542, F = 8.773, adjusted R² (Q2) = 0.481, and std. error = 0.061. The regression coefficients (reg.coef_) were -0.00064 (PC1), -0.07753 (PC2), -0.09078 (PC3), -0.08986 (PC4), and 0.05044 (PC5), with an intercept (reg.intercept_) of 4.79279, obtained through the statsmodels formula module. Test set prediction was performed with multiple linear regression, support vector machine, and partial least squares regression models from the sklearn module, which gave model scores of 0.5424, 0.6422, and 0.6422, respectively.
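Read as a linear equation, the reported intercept and coefficients give one predicted activity value per compound from its five principal-component scores. The short sketch below evaluates that equation. The function name and the example PC scores are hypothetical, and the abstract does not state the units of the predicted activity.

```python
from typing import Sequence

# Values copied from the Results paragraph of the abstract.
INTERCEPT = 4.79279
COEFFICIENTS = (-0.00064, -0.07753, -0.09078, -0.08986, 0.05044)  # PC1..PC5


def predict_activity(pc_scores: Sequence[float]) -> float:
    """Evaluate intercept + sum(coef_i * PC_i) for one compound."""
    if len(pc_scores) != len(COEFFICIENTS):
        raise ValueError("expected five principal-component scores")
    return INTERCEPT + sum(c * s for c, s in zip(COEFFICIENTS, pc_scores))


# Hypothetical PC scores for a single compound (illustration only).
example_scores = [0.5, -1.0, 0.2, 0.0, 1.5]
print(predict_activity(example_scores))
```

With all five scores at zero the function returns the intercept, which makes a quick check that the coefficients were transcribed in the right order.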

Conclusion: The R² values (i.e., the model scores) obtained with this script from three different algorithms agreed well, with little difference between them, and the approach may be useful in designing similar thiazole derivatives as anticancer agents.

Source journal

Current Computer-Aided Drug Design (Medicine / Computer Science, Interdisciplinary Applications)

CiteScore: 3.70

Self-citation rate: 5.90%

Review time: >12 weeks

Journal description: Aims & Scope. Current Computer-Aided Drug Design aims to publish all the latest developments in drug design based on computational techniques. The field of computer-aided drug design has had extensive impact in the area of drug design. Current Computer-Aided Drug Design is an essential journal for all medicinal chemists who wish to be kept informed and up-to-date with all the latest and important developments in computer-aided methodologies and their applications in drug discovery. Each issue contains a series of timely, in-depth reviews, original research articles and letter articles written by leaders in the field, covering a range of computational techniques for drug design, screening, ADME studies, theoretical chemistry, computational chemistry, computer and molecular graphics, molecular modeling, protein engineering, drug design, expert systems, general structure-property relationships, molecular dynamics, chemical database development and usage, etc., providing excellent rationales for drug development.