Title: Unsupervised machine learning, QSAR modelling and web tool development for streamlining the lead identification process of antimalarial flavonoids
Authors: J H Zothantluanga, D Chetia, S Rajkhowa, A K Umar
Journal: SAR and QSAR in Environmental Research, vol. 34, no. 2, pp. 117-146
Publication date: 2023-02-01
Publication type: Journal Article
DOI: 10.1080/1062936X.2023.2169347 (https://doi.org/10.1080/1062936X.2023.2169347)
Impact factor: 2.3 · JCR: Q3 (Chemistry, Multidisciplinary) · Region 3 (Environmental Science and Ecology)
Open access: no
Citations: 2
Abstract
Identifying lead compounds with the traditional laboratory approach is expensive and time-consuming, and in silico techniques have emerged as a promising alternative. In this study, we aim to develop robust and predictive 2D-QSAR models that identify lead flavonoids by predicting their IC50 against Plasmodium falciparum. We applied unsupervised machine learning (principal component analysis followed by K-means clustering) and Pearson correlation analysis to select 9 molecular descriptors (MDs) for model building. We then ran multiple linear regression (MLR) 100 times with different combinations of MDs, and selected and validated the three best QSAR models. The developed models fulfil the five principles for QSAR models specified by the Organisation for Economic Co-operation and Development (OECD). The outcome of the study is a reliable and sustainable in silico method for predicting IC50 (mean ± SD) that can reduce the cost and time required to identify potential antimalarial lead compounds among flavonoids. We also developed a web tool (JazQSAR, https://etflin.com/news/4) that offers an easily accessible platform for the developed QSAR models.
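The descriptor-filtering and repeated-MLR steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the correlation threshold, the number of descriptors per model, or the ranking criterion, so `max_inter_corr`, `subset_size`, ranking by training R², and the synthetic data are all assumptions made here for illustration. The PCA and K-means steps are omitted, and a real QSAR workflow would validate candidate models on held-out compounds rather than on training R² alone.

```python
import numpy as np


def pearson_filter(X, y, max_inter_corr=0.9):
    """Greedy Pearson filter (assumed scheme): visit descriptors by decreasing
    |r| with the response and keep one only if its |r| with every descriptor
    already kept is below max_inter_corr."""
    n_desc = X.shape[1]
    r_y = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_desc)])
    r_xx = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in np.argsort(-r_y):
        if all(r_xx[j, k] < max_inter_corr for k in kept):
            kept.append(int(j))
    return kept


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, float(r2)


def search_models(X, y, candidates, n_runs=100, subset_size=3, top_k=3, seed=0):
    """Fit MLR on n_runs random descriptor subsets and return the top_k
    distinct subsets ranked by training R^2 (assumed ranking criterion)."""
    rng = np.random.default_rng(seed)
    results = {}
    for _ in range(n_runs):
        drawn = rng.choice(candidates, size=subset_size, replace=False)
        subset = tuple(sorted(drawn.tolist()))
        if subset not in results:
            _, r2 = fit_mlr(X[:, list(subset)], y)
            results[subset] = r2
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Synthetic stand-in data: 60 "compounds" x 12 "descriptors" (hypothetical).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(60, 12))
X[:, 1] = X[:, 0] + 0.01 * data_rng.normal(size=60)  # near-duplicate of column 0
y = 2.0 * X[:, 0] - 1.5 * X[:, 4] + 0.5 * X[:, 7] + 0.1 * data_rng.normal(size=60)

candidates = pearson_filter(X, y)
best = search_models(X, y, candidates)
for subset, r2 in best:
    print(subset, round(r2, 3))
```

Fixing the random seed keeps the 100-run search reproducible, and storing results by sorted subset means a combination that is drawn twice is fitted only once.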
About the journal:
SAR and QSAR in Environmental Research is an international journal welcoming papers on the fundamental and practical aspects of the structure-activity and structure-property relationships in the fields of environmental science, agrochemistry, toxicology, pharmacology and applied chemistry. A unique aspect of the journal is the focus on emerging techniques for the building of SAR and QSAR models in these widely varying fields. The scope of the journal includes, but is not limited to, the topics of topological and physicochemical descriptors, mathematical, statistical and graphical methods for data analysis, computer methods and programs, original applications and comparative studies. In addition to primary scientific papers, the journal contains reviews of books and software and news of conferences. Special issues on topics of current and widespread interest to the SAR and QSAR community will be published from time to time.