Konda Mani Saravanan , Jiang-Fan Wan , Liujiang Dai , Jiajun Zhang , John Z.H. Zhang , Haiping Zhang
{"title":"A deep learning based multi-model approach for predicting drug-like chemical compound’s toxicity","authors":"Konda Mani Saravanan , Jiang-Fan Wan , Liujiang Dai , Jiajun Zhang , John Z.H. Zhang , Haiping Zhang","doi":"10.1016/j.ymeth.2024.04.020","DOIUrl":null,"url":null,"abstract":"<div><p>Ensuring the safety and efficacy of chemical compounds is crucial in small-molecule drug development. In the later stages of drug development, toxic compounds pose a significant challenge, losing valuable resources and time. Early and accurate prediction of compound toxicity using deep learning models offers a promising solution to mitigate these risks during drug discovery. In this study, we present the development of several deep-learning models aimed at evaluating different types of compound toxicity, including acute toxicity, carcinogenicity, hERG_cardiotoxicity (the human ether-a-go-go related gene caused cardiotoxicity), hepatotoxicity, and mutagenicity. To address the inherent variations in data size, label type, and distribution across different types of toxicity, we employed diverse training strategies. Our first approach involved utilizing a graph convolutional network (GCN) regression model to predict acute toxicity, which achieved notable performance with Pearson R 0.76, 0.74, and 0.65 for intraperitoneal, intravenous, and oral administration routes, respectively. Furthermore, we trained multiple GCN binary classification models, each tailored to a specific type of toxicity. These models exhibited high area under the curve (AUC) scores, with an impressive AUC of 0.69, 0.77, 0.88, and 0.79 for predicting carcinogenicity, hERG_cardiotoxicity, mutagenicity, and hepatotoxicity, respectively. Additionally, we have used the approved drug dataset to determine the appropriate threshold value for the prediction score in model usage. We integrated these models into a virtual screening pipeline to assess their effectiveness in identifying potential low-toxicity drug candidates. Our findings indicate that this deep learning approach has the potential to significantly reduce the cost and risk associated with drug development by expediting the selection of compounds with low toxicity profiles. Therefore, the models developed in this study hold promise as critical tools for early drug candidate screening and selection.</p></div>","PeriodicalId":390,"journal":{"name":"Methods","volume":"226 ","pages":"Pages 164-175"},"PeriodicalIF":4.2000,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Methods","FirstCategoryId":"99","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1046202324001105","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
Ensuring the safety and efficacy of chemical compounds is crucial in small-molecule drug development. In the later stages of drug development, toxic compounds pose a significant challenge, losing valuable resources and time. Early and accurate prediction of compound toxicity using deep learning models offers a promising solution to mitigate these risks during drug discovery. In this study, we present the development of several deep-learning models aimed at evaluating different types of compound toxicity, including acute toxicity, carcinogenicity, hERG_cardiotoxicity (the human ether-a-go-go related gene caused cardiotoxicity), hepatotoxicity, and mutagenicity. To address the inherent variations in data size, label type, and distribution across different types of toxicity, we employed diverse training strategies. Our first approach involved utilizing a graph convolutional network (GCN) regression model to predict acute toxicity, which achieved notable performance with Pearson R 0.76, 0.74, and 0.65 for intraperitoneal, intravenous, and oral administration routes, respectively. Furthermore, we trained multiple GCN binary classification models, each tailored to a specific type of toxicity. These models exhibited high area under the curve (AUC) scores, with an impressive AUC of 0.69, 0.77, 0.88, and 0.79 for predicting carcinogenicity, hERG_cardiotoxicity, mutagenicity, and hepatotoxicity, respectively. Additionally, we have used the approved drug dataset to determine the appropriate threshold value for the prediction score in model usage. We integrated these models into a virtual screening pipeline to assess their effectiveness in identifying potential low-toxicity drug candidates. Our findings indicate that this deep learning approach has the potential to significantly reduce the cost and risk associated with drug development by expediting the selection of compounds with low toxicity profiles. Therefore, the models developed in this study hold promise as critical tools for early drug candidate screening and selection.
期刊介绍:
Methods focuses on rapidly developing techniques in the experimental biological and medical sciences.
Each topical issue, organized by a guest editor who is an expert in the area covered, consists solely of invited quality articles by specialist authors, many of them reviews. Issues are devoted to specific technical approaches with emphasis on clear detailed descriptions of protocols that allow them to be reproduced easily. The background information provided enables researchers to understand the principles underlying the methods; other helpful sections include comparisons of alternative methods giving the advantages and disadvantages of particular methods, guidance on avoiding potential pitfalls, and suggestions for troubleshooting.