Lucas N. Alberca , Denis N. Prada Gori , Maximiliano J. Fallico , Alexandre V. Fassio , Alan Talevi , Carolina L. Bellera
LIDEB's Useful Decoys (LUDe): A freely available decoy-generation tool. Benchmarking and scope
Artificial Intelligence in the Life Sciences, volume 7, Article 100129. Published 2025-02-07. DOI: 10.1016/j.ailsci.2025.100129
https://www.sciencedirect.com/science/article/pii/S2667318525000054
Abstract
In chemoinformatics, and particularly when developing models for virtual screening campaigns, it is essential to run retrospective virtual screening experiments that evaluate model performance under conditions similar to a real campaign, that is, the ability to recover a small number of active compounds dispersed among a much larger number of compounds lacking the desired activity. However, such retrospective experiments are often limited by the relative scarcity of known inactive compounds for the pharmacological target of interest. In these cases, tools that automatically generate decoys (putative inactive compounds) are of great value. Their basic goal is to generate decoys that are similar enough to the known active compounds to challenge the models, yet different enough that the decoys are unlikely to modulate the molecular target of interest.
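As a minimal sketch of that trade-off (not LUDe's actual algorithm, which this abstract does not detail), the Python below accepts a candidate decoy only if a physicochemical property is close to that of some active while its fingerprint stays dissimilar to every known active. The fingerprints are hypothetical sets of on-bit indices, and the function names, the molecular-weight tolerance and the Tanimoto cutoff are illustrative assumptions.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def is_acceptable_decoy(candidate, actives, mw_tolerance=25.0, max_similarity=0.5):
    """Return True if the candidate is property-matched to at least one active
    but topologically dissimilar to all actives.

    The thresholds are illustrative assumptions, not values taken from LUDe.
    """
    property_matched = any(
        abs(candidate["mw"] - active["mw"]) <= mw_tolerance for active in actives
    )
    dissimilar = all(
        tanimoto(candidate["fp"], active["fp"]) <= max_similarity for active in actives
    )
    return property_matched and dissimilar


# Hypothetical toy data: molecular weight plus a tiny set-based fingerprint
actives = [
    {"mw": 300.0, "fp": {1, 2, 3, 4}},
    {"mw": 350.0, "fp": {3, 4, 5, 6}},
]
candidates = {
    "similar_weight_new_scaffold": {"mw": 310.0, "fp": {7, 8, 9, 1}},
    "near_copy_of_active": {"mw": 305.0, "fp": {1, 2, 3, 9}},
    "very_different_weight": {"mw": 600.0, "fp": {10, 11, 12}},
}
decoys = [name for name, mol in candidates.items() if is_acceptable_decoy(mol, actives)]
```

In this toy set only the first candidate passes: the second shares too many fingerprint bits with an active, and the third is too far from any active in molecular weight. A real workflow would match several properties at once and compute fingerprints with a cheminformatics toolkit.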
In this article, we report the latest version of our open-source decoy generation tool, LUDe, which is inspired by the well-known DUD-E but designed to reduce the probability of generating decoys that are topologically similar to known active compounds. We benchmarked LUDe against DUD-E across 102 pharmacological targets, using the DOE score and the Doppelganger score as comparison criteria. LUDe decoys obtained better DOE scores for most of the targets, indicating a lower risk of artificial enrichment. The mean Doppelganger score, in contrast, was similar for LUDe and DUD-E decoys, with a slight improvement for LUDe decoys for most of the targets. Simulation experiments were performed to verify whether the generated decoys are suitable for validating ligand-based models. Our results suggest that LUDe decoys can be used to validate and compare machine learning ligand-based screening approaches. Importantly, LUDe can be run locally, independently of external server availability, and is thus suitable for obtaining decoys for large datasets. It is available as a Web App (at https://lideb.biol.unlp.edu.ar/?page_id=1076) and as Python code (at https://github.com/LIDeB/LUDe.v1.0).
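To illustrate how a set of actives and decoys can be used to validate or compare screening models, the sketch below computes an enrichment factor from model scores. The abstract does not state which performance metrics were used in the simulation experiments; the enrichment factor is shown only as one commonly used option, and the function name and toy data are hypothetical.

```python
def enrichment_factor(scores, labels, top_fraction=0.1):
    """Enrichment factor: active rate in the top-ranked fraction divided by
    the active rate in the whole library (label 1 = active, 0 = decoy)."""
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and of equal length")
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("at least one active is required")
    # Rank compounds from highest to lowest model score
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    actives_in_top = sum(label for _, label in ranked[:n_top])
    return (actives_in_top / n_top) / (total_actives / len(ranked))


# Hypothetical retrospective screen: 2 actives hidden among 18 decoys
labels = [1, 1] + [0] * 18
scores = [0.95, 0.40] + [0.90] + [0.30 - 0.01 * i for i in range(17)]
ef_10 = enrichment_factor(scores, labels, top_fraction=0.1)
```

Here one of the two actives lands in the top two ranks, so the enrichment factor at 10% is 5: the top of the ranked list is five times richer in actives than the library as a whole. Comparing this value for two models scored on the same actives and decoys gives a simple basis for ranking them.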
Artificial Intelligence in the Life Sciences
Subject areas: Pharmacology, Biochemistry, Genetics and Molecular Biology (General), Computer Science Applications, Health Informatics, Drug Discovery, Veterinary Science and Veterinary Medicine (General)