DrugDiff: small molecule diffusion model with flexible guidance towards molecular properties
Marie Oestreich, Erinc Merdivan, Michael Lee, Joachim L. Schultze, Marie Piraud, Matthias Becker
Journal of Cheminformatics, 17 (1). Published 2025-02-25. DOI: 10.1186/s13321-025-00965-x
https://link.springer.com/article/10.1186/s13321-025-00965-x
Abstract
With the cost-to-yield ratio of drug development becoming increasingly unfavourable, recent work has explored machine learning to accelerate the early stages of the development process. Given the current success of deep generative models across domains, we investigated their application to the property-based proposal of new small molecules for drug development. Specifically, we trained a latent diffusion model, DrugDiff, paired with predictor guidance to generate novel compounds with a variety of desired molecular properties. The architecture was designed to be highly flexible and easily adaptable to future scenarios. Our experiments showed successful generation of unique, diverse and novel small molecules with targeted properties. The code is available at https://github.com/MarieOestreich/DrugDiff.
This work expands the use of generative modelling in drug development from previously introduced models for proteins and RNA to small molecules. Small molecules make up the majority of drugs but are difficult to model because of their elaborate chemical rules, so this work tackles a new level of difficulty compared with sequence-based generation of molecules such as proteins and RNA. Additionally, the demonstrated framework is highly flexible: molecular properties can be added or removed without retraining the model, which makes it adaptable to diverse research settings. It shows compelling performance for a wide variety of targeted molecular properties.
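The idea of predictor guidance with properties that can be added or removed without retraining can be illustrated with a small toy sketch. This is not the DrugDiff implementation: the denoising step, the two property predictors, the guidance scales and the finite-difference gradient (a stand-in for automatic differentiation) are all hypothetical and chosen only to keep the example self-contained. It shows a fixed "generative" update being steered by whichever predictors are passed in at sampling time.

```python
import random
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Predictor = Callable[[Vector], float]  # maps a latent vector to a scalar property estimate


def numerical_grad(f: Predictor, z: Vector, eps: float = 1e-4) -> Vector:
    """Central finite-difference gradient of f at z (stand-in for autograd)."""
    grad = []
    for i in range(len(z)):
        up = z[:i] + [z[i] + eps] + z[i + 1:]
        down = z[:i] + [z[i] - eps] + z[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def toy_denoise_step(z: Vector, t: int) -> Vector:
    """Hypothetical stand-in for one reverse-diffusion update: shrinks z toward 0."""
    return [0.9 * v for v in z]


def guided_sample(
    guides: Dict[str, Tuple[Predictor, float]],
    dim: int = 4,
    steps: int = 50,
    seed: int = 0,
) -> Vector:
    """Run the toy reverse process, nudging z along each predictor's gradient.

    guides maps a property name to (predictor, scale). A positive scale pushes
    the predicted property up, a negative scale pushes it down. The denoiser
    itself is never changed, so properties can be added or dropped freely.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(steps)):
        z = toy_denoise_step(z, t)
        for predictor, scale in guides.values():
            g = numerical_grad(predictor, z)
            z = [v + scale * gi for v, gi in zip(z, g)]
    return z


# Two toy "property predictors" defined directly on the latent space.
def prop_a(z: Vector) -> float:
    return z[0]  # increases linearly with the first latent coordinate


def prop_b(z: Vector) -> float:
    return -(z[1] - 2.0) ** 2  # highest when the second coordinate equals 2


unguided = guided_sample({})
guided = guided_sample({"a": (prop_a, 0.1)})
both = guided_sample({"a": (prop_a, 0.1), "b": (prop_b, 0.05)})

print("prop_a:", prop_a(unguided), prop_a(guided), prop_a(both))
print("prop_b:", prop_b(unguided), prop_b(guided), prop_b(both))
```

In this toy setting, adding the second predictor changes only the property it targets, and no part of the denoising step is retrained. A real system would replace the finite-difference gradient with backpropagation through trained property predictors.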
About the journal:
Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling.
Coverage includes, but is not limited to:
chemical information systems, software and databases, and molecular modelling,
chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases,
computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.