Multi-granularity Score-based Generative Framework Enables Efficient Inverse Design of Complex Organics
Authors: Zijun Chen, Yu Wang, Liuzhenghao Lv, Hao Li, Zongying Lin, Li Yuan, Yonghong Tian
Identifier: arxiv-2409.07912
Venue: arXiv - CS - Computational Engineering, Finance, and Science
Published: 2024-09-12
Abstract
Efficiently searching enormous chemical libraries to design targeted
molecules is crucial for accelerating drug discovery, organic chemistry, and
optoelectronic materials research. Generative models can now produce novel
drug-like molecules, but in more realistic scenarios two obstacles still hinder
the generation of complex organics: the complexity of functional groups (e.g.,
pyrene, acenaphthylene, and bridged-ring systems) and extensive molecular
scaffolds. Traditionally, the former demands an extra learning
process, e.g., molecular pre-training, and the latter requires expensive
computational resources. To address these challenges, we propose OrgMol-Design,
a multi-granularity framework for efficiently designing complex organics.
OrgMol-Design combines a score-based generative model with a fragment prior,
which generates diverse coarse-grained scaffolds, and a chemical-rule-aware
scoring model, which designs the fine-grained molecular structure. This
circumvents the difficulty of learning intricate substructures without losing
the connection details among fragments. Our approach achieves state-of-the-art
performance on four
more challenging, real-world benchmarks covering broader scientific domains,
outperforming advanced molecular generative models. Additionally, it delivers a
substantial speedup and GPU memory reduction compared to diffusion-based
graph models. Our results also demonstrate the importance of leveraging a
fragment prior for a generalized molecular inverse design model.
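
The coarse-to-fine split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the fragment table, the valence counts, the function names (`sample_scaffold`, `rule_score`, `assemble`), and the scoring rules are all hypothetical. In particular, stage 1 below uses uniform random sampling as a stand-in for a trained score-based model. The sketch only shows the idea of first choosing a fragment-level scaffold and then resolving each fragment-level edge to a concrete atom pair under simple chemical rules.

```python
import random

# Hypothetical fragment vocabulary: name -> list of (atom symbol, free valence)
# for each attachable atom. Values are illustrative, not taken from the paper.
FRAGMENTS = {
    "benzene": [("C", 1)] * 6,
    "pyrene": [("C", 1)] * 10,
    "amine": [("N", 2)],
    "carbonyl": [("C", 2)],
}


def sample_scaffold(n_fragments, rng):
    """Stage 1 stand-in: pick fragments and a random tree connecting them.

    A real system would draw this coarse graph from a trained generative
    model; uniform sampling is used here only to keep the sketch runnable.
    """
    nodes = [rng.choice(sorted(FRAGMENTS)) for _ in range(n_fragments)]
    # Each new fragment attaches to one earlier fragment, giving a tree.
    edges = [(rng.randrange(i), i) for i in range(1, n_fragments)]
    return nodes, edges


def rule_score(atom_a, atom_b):
    """Stage 2 stand-in: score one candidate atom-atom bond with toy rules."""
    (sym_a, free_a), (sym_b, free_b) = atom_a, atom_b
    if free_a < 1 or free_b < 1:
        return float("-inf")  # no free valence left: bond is not allowed
    score = 1.0
    if sym_a == sym_b == "N":
        score -= 0.5  # toy penalty for N-N bonds (assumption)
    return score


def assemble(nodes, edges):
    """Resolve every fragment-level edge to a concrete atom pair.

    Returns a list of ((fragment, atom), (fragment, atom)) bonds, or None
    if some edge has no attachment that satisfies the rules.
    """
    # Mutable per-instance copy so free valences can be decremented.
    atoms = [[list(a) for a in FRAGMENTS[name]] for name in nodes]
    bonds = []
    for u, v in edges:
        candidates = [
            (rule_score(tuple(a), tuple(b)), i, j)
            for i, a in enumerate(atoms[u])
            for j, b in enumerate(atoms[v])
        ]
        # Highest score wins; ties go to the lowest atom indices.
        best, i, j = max(candidates, key=lambda c: (c[0], -c[1], -c[2]))
        if best == float("-inf"):
            return None
        atoms[u][i][1] -= 1
        atoms[v][j][1] -= 1
        bonds.append(((u, i), (v, j)))
    return bonds
```

For example, `assemble(*sample_scaffold(4, random.Random(0)))` returns either three atom-level bonds or `None` when the sampled scaffold cannot be wired up under the toy valence rule. Keeping the fragment choice separate from the attachment choice means the coarse model never has to learn the internal structure of a fragment such as pyrene, which is the motivation the abstract gives for the multi-granularity design.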