GEP-DNN4Mol: automatic chemical molecular design based on deep neural networks and gene expression programming
Authors: Wen Zheng, Zhongji Li, Yuanyuan Chen, Wenjia Liao, Lei Deng, Hao Zhang, Yanmei Lin, Yuzhong Peng
Journal: Health Information Science and Systems, 13(1), article 31
Publication date: 2025-03-24
Publication type: Journal Article (eCollection)
DOI: https://doi.org/10.1007/s13755-025-00344-8
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11933650/pdf/
Impact factor: 3.4
JCR: Q1 (Medical Informatics)
Region category: Medicine (region 3)
Citations: 0
Abstract
The inverse design of molecules has attracted widespread attention in the field of chemical molecular design. However, existing methods fail to address the diversity of the generated molecules. In this work, we propose a molecule generation method called GEP-DNN4Mol that generates molecules with good diversity and desired properties while exploring the vast chemical space. GEP-DNN4Mol uses a special gene expression programming algorithm as the generator of molecules, uses a deep neural network as an evaluator that extracts the molecular features of the generated molecules and guides the update of the generator, and is coupled with the SMILES and SELFIES molecular representations. The experimental results show that the proposed approach outperforms state-of-the-art methods in both the performance of the generated molecules and the efficiency of chemical-space exploration. The molecules generated by GEP-DNN4Mol show total validity, high novelty, and good diversity.
Supplementary information: The online version contains supplementary material available at 10.1007/s13755-025-00344-8.
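The generator-evaluator loop described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: a plain genetic algorithm over fixed-length token strings stands in for gene expression programming, a hand-written `placeholder_score` stands in for the deep neural network evaluator, and the three-token alphabet, gene length, and target heteroatom fraction are hypothetical.

```python
import random

# Hypothetical toy token set; linear strings of these atoms read as simple SMILES chains.
ALPHABET = ["C", "N", "O"]
GENE_LENGTH = 8


def random_gene(rng):
    """Create one random fixed-length gene (a list of tokens)."""
    return [rng.choice(ALPHABET) for _ in range(GENE_LENGTH)]


def decode(gene):
    """Express a gene as a linear SMILES-like string."""
    return "".join(gene)


def placeholder_score(smiles):
    """Stand-in for the DNN evaluator (assumption, not a neural network).

    Rewards strings whose fraction of N/O tokens is close to 0.25.
    """
    hetero = sum(ch in "NO" for ch in smiles) / len(smiles)
    return 1.0 - abs(hetero - 0.25)


def mutate(gene, rng, rate=0.2):
    """Resample each token independently with probability `rate`."""
    return [rng.choice(ALPHABET) if rng.random() < rate else tok for tok in gene]


def crossover(a, b, rng):
    """One-point crossover between two parent genes."""
    cut = rng.randrange(1, GENE_LENGTH)
    return a[:cut] + b[cut:]


def evolve(generations=30, pop_size=20, seed=0):
    """Run the generate-evaluate-update loop and return the final strings."""
    rng = random.Random(seed)
    population = [random_gene(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluator ranks the generated candidates.
        ranked = sorted(population,
                        key=lambda g: placeholder_score(decode(g)),
                        reverse=True)
        # The top half survives unchanged and seeds the next generation.
        parents = ranked[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return [decode(g) for g in population]


def diversity(smiles_list):
    """Fraction of distinct strings, a crude proxy for diversity."""
    return len(set(smiles_list)) / len(smiles_list)


if __name__ == "__main__":
    final = evolve()
    print(final[:5], diversity(final))
```

Because the top half of each generation is carried over unchanged, the best score never decreases between generations. A realistic setup would replace `placeholder_score` with a trained property predictor and check validity with a cheminformatics toolkit, both of which are outside this sketch.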
About the journal:
Health Information Science and Systems is a multidisciplinary journal that integrates artificial intelligence, computer science, and information technology with health science and services. It covers information science research on the modeling, design, development, integration, and management of health information systems, smart health, artificial intelligence in medicine, computer-aided diagnosis, and medical expert systems. The scope includes: (i) smart health, artificial intelligence in medicine, computer-aided diagnosis, medical image processing, and medical expert systems; (ii) medical big data and medical, health, and biomedical information resources such as patient medical records, devices and equipment, and software and tools to capture, store, retrieve, process, analyze, and optimize the use of information in the health domain; (iii) data management, data mining, and knowledge discovery, all of which play a key role in decision making, management of public health, examination of standards, and privacy and security issues; (iv) development of new architectures and applications for health information systems.