MolRWKV: Conditional Molecular Generation Model Using Local Enhancement and Graph Enhancement
Xihan Li, Kuanping Gong, Yongquan Jiang, Yan Yang, Tianrui Li
Journal of Computational Chemistry, volume 46, issue 10. Published 2025-04-10. DOI: 10.1002/jcc.70100
https://onlinelibrary.wiley.com/doi/10.1002/jcc.70100
Abstract
Conditional molecule generation techniques provide molecules that satisfy specified conditions for practical applications. Because a SMILES string is a sequence of tokens, it can be processed by a language model that builds the complete sequence step by step, generating the next token in a loop. The efficient parallelism and efficient inference of RWKV indicate its potential for success in natural language processing. We therefore propose MolRWKV, a de novo conditional molecule generation model that integrates a CNN and a GCN into the RWKV model, combining the CNN's ability to extract local information from SMILES sequences with the GCN's ability to capture topological structure information from molecular graphs. Experiments show that MolRWKV achieves results comparable to existing models in both unconditional and conditional generation, improves the accuracy of conditional generation, generates diverse molecules while retaining scaffold information, and generates molecules with affinity for specific target proteins.
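The token-by-token generation loop described in the abstract can be illustrated with a minimal sketch. This is not the MolRWKV model: the vocabulary, the `TOY_NEXT_TOKEN_PROBS` table, and the function names are hypothetical stand-ins for a trained network, and conditioning on properties, scaffolds, or target proteins is omitted entirely. The sketch only shows the loop structure: ask the model for a next-token distribution, sample a token, append it, and stop at an end-of-sequence token.

```python
import random

# Hypothetical toy setup, NOT the MolRWKV model: a hand-written table
# stands in for a trained language model's next-token distribution.
BOS, EOS = "<bos>", "<eos>"
TOY_NEXT_TOKEN_PROBS = {
    BOS: {"C": 0.7, "N": 0.2, "O": 0.1},
    "C": {"C": 0.5, "O": 0.2, "N": 0.1, EOS: 0.2},
    "N": {"C": 0.7, EOS: 0.3},
    "O": {"C": 0.6, EOS: 0.4},
}


def next_token_distribution(prefix):
    """Stand-in for the model: here it looks only at the last token."""
    return TOY_NEXT_TOKEN_PROBS[prefix[-1]]


def generate_smiles(max_len=20, seed=None):
    """Autoregressively sample SMILES tokens until EOS or max_len."""
    rng = random.Random(seed)
    tokens = [BOS]
    for _ in range(max_len):
        probs = next_token_distribution(tokens)
        choices, weights = zip(*probs.items())
        token = rng.choices(choices, weights=weights, k=1)[0]
        if token == EOS:
            break  # the model signalled that the sequence is complete
        tokens.append(token)
    return "".join(tokens[1:])  # drop the BOS marker


if __name__ == "__main__":
    for i in range(3):
        print(generate_smiles(seed=i))
```

In a real model the lookup table would be replaced by a forward pass over the whole prefix, and any conditions would be supplied as additional inputs to that forward pass.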
About the journal:
This distinguished journal publishes articles concerned with all aspects of computational chemistry: analytical, biological, inorganic, organic, physical, and materials. The Journal of Computational Chemistry presents original research, contemporary developments in theory and methodology, and state-of-the-art applications. Computational areas that are featured in the journal include ab initio and semiempirical quantum mechanics, density functional theory, molecular mechanics, molecular dynamics, statistical mechanics, cheminformatics, biomolecular structure prediction, molecular design, and bioinformatics.