Single-step retrosynthesis prediction via multitask graph representation learning

IF 14.7 | Zone 1 | Multidisciplinary journal | Q1 MULTIDISCIPLINARY SCIENCES | Nature Communications | Pub Date: 2025-01-18 | DOI: 10.1038/s41467-025-56062-y
Peng-Cheng Zhao, Xue-Xin Wei, Qiong Wang, Qi-Hao Wang, Jia-Ning Li, Jie Shang, Cheng Lu, Jian-Yu Shi
Citations: 0

Abstract

Inferring appropriate synthesis reaction (i.e., retrosynthesis) routes for newly designed molecules is vital. Recently, computational methods have produced promising single-step retrosynthesis predictions. However, template-based methods are limited by the known synthesis templates; template-free methods are weakly interpretable; and semi-template-based methods make poor use of the associations between chemical entities. To address these issues, this paper leverages the intra-associations between synthons, the inter-associations between synthons and leaving groups (LGs), and the intra-associations between LGs. It develops a multitask graph representation learning model for single-step retrosynthesis prediction (Retro-MTGR) that solves reaction centre deduction and LG identification simultaneously. A comparison with 16 state-of-the-art methods first demonstrates the superiority of Retro-MTGR. Then, its robustness and scalability and the contributions of its crucial components are validated. More importantly, it can determine whether a bond can be a reaction centre and which LGs are appropriate for a given synthon. The answers reflect underlying chemical synthesis rules, especially opposite electrical properties between chemical entities (e.g., reaction sites, synthons, and LGs). Finally, case studies demonstrate that the retrosynthesis routes inferred by Retro-MTGR are promising for single-step synthesis reactions. The code and data of this study are freely available at https://doi.org/10.5281/zenodo.14346324.
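
The two subtasks named in the abstract, reaction centre deduction and LG identification, can be illustrated with a minimal Python sketch. This is not the Retro-MTGR model: the toy molecule (N-methylacetamide, heavy atoms only), the hard-coded scores, and the helper names `pick_reaction_centre`, `split_into_synthons`, and `pick_leaving_group` are all assumptions made for illustration. In a learned model the scores would come from the trained network rather than from a lookup table.

```python
from collections import defaultdict

# Toy product: N-methylacetamide, CH3-C(=O)-NH-CH3 (heavy atoms only).
# Atom indices and bonds are hand-written for illustration.
ATOMS = {0: "C", 1: "C", 2: "O", 3: "N", 4: "C"}
BONDS = [(0, 1), (1, 2), (1, 3), (3, 4)]

# Hypothetical scores standing in for model outputs (assumed values).
BOND_SCORES = {(0, 1): 0.05, (1, 2): 0.01, (1, 3): 0.90, (3, 4): 0.04}
LG_SCORES = {
    1: {"Cl": 0.7, "OH": 0.2, "H": 0.1},  # open site on the carbonyl carbon
    3: {"H": 0.8, "Cl": 0.1, "OH": 0.1},  # open site on the nitrogen
}


def pick_reaction_centre(bond_scores):
    """Task 1 stand-in: choose the bond with the highest score."""
    return max(bond_scores, key=bond_scores.get)


def split_into_synthons(bonds, centre, atoms):
    """Remove the reaction-centre bond and return the connected components."""
    adj = defaultdict(set)
    for a, b in bonds:
        if {a, b} != set(centre):
            adj[a].add(b)
            adj[b].add(a)
    seen, synthons = set(), []
    for start in sorted(atoms):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first search from the unvisited atom
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        synthons.append(frozenset(comp))
    return synthons


def pick_leaving_group(synthon, centre, lg_scores):
    """Task 2 stand-in: choose the best-scoring LG for the synthon's open site."""
    (site,) = synthon & set(centre)  # the one centre atom inside this synthon
    return site, max(lg_scores[site], key=lg_scores[site].get)


centre = pick_reaction_centre(BOND_SCORES)
synthons = split_into_synthons(BONDS, centre, ATOMS)
completions = [pick_leaving_group(s, centre, LG_SCORES) for s in synthons]

print("reaction centre:", centre)
for synthon, (site, lg) in zip(synthons, completions):
    print("synthon atoms", sorted(synthon), "-> attach", lg, "at atom", site)
```

With these assumed scores the sketch breaks the amide C-N bond and completes the two synthons with Cl and H, which corresponds to acetyl chloride and methylamine as reactants. The sketch treats the two tasks as independent lookups; the abstract's point is that Retro-MTGR learns them jointly and exploits the associations between synthons and LGs.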

Source journal
Nature Communications
Nature Communications Biological Science Disciplines-
CiteScore: 24.90
Self-citation rate: 2.40%
Articles published: 6928
Review time: 3.7 months
Journal description: Nature Communications, an open-access journal, publishes high-quality research spanning all areas of the natural sciences. Papers featured in the journal showcase significant advances relevant to specialists in each respective field. With a 2-year impact factor of 16.6 (2022) and a median time of 8 days from submission to the first editorial decision, Nature Communications is committed to rapid dissemination of research findings. As a multidisciplinary journal, it welcomes contributions from biological, health, physical, chemical, Earth, social, mathematical, applied, and engineering sciences, aiming to highlight important breakthroughs within each domain.