BatGPT-Chem: A Foundation Large Model For Retrosynthesis Prediction

Yifei Yang, Runhan Shi, Zuchao Li, Shu Jiang, Bao-Liang Lu, Yang Yang, Hai Zhao
arXiv - CS - Computational Engineering, Finance, and Science
Published: 2024-08-19
DOI: arxiv-2408.10285 (https://doi.org/arxiv-2408.10285)
Citations: 0

Abstract

Retrosynthesis analysis is pivotal yet challenging in drug discovery and organic chemistry. Despite the proliferation of computational tools over the past decade, AI-based systems often fall short in generalizing across diverse reaction types and exploring alternative synthetic pathways. This paper presents BatGPT-Chem, a large language model with 15 billion parameters, tailored for enhanced retrosynthesis prediction. Integrating chemical tasks via a unified framework of natural language and SMILES notation, this approach synthesizes extensive instructional data from an expansive chemical database. Employing both autoregressive and bidirectional training techniques across over one hundred million instances, BatGPT-Chem captures a broad spectrum of chemical knowledge, enabling precise prediction of reaction conditions and exhibiting strong zero-shot capabilities. Superior to existing AI methods, our model demonstrates significant advancements in generating effective strategies for complex molecules, as validated by stringent benchmark tests. BatGPT-Chem not only boosts the efficiency and creativity of retrosynthetic analysis but also establishes a new standard for computational tools in synthetic design. This development empowers chemists to adeptly address the synthesis of novel compounds, potentially expediting the innovation cycle in drug manufacturing and materials science. We release our trial platform at https://www.batgpt.net/dapp/chem.
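
The abstract does not say what the instruction data looks like, so the following is only an illustrative sketch of how a natural-language prompt and SMILES strings might be paired for a retrosynthesis task. It relies on the standard reaction SMILES layout `reactants>agents>products`, with `.` separating components. The `InstructionPair` class, both function names, the prompt wording, and the esterification example are assumptions made for illustration, not the authors' format.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class InstructionPair:
    """Hypothetical container for one prompt/response training instance."""
    prompt: str
    response: str


def split_reaction_smiles(rxn: str) -> tuple[list[str], list[str], list[str]]:
    """Split 'reactants>agents>products' into three lists of SMILES.

    Each field is split on '.', so multi-component species such as salts
    come back as separate entries.
    """
    parts = rxn.split(">")
    if len(parts) != 3:
        raise ValueError(f"expected exactly two '>' separators, got: {rxn!r}")
    reactants, agents, products = (p.split(".") if p else [] for p in parts)
    return reactants, agents, products


def make_retrosynthesis_pair(rxn: str) -> InstructionPair:
    """Turn a forward reaction into a 'product -> reactants' instruction."""
    reactants, _agents, products = split_reaction_smiles(rxn)
    product_text = ".".join(products)
    # The prompt wording below is an assumption, not taken from the paper.
    prompt = f"Propose reactants that could yield the product {product_text}."
    return InstructionPair(prompt=prompt, response=".".join(reactants))


# Acetic acid + ethanol, sulfuric acid as agent, giving ethyl acetate.
pair = make_retrosynthesis_pair("CC(=O)O.CCO>OS(=O)(=O)O>CC(=O)OCC")
print(pair.prompt)
print(pair.response)
```

The agents field is parsed but left unused here. A companion instruction asking for reaction conditions could draw on it in the same way.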