An automatic end-to-end chemical synthesis development platform powered by large language models

IF 14.7 1区 综合性期刊 Q1 MULTIDISCIPLINARY SCIENCES Nature Communications Pub Date : 2024-11-23 DOI:10.1038/s41467-024-54457-x
Yixiang Ruan, Chenyin Lu, Ning Xu, Yuchen He, Yixin Chen, Jian Zhang, Jun Xuan, Jianzhang Pan, Qun Fang, Hanyu Gao, Xiaodong Shen, Ning Ye, Qiang Zhang, Yiming Mo
{"title":"An automatic end-to-end chemical synthesis development platform powered by large language models","authors":"Yixiang Ruan, Chenyin Lu, Ning Xu, Yuchen He, Yixin Chen, Jian Zhang, Jun Xuan, Jianzhang Pan, Qun Fang, Hanyu Gao, Xiaodong Shen, Ning Ye, Qiang Zhang, Yiming Mo","doi":"10.1038/s41467-024-54457-x","DOIUrl":null,"url":null,"abstract":"<p>The rapid emergence of large language model (LLM) technology presents promising opportunities to facilitate the development of synthetic reactions. In this work, we leveraged the power of GPT-4 to build an LLM-based reaction development framework (LLM-RDF) to handle fundamental tasks involved throughout the chemical synthesis development. LLM-RDF comprises six specialized LLM-based agents, including Literature Scouter, Experiment Designer, Hardware Executor, Spectrum Analyzer, Separation Instructor, and Result Interpreter, which are pre-prompted to accomplish the designated tasks. A web application with LLM-RDF as the backend was built to allow chemist users to interact with automated experimental platforms and analyze results via natural language, thus, eliminating the need for coding skills and ensuring accessibility for all chemists. We demonstrated the capabilities of LLM-RDF in guiding the end-to-end synthesis development process for the copper/TEMPO catalyzed aerobic alcohol oxidation to aldehyde reaction, including literature search and information extraction, substrate scope and condition screening, reaction kinetics study, reaction condition optimization, reaction scale-up and product purification. Furthermore, LLM-RDF’s broader applicability and versability was validated on various synthesis tasks of three distinct reactions (S<sub>N</sub>Ar reaction, photoredox C-C cross-coupling reaction, and heterogeneous photoelectrochemical reaction).</p>","PeriodicalId":19066,"journal":{"name":"Nature Communications","volume":"8 1","pages":""},"PeriodicalIF":14.7000,"publicationDate":"2024-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature Communications","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1038/s41467-024-54457-x","RegionNum":1,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
引用次数: 0

Abstract

The rapid emergence of large language model (LLM) technology presents promising opportunities to facilitate the development of synthetic reactions. In this work, we leveraged the power of GPT-4 to build an LLM-based reaction development framework (LLM-RDF) to handle fundamental tasks involved throughout the chemical synthesis development. LLM-RDF comprises six specialized LLM-based agents, including Literature Scouter, Experiment Designer, Hardware Executor, Spectrum Analyzer, Separation Instructor, and Result Interpreter, which are pre-prompted to accomplish the designated tasks. A web application with LLM-RDF as the backend was built to allow chemist users to interact with automated experimental platforms and analyze results via natural language, thus, eliminating the need for coding skills and ensuring accessibility for all chemists. We demonstrated the capabilities of LLM-RDF in guiding the end-to-end synthesis development process for the copper/TEMPO catalyzed aerobic alcohol oxidation to aldehyde reaction, including literature search and information extraction, substrate scope and condition screening, reaction kinetics study, reaction condition optimization, reaction scale-up and product purification. Furthermore, LLM-RDF’s broader applicability and versability was validated on various synthesis tasks of three distinct reactions (SNAr reaction, photoredox C-C cross-coupling reaction, and heterogeneous photoelectrochemical reaction).

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
由大型语言模型驱动的端到端化学合成自动开发平台
大型语言模型(LLM)技术的迅速兴起为促进合成反应的开发提供了大有可为的机会。在这项工作中,我们利用 GPT-4 的强大功能,构建了一个基于 LLM 的反应开发框架(LLM-RDF),以处理整个化学合成开发过程中涉及的基本任务。LLM-RDF 由六个基于 LLM 的专门代理组成,包括文献搜寻器、实验设计者、硬件执行器、光谱分析器、分离指导器和结果解释器,它们会预先被提示完成指定任务。我们建立了一个以 LLM-RDF 为后台的网络应用程序,让化学家用户能够通过自然语言与自动化实验平台互动并分析结果,从而无需编码技能,确保所有化学家都能使用。我们展示了 LLM-RDF 在指导铜/TEMPO 催化有氧醇氧化成醛反应的端到端合成开发过程中的能力,包括文献检索和信息提取、底物范围和条件筛选、反应动力学研究、反应条件优化、反应放大和产物纯化。此外,LLM-RDF 在三个不同反应(SNAr 反应、光氧化 C-C 交叉偶联反应和异相光电化学反应)的各种合成任务中验证了其更广泛的适用性和通用性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Nature Communications
Nature Communications Biological Science Disciplines-
CiteScore
24.90
自引率
2.40%
发文量
6928
审稿时长
3.7 months
期刊介绍: Nature Communications, an open-access journal, publishes high-quality research spanning all areas of the natural sciences. Papers featured in the journal showcase significant advances relevant to specialists in each respective field. With a 2-year impact factor of 16.6 (2022) and a median time of 8 days from submission to the first editorial decision, Nature Communications is committed to rapid dissemination of research findings. As a multidisciplinary journal, it welcomes contributions from biological, health, physical, chemical, Earth, social, mathematical, applied, and engineering sciences, aiming to highlight important breakthroughs within each domain.
期刊最新文献
Whole-cell multi-target single-molecule super-resolution imaging in 3D with microfluidics and a single-objective tilted light sheet Interstellar formation of lactaldehyde, a key intermediate in the methylglyoxal pathway Zfp260 choreographs the early stage osteo-lineage commitment of skeletal stem cells Bacterial single-cell RNA sequencing captures biofilm transcriptional heterogeneity and differential responses to immune pressure Structural insight into the distinct regulatory mechanism of the HEPN–MNT toxin-antitoxin system in Legionella pneumophila
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1