Bayesian Meta-Learning for Few-Shot Reaction Outcome Prediction of Asymmetric Hydrogenation of Olefins

Angewandte Chemie International Edition (IF 16.9; Q1, Chemistry, Multidisciplinary; Zone 1, Chemistry), Volume 64, Issue 27. Pub Date: 2025-04-23. DOI: 10.1002/anie.202503821

Sukriti Singh, José Miguel Hernández-Lobato



Abstract

Recent years have witnessed the increasing application of machine learning (ML) in chemical reaction development. These ML methods generally require large training sets. The published literature contains large amounts of data, but the sparsity of these datasets makes modelling challenging. Herein, we report a meta-learning workflow that can utilize literature-mined data and return accurate predictions from limited data. A literature dataset comprising over 12 000 transition-metal-catalyzed asymmetric hydrogenation of olefins (AHO) reactions is chosen to demonstrate the utility of our protocol. A meta-model is trained in a binary classification setting to identify highly enantioselective AHO reactions. Two Bayesian meta-learning approaches are considered, namely deep kernel transfer (DKT) and adaptive deep kernel fitting (ADKF). Both methods returned better predictions than the prototypical network, another popular meta-learning approach. Single-task methods, such as random forest, graph neural network, and deep kernel learning, performed worse than the meta-learning methods even when trained on the full training data. Additionally, we propose another meta-learning approach, called ADKF-prior, that is shown to further improve performance in low-data settings. The generalizability of our meta-model is also evaluated on substrate- and time-based splits. Our meta-learning workflow can be used to build a pretrained meta-model for any reaction of interest, which can then be used to predict the outcome of new but related reactions in a few-shot manner.
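The few-shot, binary-classification setting described in the abstract can be illustrated with a minimal sketch of how one task might be split into a small support set (used for adaptation) and a query set (used for evaluation). This is not the authors' code: the `Reaction` fields, the grouping of reactions into tasks, and the `EE_THRESHOLD` cutoff of 80% ee are all hypothetical placeholders, since the abstract does not state how tasks are defined or what ee value counts as "highly enantioselective".

```python
import random
from dataclasses import dataclass

# Hypothetical cutoff: the abstract does not give the ee value that counts as
# "highly enantioselective", so 80.0 is only a placeholder.
EE_THRESHOLD = 80.0


@dataclass(frozen=True)
class Reaction:
    task_id: str     # hypothetical grouping key for a task
    features: tuple  # placeholder descriptor vector
    ee: float        # enantiomeric excess in percent


def binarize(reaction, threshold=EE_THRESHOLD):
    """Return 1 if the reaction meets the ee threshold, else 0."""
    return int(reaction.ee >= threshold)


def sample_episode(reactions, n_support, rng):
    """Split one task's reactions into a support set and a query set."""
    if not 0 < n_support < len(reactions):
        raise ValueError("n_support must be between 1 and len(reactions) - 1")
    shuffled = list(reactions)  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    support = [(r.features, binarize(r)) for r in shuffled[:n_support]]
    query = [(r.features, binarize(r)) for r in shuffled[n_support:]]
    return support, query


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy task with made-up ee values from 50% to 95%.
    toy_task = [Reaction("task-A", (float(i),), 50.0 + 5 * i) for i in range(10)]
    support, query = sample_episode(toy_task, n_support=4, rng=rng)
    print(len(support), "support examples;", len(query), "query examples")
```

In a meta-learning loop, a model would be adapted on each task's support set and scored on its query set; the sketch covers only the data split, not DKT, ADKF, or ADKF-prior themselves.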




Source journal

Angewandte Chemie International Edition (Chemistry: Chemistry, Multidisciplinary)

CiteScore: 26.60

Self-citation rate: 6.60%

Articles published: 3549

Review time: 1.5 months

About the journal: Angewandte Chemie, a journal of the German Chemical Society (GDCh), holds a leading position among scholarly journals in general chemistry, with an Impact Factor of 16.6 (2022 Journal Citation Reports, Clarivate, 2023). Established in 1887, it is published weekly, with new articles appearing almost every day, and offers a blend of Review-type articles, Highlights, Communications, and Research Articles.