AI-Guided Design of MALDI Matrices: Exploring the Electron Transfer Chemical Space for Mass Spectrometric Analysis of Low-Molecular-Weight Compounds.

IF 3.1 | Zone 2 (Chemistry) | JCR Q2, Biochemical Research Methods | Journal of the American Society for Mass Spectrometry | Pub Date: 2024-10-14 | DOI: 10.1021/jasms.4c00186
Carlos A Padilla, Luis M Díaz-Sánchez, Cristian Blanco-Tirado, Aldo F Combariza, Marianny Y Combariza
Citations: 0

Abstract

The development of matrices for Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI MS) has traditionally relied on experimental efforts. Here, we propose a Goal-Directed generative artificial intelligence model, fueled by data calculated with computational chemistry, to construct a chemical space optimized for Electron Transfer (ET) processes in MALDI analysis. We started from a group of 30 reported ET matrices, subjected them to structural enumeration and molecular property prediction using semiempirical and ab initio calculations, and established a database of structural and property data. Subsequently, using a structural enumeration protocol with 68 canonical SMILES of Bemis-Murcko (BM) fragments, we expanded the structural complexity of the initial library. This process generated 82,753 compounds organized into 10 scaffold levels, with a p50 index from the Cyclic System Retrieval (CSR) curve of scaffolds of 50%. From the resulting enumerated library, a diverse subset of structures was selected using the Jarvis-Patrick clustering method. These structures, along with their associated properties derived from quantum mechanics and experimental data, were used to train a Machine Learning (ML) model to predict ionization energy (Ei) values. Subsequently, a Scoring Neural Network (SNN) was trained and coupled with our Goal-Directed generative model, which uses a Recurrent Neural Network (RNN) with Deep Learning (DL) architectures. The generative model was guided using a prior network within a Reinforcement/Transfer Learning environment. The final generative model learned that structures with high unsaturation, H/C ratios under 1, and molecular weights between 100 and 300 u are favorable for ET MALDI matrices, as are those with few aromatic rings and zero aliphatic rings. Other molecular features were also favored.
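The three composition-level features named above (molecular weight, H/C ratio, and unsaturation) can be computed directly from a molecular formula. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `parse_formula`, `descriptors`, and `passes_screen` are hypothetical, the atomic-weight table covers only a few elements, the degree of unsaturation uses the usual rings-plus-pi-bonds formula for C/H/N/O/S/halogen compounds, and the two test formulas are arbitrary illustrations rather than compounds taken from the paper.

```python
import re

# Rounded standard atomic weights (u) for a few common elements (assumed subset).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
                 "S": 32.06, "F": 18.998, "Cl": 35.45, "Br": 79.904}

def parse_formula(formula):
    """Parse a flat formula such as 'C14H10' into {'C': 14, 'H': 10}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts

def descriptors(formula):
    """Return molecular weight, H/C ratio, and degree of unsaturation."""
    c = parse_formula(formula)
    mw = sum(ATOMIC_WEIGHT[el] * n for el, n in c.items())
    carbons, hydrogens = c.get("C", 0), c.get("H", 0)
    halogens = sum(c.get(x, 0) for x in ("F", "Cl", "Br"))
    # Rings plus pi bonds: C - (H + X)/2 + N/2 + 1 (O and S do not contribute).
    dbe = carbons - (hydrogens + halogens) / 2 + c.get("N", 0) / 2 + 1
    h_to_c = hydrogens / carbons if carbons else float("inf")
    return {"mw": mw, "h_to_c": h_to_c, "dbe": dbe}

def passes_screen(formula, mw_range=(100.0, 300.0), max_h_to_c=1.0):
    """Check the two numeric windows quoted in the abstract."""
    d = descriptors(formula)
    return mw_range[0] <= d["mw"] <= mw_range[1] and d["h_to_c"] < max_h_to_c

# Arbitrary illustrative formulas: a polycyclic aromatic and a linear alkane.
aromatic = descriptors("C14H10")
alkane = descriptors("C10H22")
```

A formula-level screen of this kind cannot count aromatic or aliphatic rings; those features need a structure-aware toolkit operating on SMILES.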
The resulting AI-generated library exhibits Ei values over 8.0 eV, akin to those of reported "good" ET MALDI matrices, together with high synthetic accessibility scores, indicating successful design. In conclusion, our generative model provided valuable insights into the molecular features ideal for ET MALDI compounds while generating a wide range of structurally diverse molecules within a similar molecular property space. The next critical step is to synthesize a selection of these generated compounds for experimental validation and further characterization.
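The diverse-subset selection mentioned in the abstract relies on Jarvis-Patrick clustering, which groups items by shared nearest neighbours. The sketch below illustrates the idea in plain Python and is not the authors' implementation: it assumes fingerprints are represented as sets of integer feature IDs, uses Tanimoto similarity, joins two items when each appears in the other's list of `k` nearest neighbours and they share at least `k_min` neighbours, and picks the first member of each cluster as its representative. The toy fingerprints, the parameter values, and the names `tanimoto` and `jarvis_patrick` are all hypothetical.

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def jarvis_patrick(fingerprints, k=2, k_min=1):
    """Cluster items by mutual k-nearest-neighbour membership and shared neighbours."""
    n = len(fingerprints)
    # k nearest neighbours of each item, ranked by decreasing similarity
    # (ties broken by index so the result is deterministic).
    neighbours = []
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: (-tanimoto(fingerprints[i], fingerprints[j]), j))
        neighbours.append(set(ranked[:k]))

    parent = list(range(n))  # union-find forest over item indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        mutual = j in neighbours[i] and i in neighbours[j]
        shared = len(neighbours[i] & neighbours[j])
        if mutual and shared >= k_min:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

# Toy fingerprints: two groups of three items with no features in common.
toy_fingerprints = [
    {1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 4, 5},
    {10, 11, 12, 13}, {10, 11, 12, 14}, {10, 11, 13, 14},
]
clusters = jarvis_patrick(toy_fingerprints, k=2, k_min=1)
representatives = [members[0] for members in clusters]
```

Taking one representative per cluster yields a small, structurally spread subset. For a library of tens of thousands of compounds, the all-pairs similarity ranking used here would need to be replaced by a more scalable neighbour search.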

Source journal
CiteScore: 5.50
Self-citation rate: 9.40%
Articles published: 257
Review time: 1 month
About the journal: The Journal of the American Society for Mass Spectrometry presents research papers covering all aspects of mass spectrometry, incorporating coverage of fields of scientific inquiry in which mass spectrometry can play a role. Comprehensive in scope, the journal publishes papers on both fundamentals and applications of mass spectrometry. Fundamental subjects include instrumentation principles, design, and demonstration, structures and chemical properties of gas-phase ions, studies of thermodynamic properties, ion spectroscopy, chemical kinetics, mechanisms of ionization, theories of ion fragmentation, cluster ions, and potential energy surfaces. In addition to full papers, the journal offers Communications, Application Notes, and Accounts and Perspectives.