Journal of Chemical Information and Modeling: Latest Articles

Identification of Macrophage-Associated Novel Drug Targets in Atherosclerosis Based on Integrated Transcriptome Features.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-12-09 Epub Date: 2024-11-20 DOI: 10.1021/acs.jcim.4c01558
Jingzhi Wang, Sida Qin, Xiaohui Zhang, Jixin Zhi

Background: This study explores the pathological mechanisms of atherosclerosis (AS), focusing on the role of macrophages in its formation and development, and potential therapeutic targets.

Methods: The heterogeneity of the AS single-cell data set GSE131778 was analyzed using Seurat. Tissue sequencing data GSE28829 and GSE43292 were analyzed for immune cell abundance using CIBERSORT. Differential genes were identified, and WGCNA was used to create a coexpression network. Hub genes were identified using MCODE and CytoHubba and analyzed with GO and KEGG enrichment analysis, GSVA, and immune infiltration analysis. DrugBank identified potential drugs, and molecular docking verified drug binding to key targets. Key targets were experimentally validated.

Results: Nineteen cell clusters were identified in the GSE131778 data set and classified into ten cell types. Macrophages in AS and normal tissues were identified based on cell abundance. CIBERSORT showed a significant increase in cell cluster 9 in AS samples. Thirty-two hub genes, including CD86, LILRB2, and IRF8, were validated. GO and KEGG analyses indicated that the hub genes primarily affect immune functions. GSVA identified 29 significantly increased pathways in AS samples. Immune infiltration analysis revealed a positive correlation between IRF8, CD86, and LILRB2 expression and macrophage content. Molecular docking suggested CD86 as a potential drug target for AS. qRT-PCR confirmed increased IRF8 and CD86 expression.

Conclusions: CD86, LILRB2, and IRF8 are highly expressed in foam cell samples, with CD86 forming hydrogen bonds with several AS drugs, indicating CD86 as a promising target for AS treatment.

Pages: 9009-9020
Citations: 0
Improved Prediction of Ligand-Protein Binding Affinities by Meta-modeling.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-12-09 Epub Date: 2024-11-22 DOI: 10.1021/acs.jcim.4c01116
Ho-Joon Lee, Prashant S Emani, Mark B Gerstein

The accurate screening of candidate drug ligands against target proteins through computational approaches is of prime interest to drug development efforts. Such virtual screening depends in part on methods to predict the binding affinity between ligands and proteins. Many computational models for binding affinity prediction have been developed, but with varying results across targets. Given that ensembling or meta-modeling approaches have shown great promise in reducing model-specific biases, we develop a framework to integrate published force-field-based empirical docking and sequence-based deep learning models. In building this framework, we evaluate many combinations of individual base models, training databases, and several meta-modeling approaches. We show that many of our meta-models significantly improve affinity predictions over base models. Our best meta-models achieve comparable performance to state-of-the-art deep learning tools exclusively based on 3D structures while allowing for improved database scalability and flexibility through the explicit inclusion of features such as physicochemical properties or molecular descriptors. We further demonstrate improved generalization capability by our models using a large-scale benchmark of affinity prediction as well as a virtual screening application benchmark. Overall, we demonstrate that diverse modeling approaches can be ensembled together to gain meaningful improvement in binding affinity prediction.
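
The idea of combining base-model predictions through a meta-model can be illustrated with a minimal stacking sketch. This is not the framework described in the abstract: it assumes only two base models (say, a docking score and a sequence-based predictor, both already mapped to affinity units) and fits a linear combination by ordinary least squares. The function names `fit_stacking_weights`, `combine`, and `mse` are hypothetical and chosen for illustration only.

```python
def fit_stacking_weights(preds_a, preds_b, targets):
    """Least-squares weights (w_a, w_b) minimizing sum((w_a*a + w_b*b - y)**2).

    Solves the 2x2 normal equations in closed form (no intercept term).
    """
    saa = sum(a * a for a in preds_a)
    sbb = sum(b * b for b in preds_b)
    sab = sum(a * b for a, b in zip(preds_a, preds_b))
    say = sum(a * y for a, y in zip(preds_a, targets))
    sby = sum(b * y for b, y in zip(preds_b, targets))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("base-model predictions are collinear")
    w_a = (say * sbb - sby * sab) / det
    w_b = (sby * saa - say * sab) / det
    return w_a, w_b


def combine(preds_a, preds_b, w_a, w_b):
    """Apply the fitted meta-model to base-model predictions."""
    return [w_a * a + w_b * b for a, b in zip(preds_a, preds_b)]


def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - y) ** 2 for p, y in zip(preds, targets)) / len(targets)
```

In practice the weights would be fitted on held-out predictions rather than on the data used to train the base models, so that the meta-model does not simply reward the base model that overfits most.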

Pages: 8684-8704 Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11632770/pdf/
Citations: 0
RankMHC: Learning to Rank Class-I Peptide-MHC Structural Models.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-12-09 Epub Date: 2024-11-18 DOI: 10.1021/acs.jcim.4c01278
Romanos Fasoulis, Georgios Paliouras, Lydia E Kavraki

The binding of peptides to class-I Major Histocompatibility Complex (MHC) receptors and their subsequent recognition downstream by T-cell receptors are crucial processes for most multicellular organisms to be able to fight various diseases. Thus, the identification of peptide antigens that can elicit an immune response is of immense importance for developing successful therapies for bacterial and viral infections, even cancer. Recently, studies have demonstrated the importance of peptide-MHC (pMHC) structural analysis, with pMHC structural modeling methods gradually becoming more popular in peptide antigen identification workflows. Most of the pMHC structural modeling tools provide an ensemble of candidate peptide poses in the MHC-I cleft, each associated with a score stemming from a scoring function, with the top-scoring pose assumed to be the most representative of the ensemble. However, identifying the binding mode, that is, the peptide pose from the ensemble that is closest to an unavailable native structure, is not trivial. Oftentimes, the peptide poses characterized as best by a protein-ligand scoring function are not the ones that are the most representative of the actual structure. In this work, we frame the peptide binding pose identification problem as a Learning-to-Rank (LTR) problem. We present RankMHC, an LTR-based pMHC binding mode identification predictor, which is specifically trained to predict the most accurate ranking of an ensemble of pMHC conformations. RankMHC outperforms classical peptide-ligand scoring functions, as well as previous Machine Learning (ML)-based binding pose predictors. We further demonstrate that RankMHC can be used with many pMHC structural modeling tools that use different structural modeling protocols.

Pages: 8729-8742 Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11633655/pdf/
Citations: 0
Influence of Stereochemistry in a Local Approach for Calculating Protein Conformations.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-12-09 Epub Date: 2024-11-19 DOI: 10.1021/acs.jcim.4c01232
Wagner da Rocha, Leo Liberti, Antonio Mucherino, Thérèse E Malliavin

Protein structure prediction is generally based on the use of local conformational information coupled with long-range distance restraints. Such restraints can be derived from the knowledge of a template structure or the analysis of protein sequence alignment in the framework of models arising from the physics of disordered systems. The accuracy of approaches based on sequence alignment, however, is limited when the number of aligned sequences is small. Here, we derive protein conformations using only knowledge of local conformations by means of the interval Branch-and-Prune algorithm. The computational efficiency is directly related to the knowledge of stereochemistry (bond angle and ω values) along the protein sequence and, in particular, to the variations of the torsion angle ω. The impact of stereochemistry variations is particularly strong for protein topologies defined by numerous long-range restraints, as in the case of proteins with β secondary structure. The systematic enumeration of the conformations improves the efficiency of the calculations. The analysis of DNA codons makes it possible to connect the variations of the torsion angle ω to the positions of rare DNA codons.

Pages: 8999-9008
Citations: 0
ConfRank: Improving GFN-FF Conformer Ranking with Pairwise Training.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-12-09 Epub Date: 2024-11-20 DOI: 10.1021/acs.jcim.4c01524
Christian Hölzer, Rick Oerder, Stefan Grimme, Jan Hamaekers

Conformer ranking is a crucial task for drug discovery, with methods for generating conformers often based on molecular (meta)dynamics or sophisticated sampling techniques. These methods are constrained by the underlying force computation regarding runtime and energy ranking accuracy, limiting their effectiveness for large-scale screening applications. To address these ranking limitations, we introduce ConfRank, a machine learning-based approach that enhances conformer ranking using pairwise training. We demonstrate its performance using GFN-FF-generated conformer ensembles, leveraging the DimeNet++ architecture trained on pairs of 159 760 uncharged organic compounds from the GEOM data set with r²SCAN-3c reference level. Instead of predicting only on single molecules, this approach captures relative energy differences between conformers, leading to a significant improvement of the overall conformational ranking, outperforming GFN-FF and GFN2-xTB. Thereby, the pairwise RMSD of the relative energy difference of two conformers can be reduced from 5.65 to 0.71 kcal mol⁻¹ on the test data set, allowing up to 81% of all lowest-lying conformers to be identified correctly (GFN-FF: 10%, GFN2-xTB: 47%). The ConfRank approach is cost-effective, allowing for scalable deployment on both CPU and GPU, achieving runtime accelerations by up to 2 orders of magnitude compared to GFN2-xTB. Out-of-sample investigations on CREST-generated conformer ensembles from the QM9 data set and conformers taken from an extended GMTKN55 data set show promising results for the robustness of this approach. Thereby, rank correlation coefficients such as Spearman's can be improved to 0.90 (GFN-FF: 0.39, GFN2-xTB: 0.84), reducing the probability of an incorrect sign flip in pairwise energy comparison from 32 to 7%. On the extended GMTKN55 subsets, the pairwise MAD (RMSD) could be reduced on almost all subsets by up to 62% (58%), with an average improvement of 30% (29%). Moreover, an exemplary case study on vancomycin shows similar performance, indicating applicability to larger (bio)molecular structures. Furthermore, we motivate the usage of the pairwise training approach from a theoretical perspective, highlighting that while pairwise training can lead to a decline in single sample prediction of absolute energies for ML models, it significantly enhances conformer ranking performance. The data and models used in this study are available at https://github.com/grimme-lab/confrank.
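
The pairwise quantities mentioned in this abstract can be made concrete with a short sketch. It assumes that "pairwise RMSD" means the root-mean-square error of predicted versus reference energy differences over all conformer pairs, and that a "sign flip" is a pair whose predicted energy ordering is opposite to the reference ordering; the paper's exact definitions may differ. The function name `pairwise_metrics` is hypothetical.

```python
from itertools import combinations
from math import sqrt


def pairwise_metrics(ref_energies, pred_energies):
    """Return (pairwise RMSD, sign-flip fraction) over all conformer pairs.

    Both inputs hold one energy per conformer of the same molecule,
    in the same order and the same units.
    """
    sq_err = 0.0
    flips = 0
    n_pairs = 0
    for i, j in combinations(range(len(ref_energies)), 2):
        d_ref = ref_energies[i] - ref_energies[j]
        d_pred = pred_energies[i] - pred_energies[j]
        sq_err += (d_pred - d_ref) ** 2
        # A flip means the model orders this pair opposite to the reference.
        if d_ref * d_pred < 0:
            flips += 1
        n_pairs += 1
    if n_pairs == 0:
        raise ValueError("need at least two conformers")
    return sqrt(sq_err / n_pairs), flips / n_pairs
```

A constant offset added to every predicted energy leaves both metrics unchanged, which is consistent with the abstract's remark that a model can lose accuracy on absolute energies while still ranking conformers well.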

Pages: 8909-8925
Citations: 0
Delving into Macrolide Binding Affinities and Associated Structural Modulations in Erythromycin Esterase C: Insights into the Venus Flytrap Mechanism.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-12-09 Epub Date: 2024-11-20 DOI: 10.1021/acs.jcim.4c01523
Abhishek Bera, Pritish Joshi, Niladri Patra

Since their inception in antibacterial therapy, macrolide-based antibiotics have significantly shaped the evolutionary pathways of pathogenic bacteria, driving them to develop diverse antimicrobial resistance (AMR) mechanisms. Among these, macrolide esterase, commonly referred to as erythromycin esterase, emerged as a critical defense mechanism, enabling bacteria to detoxify macrolides by hydrolyzing the macrolactone ring within the bacterial cell. In this study, we delve into the intricate interactions and conformational dynamics of erythromycin esterase C (EreC), a key member of the Ere enzyme family. We have focused on three FDA-approved and widely prescribed macrolides (erythromycin, clarithromycin, and azithromycin), employing classical molecular dynamics, absolute binding free energy calculations, and 2D well-tempered metadynamics simulations to explore their interactions with EreC. To estimate the absolute binding free energies, we have used the recently developed and robust "Streamlined Alchemical Free Energy Perturbation (SAFEP)" protocol. The results from our molecular dynamics simulations and advanced analyses highlighted the crucial role of hydrophobic interactions within the macrolide binding cleft of EreC, along with the significant influence of the minor lobe in facilitating overall structural fluctuation. In silico alanine scanning identified the top three hydrophobic residues, i.e., PHE248, MET333, and PHE344, responsible for macrolide binding inside that cleft. According to the free energy calculations, azithromycin and clarithromycin showed greater binding affinities toward EreC than the parent macrolide erythromycin. Moreover, 2D metadynamics simulations along with graph theory-based eigenvector centrality analyses revealed a metastable "semiopen" state during the hypothesized "active loop closure" of the EreC protein triggered by subtle conformational changes of an important histidine residue, HIS289, upon macrolide capture, drawing a fascinating parallel to the renowned "Venus flytrap" mechanism.

Pages: 8892-8908
Citations: 0
Conformalized Graph Learning for Molecular ADMET Property Prediction and Reliable Uncertainty Quantification.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-12-09 Epub Date: 2024-11-21 DOI: 10.1021/acs.jcim.4c01139
Peiyao Li, Lan Hua, Zhechao Ma, Wenbo Hu, Ye Liu, Jun Zhu

Drug discovery and development is a complex and costly process, with a substantial portion of the expense dedicated to characterizing the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of new drug candidates. While the advent of deep learning and molecular graph neural networks (GNNs) has significantly enhanced in silico ADMET prediction capabilities, reliably quantifying prediction uncertainty remains a critical challenge. The performance of GNNs is influenced by both the volume and the quality of the data. Hence, determining the reliability and extent of a prediction is as crucial as achieving accurate predictions, especially for out-of-domain (OoD) compounds. This paper introduces a novel GNN model called conformalized fusion regression (CFR). CFR combines a GNN model with a joint mean-quantile regression loss and an ensemble-based conformal prediction (CP) method. Through rigorous evaluation across various ADMET tasks, we demonstrate that our framework provides accurate predictions, reliable probability calibration, and high-quality prediction intervals, outperforming existing uncertainty quantification methods.
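
The abstract does not give the details of CFR, so the following is only a minimal sketch of the general idea behind conformalizing quantile-regression intervals. It uses plain split-conformal calibration rather than the ensemble-based CP the authors describe. The function names `conformal_offset` and `conformalize`, and the toy numbers, are hypothetical.

```python
import math

def conformal_offset(lower, upper, y_true, alpha=0.1):
    """Split-conformal correction for predicted quantile intervals.

    lower/upper are a model's predicted interval bounds on a held-out
    calibration set, y_true the measured values (all hypothetical here).
    """
    # Conformity score: signed distance by which each label falls outside
    # its interval (negative when the label lies inside).
    scores = sorted(max(lo - y, y - hi)
                    for lo, hi, y in zip(lower, upper, y_true))
    n = len(scores)
    # Finite-sample rank ceil((n + 1) * (1 - alpha)); capping at n is a
    # simplification for tiny calibration sets.
    k = min(n, math.ceil((n + 1) * (1 - alpha)))
    return scores[k - 1]

def conformalize(lower, upper, q):
    """Widen (or shrink, if q < 0) each interval by the offset q."""
    return [(lo - q, hi + q) for lo, hi in zip(lower, upper)]

# Toy calibration set: four predicted intervals and four measured values.
cal_lower = [0.0, 1.0, 2.0, 3.0]
cal_upper = [1.0, 2.0, 3.0, 4.0]
cal_y = [0.5, 2.5, 2.25, 4.25]

q = conformal_offset(cal_lower, cal_upper, cal_y, alpha=0.5)
print(q)                                 # 0.25
print(conformalize([1.0], [2.0], q))     # [(0.75, 2.25)]
```

The offset is calibrated on data the model has not been trained on, which is what gives conformal intervals their coverage guarantee under exchangeability; that guarantee may not hold for out-of-domain compounds.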

Pages: 8705-8717
Citations: 0
Effects of All-Atom and Coarse-Grained Molecular Mechanics Force Fields on Amyloid Peptide Assembly: The Case of a Tau K18 Monomer.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-12-09 Epub Date: 2024-11-23 DOI: 10.1021/acs.jcim.4c01448
Xibing He, Viet Hoang Man, Jie Gao, Junmei Wang

To propose new mechanism-based therapeutics for Alzheimer's disease (AD), it is crucial to study the kinetics and oligomerization/aggregation mechanisms of the hallmark tau proteins, which have various isoforms and are intrinsically disordered. In this study, multiple all-atom (AA) and coarse-grained (CG) force fields (FFs) have been benchmarked on molecular dynamics (MD) simulations of K18 tau (M243-E372), which is a truncated form (130 residues) of full-length tau (441 residues). FF19SB is first excluded because the dynamics are too slow, and the conformations are too stable. All other benchmarked AAFFs (Charmm36m, FF14SB, Gromos54A7, and OPLS-AA) and CGFFs (Martini3 and Sirah2.0) exhibit a trend of shrinking K18 tau into compact structures with the radius of gyration (ROG) around 2.0 nm, which is much smaller than the experimental value of 3.8 nm, within 200 ns of AA-MD or 2000 ns of CG-MD. Gromos54A7, OPLS-AA, and Martini3 shrink much faster than the other FFs. To perform meaningful postanalysis of various properties, we propose a strategy of selecting snapshots with 2.5 < ROG < 4.5 nm, instead of using all sampled snapshots. The calculated chemical shifts of all C, CA, and CB atoms have very good and close root-mean-square error (RMSE) values, while Charmm36m and Sirah2.0 exhibit better chemical shifts of N than other FFs. Comparing the calculated distributions of the distance between the CA atoms of CYS291 and CYS322 with the results of the FRET experiment demonstrates that Charmm36m is a perfect match with the experiment while other FFs exhibit limitations. In summary, Charmm36m is recommended as the best AAFF, and Sirah2.0 is recommended as an excellent CGFF for simulating tau K18.
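
The snapshot-selection strategy (keep only frames with 2.5 < ROG < 4.5 nm) can be sketched as follows. This is an illustration under assumptions, not the authors' analysis code: the helper names are hypothetical, coordinates are taken to be in nm, and equal masses are used when none are supplied.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one snapshot.

    coords is a list of (x, y, z) tuples; the result has the same length
    unit as the coordinates (nm assumed here).
    """
    if masses is None:
        masses = [1.0] * len(coords)   # assumption: equal masses
    total = sum(masses)
    # Centre of mass, one component at a time.
    com = [sum(m * r[k] for m, r in zip(masses, coords)) / total
           for k in range(3)]
    # Mass-weighted mean squared distance from the centre of mass.
    msd = sum(m * sum((r[k] - com[k]) ** 2 for k in range(3))
              for m, r in zip(masses, coords)) / total
    return math.sqrt(msd)

def select_snapshots(trajectory, rog_min=2.5, rog_max=4.5):
    """Indices of frames whose ROG lies strictly between the two bounds."""
    return [i for i, frame in enumerate(trajectory)
            if rog_min < radius_of_gyration(frame) < rog_max]

# Toy "trajectory" of three two-particle frames: compact, intermediate, extended.
trajectory = [
    [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)],   # ROG = 1.0 nm
    [(-3.0, 0.0, 0.0), (3.0, 0.0, 0.0)],   # ROG = 3.0 nm
    [(-5.0, 0.0, 0.0), (5.0, 0.0, 0.0)],   # ROG = 5.0 nm
]
print(select_snapshots(trajectory))        # [1]
```

For real trajectories, an MD analysis library would normally supply the coordinates and masses; only the filtering window comes from the abstract.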

Pages: 8880-8891
Citations: 0
A Divide-and-Conquer Approach to Nanoparticle Global Optimisation Using Machine Learning.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-12-09 Epub Date: 2024-11-15 DOI: 10.1021/acs.jcim.4c01516
Nicholas B Smith, Anna L Garden

Global optimization of the structure of atomic nanoparticles is often hampered by the presence of many funnels on the potential energy surface. While broad funnels are readily encountered and easily exploited by the search, narrow funnels are more difficult to locate and explore, presenting a problem if the global minimum is situated in such a funnel. Here, a divide-and-conquer approach is applied to overcome the issue posed by the multifunnel effect using a machine learning approach, without using a priori knowledge of the potential energy surface. This approach begins with a truncated exploration to gather coarse-grained knowledge of the potential energy surface. This is then used to train a machine learning Gaussian mixture model to divide up the potential energy surface into separate regions, with each region then being explored in more detail (or conquered) separately. This scheme was tested on a variety of multifunnel systems and yielded significant improvements to the times taken to locate the global minima of Lennard-Jones (LJ) nanoparticles, LJ75 and LJ104, as well as two metallic systems, Au55 and Pd88. However, difficulties were encountered for LJ98, providing insight into how the scheme could be further improved.
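
The "divide" step can be illustrated with a toy Gaussian mixture model. The abstract does not say which structural features or how many mixture components the authors use, so this sketch assumes a single hypothetical scalar descriptor per minimum found in the truncated search and fits two components with a hand-written EM loop.

```python
import math

def fit_gmm_1d(xs, n_iter=50):
    """Fit a two-component 1D Gaussian mixture by expectation-maximisation.

    Returns (weights, means, variances). Two components and a scalar
    descriptor are assumptions made for illustration only.
    """
    lo, hi = min(xs), max(xs)
    means = [lo, hi]                               # start at the extremes
    var0 = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * math.exp(-(x - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
                 for w, m, v in zip(weights, means, variances)]
            s = sum(p) or 1e-300                   # guard against underflow
            resp.append([pk / s for pk in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)          # variance floor
            weights[k] = nk / len(xs)
    return weights, means, variances

def assign_regions(xs, params):
    """Label each point with the index of its most probable component."""
    weights, means, variances = params
    labels = []
    for x in xs:
        # Compare log-densities to avoid underflow for distant components.
        logp = [math.log(w) - (x - m) ** 2 / (2.0 * v) - 0.5 * math.log(v)
                for w, m, v in zip(weights, means, variances)]
        labels.append(0 if logp[0] >= logp[1] else 1)
    return labels

# Hypothetical descriptor values for minima gathered in a short exploratory run.
descriptors = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
params = fit_gmm_1d(descriptors)
print(assign_regions(descriptors, params))   # [0, 0, 0, 1, 1, 1]
```

Each labelled region would then be searched separately (the "conquer" step). In practice a library implementation with multi-dimensional features would replace this hand-rolled loop.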

Pages: 8743-8755
Citations: 0
Pairing a Global Optimization Algorithm with EXAFS to Characterize Lanthanide Structure in Solution.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-12-09 Epub Date: 2024-11-22 DOI: 10.1021/acs.jcim.4c01769
Thomas J Summers, Difan Zhang, Josiane A Sobrinho, Ana de Bettencourt-Dias, Roger Rousseau, Vassiliki-Alexandra Glezakou, David C Cantu

Ensemble-average sampling of structures from ab initio molecular dynamics (AIMD) simulations can be used to predict theoretical extended X-ray absorption fine structure (EXAFS) signals that closely match experimental spectra. However, AIMD simulations are time-consuming and resource-intensive, particularly for solvated lanthanide ions, which often form multiple nonrigid geometries with high coordination numbers. To accelerate the characterization of lanthanide structures in solution, we employed the Northwest Potential Energy Surface Search Engine (NWPEsSe), an adaptive-learning global optimization algorithm, to efficiently screen first-shell structures. As case studies, we examine two systems: Eu(NO3)3 dissolved in acetonitrile with a terpyridine ligand (terpyNO2), and Nd(NO3)3 dissolved in acetonitrile. The theoretical spectra for structures identified by NWPEsSe were compared to both experimental and AIMD-derived EXAFS spectra. The NWPEsSe algorithm successfully identified the proper solvation structure for both Eu(NO3)3(terpyNO2) and Nd(NO3)(acetonitrile)3, with the calculated EXAFS signals closely matching the experimental spectra for the Eu-ligand complex and showing good similarity for the Nd salt; the better agreement with the ligand-containing structure is attributed to a less dynamic coordination environment due to the rigid ligand. The key advantage of the global optimization algorithm lies in its ability to sample the coordination environment across the potential energy surface and reduce the time required to identify structures from generally a month to within a week. Additionally, this approach is versatile and can be adapted to characterize main-group metal complexes.

Pages: 8926-8936
Citations: 0