
Journal of Chemical Information and Modeling: Latest Articles

Kinetics-Based State Definitions for Discrete Binding Conformations of T4 L99A in MD via Markov State Modeling
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-11-26 DOI: 10.1021/acs.jcim.4c01364
Chris Zhang, Meghan Osato and David L. Mobley*

As a model system, the binding pocket of the L99A mutant of T4 lysozyme has been the subject of numerous computational free energy studies. However, previous studies have failed to fully sample and account for the observed changes in the binding pocket of T4 L99A upon binding of a congeneric ligand series, limiting the accuracy of results. In this work, we resolve the closed, intermediate, and open states for T4 L99A previously reported in experiment in MD and establish definitions for these states based on the dynamics of the system. From this analysis, we arrive at two primary conclusions. First, assignment of simulation trajectories into discrete states should not be done simply based on RMSD to crystal structures as this can result in misassignment of states. Second, the different metastable conformations studied here need to be carefully treated, as we estimate the time scales for conformational interconversion to be on the order of 10^2 to 10^3 ns, far longer than time scales for typical binding calculations. We conclude with a discussion on the need to develop enhanced sampling methods to generally account for significant changes in protein conformation due to relatively small ligand perturbations.
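
Interconversion time scales of this kind are commonly read off a Markov state model through the implied-timescale relation t = -τ / ln λ, where τ is the lag time and λ is a non-unit eigenvalue of the transition matrix. The following is a minimal two-state sketch in Python under stated assumptions: the function name is hypothetical, and the transition probabilities and lag time are made-up illustrative values, not numbers from the paper.

```python
import math


def implied_timescale_two_state(p_ab: float, p_ba: float, lag_ns: float) -> float:
    """Implied timescale (in ns) of a two-state Markov chain.

    p_ab and p_ba are the probabilities of switching A->B and B->A within
    one lag time. The transition matrix [[1-p_ab, p_ab], [p_ba, 1-p_ba]]
    has eigenvalues 1 and lam = 1 - p_ab - p_ba, and the slow relaxation
    time is -lag / ln(lam).
    """
    lam = 1.0 - p_ab - p_ba
    if not 0.0 < lam < 1.0:
        raise ValueError("need 0 < 1 - p_ab - p_ba < 1 for a finite, real timescale")
    return -lag_ns / math.log(lam)


# Hypothetical numbers: 1 ns lag time and rare switching between two pocket states.
t = implied_timescale_two_state(p_ab=0.002, p_ba=0.003, lag_ns=1.0)
print(f"implied timescale: {t:.1f} ns")
```

With these illustrative inputs the timescale comes out on the order of 10^2 ns, which shows why simulations of a few nanoseconds would rarely observe a transition between such states.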

Journal of Chemical Information and Modeling, vol. 64, no. 23, pp. 8870–8879
Citations: 0
Correction to "Semisupervised Learning to Boost hERG, Nav1.5, and Cav1.2 Cardiac Ion Channel Toxicity Prediction by Mining a Large Unlabeled Small Molecule Data Set".
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-11-26 DOI: 10.1021/acs.jcim.4c02123
Issar Arab, Kris Laukens, Wout Bittremieux
Citations: 0
Build-a-Bio-Strip: An Online Platform for Rapid Toxicity Assessment in Chemical Synthesis.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-11-25 Epub Date: 2024-11-03 DOI: 10.1021/acs.jcim.4c01381
Dmitry S Boichenko, Nikita I Kolomoets, Daniil A Boiko, Alexey S Galushko, Alexandra V Posvyatenko, Andrey E Kolesnikov, Ksenia S Egorova, Valentine P Ananikov

The increasing need to understand and control the environmental impact of chemical processes has revealed the challenge in efficient evaluation of toxicity of the vast number of chemical compounds and their varying effects on biological systems. In this study, we introduce "Build-a-bio-Strip", a novel online service designed to carry out a quick initial analysis of the toxic impact of chemical processes. This platform enables users to automatically generate toxicity characteristics of chemical reactions using their own data on cytotoxicity or median lethal doses of the substances involved or computational predictions based on SMILES strings. The service calculates the toxicity metrics such as bio-Factors and cytotoxicity potentials, which can be used to identify the substances with significant contributions to the overall toxicity of a particular process. This facilitates the selection of safer synthetic routes and the optimization of chemical processes from a toxicity perspective. "Build-a-bio-Strip" represents a step toward safer and more sustainable chemical practices. It is available free-of-charge at http://app.ananikovlab.ai:8080/.

Journal of Chemical Information and Modeling, pp. 8373-8378
Citations: 0
Molecular Design for Cardiac Cell Differentiation Using a Small Data Set and Decorated Shape Features
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-11-25 DOI: 10.1021/acs.jcim.4c01353
Fatemeh Etezadi, Shunichi Ito, Kosuke Yasui, Rodi Kado Abdalkader, Itsunari Minami, Motonari Uesugi, Namasivayam Ganesh Pandian, Haruko Nakano, Atsushi Nakano and Daniel M. Packwood*

The discovery of small organic compounds for inducing stem cell differentiation is a time- and resource-intensive process. While data science could, in principle, streamline the discovery of these compounds, novel approaches are required due to the difficulty of acquiring training data from large numbers of example compounds. In this paper, we present the design of a new compound for inducing cardiomyocyte differentiation using simple regression models trained with a data set containing only 80 examples. We introduce decorated shape descriptors, an information-rich molecular feature representation that integrates both molecular shape and hydrophilicity information. These models demonstrate improved performance compared to ones using standard molecular descriptors based on shape alone. Model overtraining is diagnosed using a new type of sensitivity analysis. Our new compound is designed using a conservative molecular design strategy, and its effectiveness is confirmed through expression profiles of cardiomyocyte-related marker genes using real-time polymerase chain reaction experiments on human iPS cell lines. This work demonstrates a viable data-driven strategy for designing new compounds for stem cell differentiation protocols and will be useful in situations where training data is limited.

Journal of Chemical Information and Modeling, vol. 64, no. 23, pp. 8824–8837
Citations: 0
ChemXTree: A Feature-Enhanced Graph Neural Network-Neural Decision Tree Framework for ADMET Prediction.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-11-25 Epub Date: 2024-11-05 DOI: 10.1021/acs.jcim.4c01186
Yuzhi Xu, Xinxin Liu, Wei Xia, Jiankai Ge, Cheng-Wei Ju, Haiping Zhang, John Z H Zhang

The rapid progression of machine learning, especially deep learning (DL), has catalyzed a new era in drug discovery, introducing innovative approaches for predicting molecular properties. Despite the many methods available for feature representation, efficiently utilizing rich, high-dimensional information remains a significant challenge. Our work introduces ChemXTree, a novel graph-based model that integrates a Gate Modulation Feature Unit (GMFU) and neural decision tree (NDT) in the output layer to address this challenge. Extensive evaluations on benchmark data sets, including MoleculeNet and eight additional drug databases, have demonstrated ChemXTree's superior performance, surpassing or matching the current state-of-the-art models. Visualization techniques clearly demonstrate that ChemXTree significantly improves the separation between substrates and nonsubstrates in the latent space. In summary, ChemXTree demonstrates a promising approach for integrating advanced feature extraction with neural decision trees, offering significant improvements in predictive accuracy for drug discovery tasks and opening new avenues for optimizing molecular properties.

Journal of Chemical Information and Modeling, pp. 8440-8452
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11600499/pdf/
Citations: 0
Understanding and Predicting Ligand Efficacy in the μ-Opioid Receptor through Quantitative Dynamical Analysis of Complex Structures.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-11-25 Epub Date: 2024-11-04 DOI: 10.1021/acs.jcim.4c00788
Gabriel T Galdino, Olivier Mailhot, Rafael Najmanovich

The μ-opioid receptor (MOR) is a G-protein coupled receptor involved in nociception and the primary target of opioid drugs. Understanding the relationships among the ligand structure, receptor dynamics, and efficacy in activating MOR is crucial for drug discovery and development. Here, we use coarse-grained normal-mode analysis to predict ligand-induced changes in receptor dynamics with the Quantitative Dynamics Activity Relationship (QDAR) DynaSig-ML methodology, training a LASSO regression model on the entropic signatures (ESs) computed from ligand-receptor complexes. We train and validate the methodology using a data set of 179 MOR ligands with experimentally measured efficacies split into strictly chemically different cross-validation sets. By analyzing the coefficients of the ES LASSO model, we identified key residues involved in MOR activation, several of which have mutational data supporting their role in MOR activation. Additionally, we explored a contact-only LASSO model based on ligand-protein interactions. While the model showed predictive power, it failed at predicting efficacy for ligands with low structural similarity to the training set, emphasizing the importance of receptor dynamics for predicting ligand-induced receptor activation. Moreover, the low computational cost of our approach, at 3 CPU s per ligand-receptor complex, opens the door to its application in large-scale virtual screening contexts. Our work contributes to a better understanding of dynamics-function relationships in the μ-opioid receptor and provides a framework for predicting ligand efficacy based on ligand-induced changes in receptor dynamics.
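
Reading key residues off the model coefficients works because the LASSO penalty drives the coefficients of uninformative features exactly to zero. The sketch below illustrates that behavior with a plain coordinate-descent LASSO in pure Python. It is not the DynaSig-ML implementation: the function names, the toy feature matrix, and the penalty value are all assumptions chosen for illustration.

```python
def soft_threshold(rho: float, alpha: float) -> float:
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_cd(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


# Toy data (hypothetical): the target depends only on the first feature.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
w = lasso_cd(X, y, alpha=0.5)
print(w)
```

In this toy case the second coefficient is exactly zero while the first is shrunk from 2 to 1.5, so the nonzero coefficients point at the feature that actually carries signal.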

Journal of Chemical Information and Modeling, pp. 8549-8561
Citations: 0
Peptaloid: A Comprehensive Database for Exploring Peptide Alkaloid.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-11-25 Epub Date: 2024-11-01 DOI: 10.1021/acs.jcim.4c01667
Bibhu Prasad Behera, Hemangini Naik, V Badireenath Konkimalla

Peptaloid is the first dedicated database for peptide alkaloid molecules, a unique class of naturally derived compounds known for their structural diversity and significant biological activities. Despite their promising potential in drug discovery and therapeutic development, research on peptide alkaloids has been limited by the absence of a comprehensive and centralized resource. Fragmented data across various sources have posed a significant challenge, underscoring the need for a specialized database to facilitate more efficient research and application. Peptaloid addresses this critical gap by providing a database with over 161,000 peptide alkaloid entries, each detailed with structural, physicochemical, and pharmacological properties. By leveraging advanced computational tools and machine learning, Peptaloid generates ADMET profiles, aiding in identifying and optimizing therapeutic candidates. Designed for versatility, the database supports various applications beyond drug discovery, including ecology and material sciences. Peptaloid (as a specialized database for peptide alkaloids) will play a crucial role in innovation and collaboration across scientific disciplines. Peptaloid is accessible at https://peptaloid.niser.ac.in.

Journal of Chemical Information and Modeling, pp. 8387-8395
Citations: 0
Mixture-of-Experts Based Dissociation Kinetic Model for De Novo Design of HSP90 Inhibitors with Prolonged Residence Time.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-11-25 Epub Date: 2024-11-04 DOI: 10.1021/acs.jcim.4c00726
Yujing Zhao, Lei Zhang, Jian Du, Qingwei Meng, Li Zhang, Heshuang Wang, Liang Sun, Qilei Liu

The dissociation rate constant (k_off) significantly impacts the drug potency and dosing frequency. This work proposes a powerful optimization-based framework for de novo drug design guided by k_off. First, a comprehensive database containing 2,773 unique k_off values is created. Based on the database, a novel generic dissociation kinetic model is developed with a mixture-of-experts architecture, enabling high-throughput predictions of k_off with high accuracy. The developed model is then integrated with an optimization-based mathematical programming approach to design drug candidates with low k_off. Finally, the τ-RAMD method is utilized to rigorously verify the designed potential drug candidates. In a case study, the framework successfully identified numerous new potential HSP90 inhibitor candidates, achieving a maximum 45.7% improvement in residence time (τ = 1/k_off) compared to that of a known exceptional HSP90 inhibitor. These findings demonstrate the feasibility and effectiveness of the kinetics-guided optimization-based de novo drug design framework in designing drug candidates with prolonged τ.
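
The relation τ = 1/k_off makes the reported improvement easy to translate between the two quantities. The short Python sketch below works through the arithmetic; the reference k_off value is a made-up placeholder (the abstract gives no absolute numbers), and the function names are hypothetical.

```python
def residence_time(k_off: float) -> float:
    """Residence time tau = 1 / k_off (seconds if k_off is in 1/s)."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return 1.0 / k_off


def relative_gain(tau_new: float, tau_ref: float) -> float:
    """Fractional improvement of tau_new over tau_ref."""
    return (tau_new - tau_ref) / tau_ref


k_off_ref = 1.0e-3                    # hypothetical reference inhibitor, 1/s
tau_ref = residence_time(k_off_ref)   # residence time of the reference
tau_new = tau_ref * 1.457             # a 45.7% longer residence time
k_off_new = 1.0 / tau_new             # the k_off a designed candidate would need

print(f"tau_ref = {tau_ref:.0f} s, tau_new = {tau_new:.0f} s")
print(f"required k_off = {k_off_new:.3e} 1/s")
```

Because τ and k_off are reciprocal, a 45.7% gain in τ corresponds to dividing k_off by 1.457, a reduction of roughly 31%, not 45.7%.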

Journal of Chemical Information and Modeling, pp. 8427-8439
Citations: 0
Synergizing Machine Learning, Conceptual Density Functional Theory, and Biochemistry: No-Code Explainable Predictive Models for Mutagenicity in Aromatic Amines.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-11-25 Epub Date: 2024-11-11 DOI: 10.1021/acs.jcim.4c01246
Andrés Halabi Diaz, Mario Duque-Noreña, Elizabeth Rincón, Eduardo Chamorro

This study synergizes machine learning (ML) with conceptual density functional theory (CDFT) to develop OECD-compliant predictive models for the mutagenic activity of aromatic amines (AAs) with a fully No-Code methodology using a comprehensive data set of 251 AAs, Leave-One-Out-Cross-Validation (LOOCV), and three distinct data splits. Our research employs the GFN2-xTB method, known for its robustness and speed, to compute descriptors for procarcinogens and their activated metabolites in vacuum and aqueous phases. We evaluate the effectiveness of different theoretical definitions of electrophilicity within CDFT, namely, PSL, GCV, and CDP schemes, and the newly introduced Log QP descriptor to approximate Log P information. SPAARC, RandomTree, and JCHAID* ML methods were used to build explainable predictive models with highly robust internal validation (Avg. Correct Classifications = 76% and Avg. Kappa = 0.29) and external validation (Avg. Correct Classifications = 79% and Avg. Kappa = 0.33) metrics, and the results were compared to those of a two hidden layer Multilayer Perceptron. The results indicate that the second CDP definition for the electrophilicity in both vacuum and aqueous phases and also the newly presented Log QP descriptors are the most important ones for predicting the mutagenic activity of AA (namely ω_{+Vac}^{CDP2+}, ω_{+Aq}^{CDP2+}, and LogQP1_{+Vac}, respectively). The results indicate that metabolic activation, aqueous solvent properties, and the CDP electrophilicity schemes and Log QP should be considered when building predictive models for the mutagenic activity of AA. This study offers a replicable, No-Code approach to QSAR research, making high-level ML and CDFT applications accessible to a broader audience. Future work will expand these methods to other compound families, enhancing predictive capabilities in the study of mutagenic activities and other biological phenomena.
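
The abstract reports both the fraction of correct classifications and Cohen's kappa, which corrects that fraction for agreement expected by chance. A minimal Python sketch of the kappa calculation for a binary classifier follows; the confusion-matrix counts are invented for illustration and are not the study's data, and the function name is hypothetical.

```python
def cohen_kappa(tp: int, fn: int, fp: int, tn: int) -> float:
    """Cohen's kappa for a 2x2 confusion matrix.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the chance agreement implied by the row and column totals.
    """
    n = tp + fn + fp + tn
    p_observed = (tp + tn) / n
    p_expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical counts: 100 compounds, 30 truly mutagenic.
tp, fn, fp, tn = 20, 10, 5, 65
accuracy = (tp + tn) / (tp + fn + fp + tn)
kappa = cohen_kappa(tp, fn, fp, tn)
print(f"accuracy = {accuracy:.2f}, kappa = {kappa:.3f}")
```

Kappa falls below accuracy whenever the classes are imbalanced, since a model can score many correct classifications simply by favoring the majority class.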

Journal of Chemical Information and Modeling, pp. 8510-8520. Publication date: 2024-11-25. Publication type: Journal Article.
Citations: 0
lociPARSE: A Locality-aware Invariant Point Attention Model for Scoring RNA 3D Structures.
IF 5.6 Zone 2 (Chemistry) Q1 CHEMISTRY, MEDICINAL Pub Date: 2024-11-25 Epub Date: 2024-11-11 DOI: 10.1021/acs.jcim.4c01621
Sumit Tarafder, Debswapna Bhattacharya

A scoring function that can reliably assess the accuracy of a 3D RNA structural model in the absence of experimental structure is not only important for model evaluation and selection but also useful for scoring-guided conformational sampling. However, high-fidelity RNA scoring has proven to be difficult using conventional knowledge-based statistical potentials and currently available machine learning-based approaches. Here, we present lociPARSE, a locality-aware invariant point attention architecture for scoring RNA 3D structures. Unlike existing machine learning methods that estimate superposition-based root-mean-square deviation (RMSD), lociPARSE estimates Local Distance Difference Test (lDDT) scores capturing the accuracy of each nucleotide and its surrounding local atomic environment in a superposition-free manner, before aggregating information to predict global structural accuracy. Tested on multiple datasets including CASP15, lociPARSE significantly outperforms existing statistical potentials (rsRNASP, cgRNASP, DFIRE-RNA, and RASP) and machine learning methods (ARES and RNA3DCNN) across complementary assessment metrics. lociPARSE is freely available at https://github.com/Bhattacharya-Lab/lociPARSE.
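To illustrate what a superposition-free, per-nucleotide lDDT score measures, here is a simplified sketch. It computes the reference-based lDDT metric itself, not lociPARSE's neural-network estimate of it. The assumptions are loud ones: each nucleotide is represented by a single hypothetical atom, the inclusion radius is 15 Å, the tolerance thresholds are 0.5, 1, 2, and 4 Å, and the global score is taken as a plain mean of the per-residue scores. The function name `per_residue_lddt` is illustrative.

```python
import math

# Distance-difference tolerances in angstroms (assumed standard lDDT values).
THRESHOLDS = (0.5, 1.0, 2.0, 4.0)


def per_residue_lddt(ref, model, radius=15.0):
    """Per-residue lDDT for two equally ordered lists of (x, y, z) coordinates.

    Only internal distances are compared, so no superposition of the model
    onto the reference is needed.
    """
    if len(ref) != len(model):
        raise ValueError("reference and model must have the same length")
    scores = []
    for i in range(len(ref)):
        preserved = 0
        total = 0
        for j in range(len(ref)):
            if i == j:
                continue
            d_ref = math.dist(ref[i], ref[j])
            if d_ref >= radius:
                continue  # pair lies outside the local environment of residue i
            diff = abs(d_ref - math.dist(model[i], model[j]))
            total += len(THRESHOLDS)
            preserved += sum(diff < t for t in THRESHOLDS)
        # A residue with no neighbours inside the radius gets 0.0 in this sketch.
        scores.append(preserved / total if total else 0.0)
    return scores


# Toy example (hypothetical coordinates): the third residue is displaced by 3 angstroms.
reference = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (13.0, 0.0, 0.0)]

local_scores = per_residue_lddt(reference, predicted)
print(local_scores)  # [0.625, 0.625, 0.25]
print(sum(local_scores) / len(local_scores))  # 0.5
```

Because the score depends only on pairwise distances, rigidly translating or rotating the predicted model leaves it unchanged, which is the property the abstract contrasts with superposition-based RMSD.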

Journal of Chemical Information and Modeling, pp. 8655-8664. Publication date: 2024-11-25. Publication type: Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11600500/pdf/
Citations: 0