
Molecular Informatics: Latest Articles

Extended Activity Cliffs-Driven Approaches on Data Splitting for the Study of Bioactivity Machine Learning Predictions.
IF 2.8 | Zone 4 (Medicine) | Q3, Chemistry, Medicinal | Pub Date: 2024-11-18 | DOI: 10.1002/minf.202400054
Kenneth López-Pérez, Ramón Alain Miranda-Quintana

Activity cliffs (ACs) are known to pose a challenge for QSAR modeling. Because machine learning QSAR models depend heavily on their data, they are directly influenced by the activity landscape. We propose several extended similarity and extended SALI methods to study how the distribution of ACs across the training and test sets affects model errors. Non-uniform distributions of ACs and chemical space tend to lead to worse models than the proposed uniform methods. ML modeling on AC-rich sets needs to be analyzed case by case. The proposed methods can be used as tools to study datasets, but as far as generalization is concerned, random splitting was the better-performing data-splitting alternative overall.
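To make the notion of an activity cliff concrete, the following is a minimal sketch of the classical pairwise Structure-Activity Landscape Index (SALI), i.e. the activity difference divided by one minus the structural similarity. It is not the extended SALI proposed in the paper, whose definition is not given in the abstract. The fingerprints, activities, and the `eps` guard against division by zero are hypothetical choices made for illustration.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def sali(act_i: float, act_j: float, sim: float, eps: float = 1e-6) -> float:
    """Pairwise SALI: activity difference divided by structural distance."""
    return abs(act_i - act_j) / max(1.0 - sim, eps)


# Hypothetical toy data: name -> (fingerprint on-bits, pIC50-like activity).
compounds = {
    "A": (frozenset({1, 2, 3, 4}), 5.0),
    "B": (frozenset({1, 2, 3, 5}), 8.0),  # similar to A, very different activity
    "C": (frozenset({7, 8, 9}), 5.2),     # structurally unrelated to A and B
}

# SALI for every unordered pair of compounds.
scores = {
    (i, j): sali(
        compounds[i][1],
        compounds[j][1],
        tanimoto(compounds[i][0], compounds[j][0]),
    )
    for i, j in combinations(compounds, 2)
}

# The pair with the highest SALI is the strongest activity-cliff candidate.
top_pair = max(scores, key=scores.get)
```

In this toy set the pair (A, B) scores highest because the two compounds share most fingerprint bits yet differ by three activity units. A splitting study could then ask whether such high-SALI pairs end up concentrated in the training set, in the test set, or spread across both.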

Citations: 0
Pathway-based prediction of the therapeutic effects and mode of action of custom-made multiherbal medicines.
IF 2.8 | Zone 4 (Medicine) | Q3, Chemistry, Medicinal | Pub Date: 2024-11-01 | Epub Date: 2024-10-15 | DOI: 10.1002/minf.202400108
Akihiro Ezoe, Yuki Shimada, Ryusuke Sawada, Akihiro Douke, Tomokazu Shibata, Makoto Kadowaki, Yoshihiro Yamanishi

Multiherbal medicines are traditionally used as personalized medicines with custom combinations of crude drugs; however, the mechanisms of multiherbal medicines are unclear. In this study, we developed a novel pathway-based method to predict therapeutic effects and the mode of action of custom-made multiherbal medicines using machine learning. This method considers disease-related pathways as therapeutic targets and evaluates the comprehensive influence of constituent compounds on their potential target proteins in the disease-related pathways. Our proposed method enabled us to comprehensively predict new indications of 194 Kampo medicines for 87 diseases. Using Kampo-induced transcriptomic data, we demonstrated that Kampo constituent compounds stimulated the disease-related proteins and a customized Kampo formula enhanced the efficacy compared with an existing Kampo formula. The proposed method will be useful for discovering effective Kampo medicines and optimizing custom-made multiherbal medicines in practice.

Citations: 0
BIOMX-DB: A web application for the BIOFACQUIM natural product database.
IF 2.8 | Zone 4 (Medicine) | Q3, Chemistry, Medicinal | Pub Date: 2024-11-01 | Epub Date: 2024-06-05 | DOI: 10.1002/minf.202400060
Fernando Martínez-Urrutia, José L Medina-Franco

Natural product databases are an integral part of chemoinformatics and computer-aided drug design. Despite their pivotal role, few projects in Latin America, and particularly in Mexico, provide accessible tools of this nature. Herein, we introduce BIOMX-DB, an open and freely accessible web-based database designed to address this gap. BIOMX-DB enhances the features of the existing Mexican natural product database, BIOFACQUIM, by incorporating advanced search, filtering, and download capabilities. The user-friendly interface of BIOMX-DB aims to provide an intuitive experience for researchers. BIOMX-DB is freely available at www.biomx-db.com.

Citations: 0
Chemoinformatics for corrosion science: Data-driven modeling of corrosion inhibition by organic molecules.
IF 2.8 | Zone 4 (Medicine) | Q3, Chemistry, Medicinal | Pub Date: 2024-11-01 | Epub Date: 2024-10-15 | DOI: 10.1002/minf.202400082
Igor Baskin, Yair Ein-Eli

This paper reviews the application of machine learning to the inhibition of corrosion by organic molecules. The methodologies considered include quantitative structure-property relationships (QSPR) and related data-driven approaches. The characteristic features of their key components are considered as applied to corrosion inhibition, including datasets, response properties, molecular descriptors, machine learning methods, and structure-property models. It is shown that the most important factors determining their choice and application features are: (1) the small or very small size of datasets, (2) the mechanism of corrosion inhibition associated with the adsorption of inhibitor molecules on the metal surface, and (3) multifactorial conditioning and noisiness of response property. On this basis, the application of machine learning to the inhibition of corrosion of materials based on iron, aluminum, and magnesium is considered. The main trends in the development of QSPR and related data-driven modeling of corrosion inhibition are discussed, the shortcomings and common errors are considered, and the prospects for their further development are outlined.

Citations: 0
My 50 Years with Chemoinformatics.
IF 2.8 | Zone 4 (Medicine) | Q3, Chemistry, Medicinal | Pub Date: 2024-11-01 | Epub Date: 2024-10-15 | DOI: 10.1002/minf.202400036
Johann Gasteiger
Citations: 0
GDMol: Generative Double-Masking Self-Supervised Learning for Molecular Property Prediction.
IF 2.8 | Zone 4 (Medicine) | Q3, Chemistry, Medicinal | Pub Date: 2024-10-24 | DOI: 10.1002/minf.202400146
Yingxu Liu, Qing Fan, Chengcheng Xu, Xiangzhen Ning, Yu Wang, Yang Liu, Yanmin Zhang, Yadong Chen, Haichun Liu

Background: Effective molecular feature representation is crucial for drug property prediction. Recent years have seen increased attention on graph neural networks (GNNs) that are pre-trained using self-supervised learning techniques, aiming to overcome the scarcity of labeled data in molecular property prediction. Traditional GNNs in self-supervised molecular property prediction typically perform a single masking operation on the nodes and edges of the input molecular graph, which masks only local information and is insufficient for thorough self-supervised training.

Method: Hence, we propose a model for molecular property prediction based on generative double-masking self-supervised learning, termed GDMol. It integrates generative learning into the self-supervised learning framework for latent representation and applies a second round of masking to these latent representations. This enables the model to better capture global information and semantic knowledge of the molecules, yielding a richer, more informative representation and thereby more accurate and robust molecular property prediction.

Results: Our experiments on 5 datasets demonstrated superior performance of GDMol in predicting molecular properties across different domains. Moreover, we used the masking operation to traverse through the gradient changes of each node, the magnitude and sign of which reflect the positive and negative contribution respectively of the local structure in the molecule to the prediction outcome. This in-depth interpretative analysis not only enhances the model's interpretability, but also provides more targeted insights and direction for optimizing drug molecules.

Conclusions: In summary, this research offers novel insights on improving molecular property prediction tasks, and paves the way for further research on the application of generative learning and self-supervised learning in the field of chemistry.
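The double-masking idea described in the Method paragraph can be illustrated with a small bookkeeping sketch: one mask is applied to the input nodes, and a second mask is applied to the latent vectors produced from them. This is only an assumption-laden toy, not the GDMol architecture. The `toy_encoder`, the mask ratio, and the use of `None` as a mask token are all hypothetical, and a real model would train a decoder to reconstruct the masked entries.

```python
import random

MASK = None  # hypothetical mask token


def mask_items(items, ratio, rng):
    """Replace about `ratio` of the entries with MASK; return the copy and masked indices."""
    n_mask = max(1, round(ratio * len(items)))
    idx = set(rng.sample(range(len(items)), n_mask))
    masked = [MASK if i in idx else x for i, x in enumerate(items)]
    return masked, sorted(idx)


def toy_encoder(atoms):
    """Stand-in encoder (hypothetical): maps each atom symbol to a 1-D latent value."""
    table = {"C": 0.1, "N": 0.5, "O": 0.9}
    return [0.0 if a is MASK else table[a] for a in atoms]


rng = random.Random(0)
atoms = ["C", "C", "O", "N", "C", "C"]

# First mask: hide some nodes of the input molecular graph.
masked_atoms, first_idx = mask_items(atoms, 0.25, rng)

# Encode the partially masked input into latent representations.
latents = toy_encoder(masked_atoms)

# Second mask: hide some of the latent representations themselves.
masked_latents, second_idx = mask_items(latents, 0.25, rng)
```

The reconstruction targets would be `atoms` at `first_idx` for the first mask and `latents` at `second_idx` for the second, which is where a generative objective would plug in.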

Citations: 0
GCLmf: A Novel Molecular Graph Contrastive Learning Framework Based on Hard Negatives and Application in Toxicity Prediction.
IF 2.8 | Zone 4 (Medicine) | Q3, Chemistry, Medicinal | Pub Date: 2024-10-18 | DOI: 10.1002/minf.202400169
Xinxin Yu, Yuanting Chen, Long Chen, Weihua Li, Yuhao Wang, Yun Tang, Guixia Liu

In silico methods for predicting chemical toxicity can decrease cost and increase efficiency in the early stages of drug discovery. However, because sufficient and reliable toxicity data are hard to obtain, constructing robust and accurate prediction models is challenging. Contrastive learning, a type of self-supervised learning, leverages large unlabeled datasets to obtain more expressive molecular representations, which can boost prediction performance on downstream tasks. While molecular graph contrastive learning has attracted growing attention, current models neglect the quality of the negative set. Here, we propose a self-supervised pretraining deep learning framework named GCLmf. We first use molecular fragments that meet specific conditions as hard negative samples to improve the quality of the negative set and thus increase the difficulty of the proxy tasks during pre-training, so that the model learns informative representations. GCLmf shows excellent predictive power on various molecular property benchmarks and high performance on 33 toxicity tasks in comparison with multiple baselines. In addition, we further investigate the necessity of introducing hard negatives in model building and the impact of the proportion of hard negatives on the model.
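Why hard negatives make the pre-training task more difficult can be seen from a generic InfoNCE-style contrastive loss. The sketch below is not the GCLmf objective, which the abstract does not specify. The embeddings, the cosine similarity, and the temperature of 0.1 are assumptions chosen purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor, one positive, and a list of negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Hypothetical 2-D embeddings.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]        # augmented view of the same molecule
easy_negative = [-1.0, 0.0]  # clearly different from the anchor
hard_negative = [0.8, 0.6]   # close to the anchor, but a different sample

loss_easy = info_nce(anchor, positive, [easy_negative])
loss_hard = info_nce(anchor, positive, [hard_negative])
```

With these toy vectors, `loss_hard` is larger than `loss_easy`: a negative that lies close to the anchor contributes a large term to the denominator, so the encoder has to work harder to separate it from the positive.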

Citations: 0
ERL-ProLiGraph: Enhanced representation learning on protein-ligand graph structured data for binding affinity prediction.
IF 2.8 | Zone 4 (Medicine) | Q3, Chemistry, Medicinal | Pub Date: 2024-10-15 | DOI: 10.1002/minf.202400044
Gloria Geine Paendong, Soualihou Ngnamsie Njimbouom, Candra Zonyfar, Jeong-Dong Kim

Predicting Protein-Ligand Binding Affinity (PLBA) is pivotal in drug development, as accurate estimates of PLBA expedite the identification of promising drug candidates for specific targets, thereby accelerating the drug discovery process. Despite substantial advances in PLBA prediction, developing an efficient and more accurate method remains non-trivial. Unlike previous computer-aided PLBA studies, which primarily use ligand SMILES and protein sequences represented as strings, this research introduces a deep learning-based method, Enhanced Representation Learning on Protein-Ligand Graph Structured data for Binding Affinity Prediction (ERL-ProLiGraph). The unique aspect of this method is the use of graph representations for both proteins and ligands, with the aim of learning the structural information contained in both to enhance the accuracy of PLBA predictions. In these graphs, nodes represent atomic structures, while edges depict chemical bonds and spatial relationships. The proposed model, leveraging deep learning algorithms, effectively learns to correlate these graph representations with binding affinities. This graph-based representation approach enhances the model's ability to capture the complex molecular interactions critical in PLBA. This work represents a promising advance in computational techniques for protein-ligand binding prediction, offering a potential path toward more efficient and accurate predictions in drug development. Comparative analysis indicates that the proposed ERL-ProLiGraph outperforms previous models, showcasing notable efficacy and providing a more suitable approach for accurate PLBA predictions.
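The graph encoding mentioned in the abstract, with atoms as nodes and edges for chemical bonds and spatial relationships, might be sketched as follows. The `MolGraph` class, the distance cutoff, and the coordinates are hypothetical and serve only to show the data structure. The abstract does not describe how ERL-ProLiGraph actually builds its graphs.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MolGraph:
    """Hypothetical container: nodes are atoms, edges are bonds or spatial contacts."""
    atoms: list                               # element symbol per node
    coords: list                              # (x, y, z) per node
    bonds: set = field(default_factory=set)   # covalent edges as (i, j) with i < j

    def spatial_edges(self, cutoff: float) -> set:
        """Non-bonded atom pairs closer than `cutoff` (same units as coords)."""
        edges = set()
        n = len(self.atoms)
        for i in range(n):
            for j in range(i + 1, n):
                if (i, j) in self.bonds:
                    continue
                if math.dist(self.coords[i], self.coords[j]) < cutoff:
                    edges.add((i, j))
        return edges


# Hypothetical three-atom fragment with made-up coordinates.
g = MolGraph(
    atoms=["C", "O", "N"],
    coords=[(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (2.0, 1.0, 0.0)],
    bonds={(0, 1)},
)
contacts = g.spatial_edges(cutoff=2.5)
```

Here the bonded pair (0, 1) is kept as a covalent edge, while (0, 2) and (1, 2) are added as spatial edges because they fall within the cutoff. A graph neural network could then treat the two edge types differently.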

Citations: 0
Review of the 8th autumn school in chemoinformatics.
IF 2.8 | Zone 4 (Medicine) | Q3, Chemistry, Medicinal | Pub Date: 2024-10-15 | DOI: 10.1002/minf.202400037
Johann Gasteiger

This paper gives an overview of the lectures and posters presented at the 8th Autumn School in Chemoinformatics held in Nara, Japan on 28th - 30th November 2023. The topics ranged from the study of chemical reactions through drug design and the use of Chemical Language Models and electronic structure informatics to the modeling of materials. In addition, a brief overview of the 50 years of work in chemoinformatics by Johann Gasteiger is given with an emphasis on the essential decisions during his scientific career.

Citations: 0
Navigating a 1E+60 Chemical Space of Peptide/Peptoid Oligomers.
IF 2.8 | Zone 4 (Medicine) | Q3, Chemistry, Medicinal | Pub Date: 2024-10-10 | DOI: 10.1002/minf.202400186
Markus Orsi, Jean-Louis Reymond

Herein we report a virtual library of 1E+60 members, a common estimate for the size of the drug-like chemical space. The library consists of linear or cyclic oligomers forming molecules within the size range of peptide drugs. We demonstrate ligand-based virtual screening using a genetic algorithm.
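A library of this size cannot be enumerated, which is why a search heuristic such as a genetic algorithm is needed. The sketch below first shows one way a count of 1E+60 can arise (100 building blocks in a linear 30-mer, both numbers being assumptions and not the paper's values) and then runs a toy genetic algorithm with elitism over such sequences. The fitness function, a simple positional match to a reference sequence, is a hypothetical stand-in for a real ligand-based similarity score.

```python
import random

BLOCKS = list(range(100))  # hypothetical: 100 building blocks
LENGTH = 30                # hypothetical: linear 30-mers
library_size = len(BLOCKS) ** LENGTH  # 100**30 linear sequences


def fitness(seq, reference):
    """Toy ligand-based score: fraction of positions matching the reference."""
    return sum(a == b for a, b in zip(seq, reference)) / len(reference)


def evolve(reference, pop_size=50, generations=40, mutation_rate=0.05, seed=0):
    """Run a small genetic algorithm; return the best fitness seen in each generation."""
    rng = random.Random(seed)
    pop = [[rng.choice(BLOCKS) for _ in range(LENGTH)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda s: fitness(s, reference), reverse=True)
        history.append(fitness(pop[0], reference))
        parents = pop[: pop_size // 2]
        children = [pop[0]]  # elitism: carry the best sequence over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, LENGTH)
            child = a[:cut] + b[cut:]  # one-point crossover
            # Point mutation: occasionally swap in a random building block.
            child = [
                rng.choice(BLOCKS) if rng.random() < mutation_rate else x
                for x in child
            ]
            children.append(child)
        pop = children
    return history


ref_rng = random.Random(1)
reference = [ref_rng.choice(BLOCKS) for _ in range(LENGTH)]
history = evolve(reference)
```

Because the best individual is always carried into the next generation, the values in `history` never decrease. Each generation evaluates only `pop_size` sequences, so the search touches a vanishingly small fraction of the 10^60 possibilities.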

Citations: 0