Molecular Informatics: Latest Articles

ERL-ProLiGraph: Enhanced representation learning on protein-ligand graph structured data for binding affinity prediction.
IF 2.8 Zone 4 Medicine Q3 CHEMISTRY, MEDICINAL Pub Date: 2024-12-01 Epub Date: 2024-10-15 DOI: 10.1002/minf.202400044
Gloria Geine Paendong, Soualihou Ngnamsie Njimbouom, Candra Zonyfar, Jeong-Dong Kim

Predicting Protein-Ligand Binding Affinity (PLBA) is pivotal in drug development, as accurate estimations of PLBA expedite the identification of promising drug candidates for specific targets, thereby accelerating the drug discovery process. Despite substantial advancements in PLBA prediction, developing an efficient and more accurate method remains non-trivial. Unlike previous computer-aided PLBA studies, which primarily used ligand SMILES and protein sequences represented as strings, this research introduces a Deep Learning-based method, the Enhanced Representation Learning on Protein-Ligand Graph Structured data for Binding Affinity Prediction (ERL-ProLiGraph). The unique aspect of this method is the use of graph representations for both proteins and ligands, with the aim of learning structural information from both to enhance the accuracy of PLBA predictions. In these graphs, nodes represent atomic structures, while edges depict chemical bonds and spatial relationships. The proposed model, leveraging deep-learning algorithms, effectively learns to correlate these graph representations with binding affinities. This graph-based representation approach enhances the model's ability to capture the complex molecular interactions critical in PLBA. This work represents a promising advancement in computational techniques for protein-ligand binding prediction, offering a potential path toward more efficient and accurate predictions in drug development. Comparative analysis indicates that the proposed ERL-ProLiGraph outperforms previous models, showcasing notable efficacy and providing a more suitable approach for accurate PLBA predictions.
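
The graph encoding described in this abstract (nodes for atoms, edges for chemical bonds and spatial relationships) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_graph`, the 4.0 angstrom distance cutoff, and the toy coordinates are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import dist


def build_graph(elements, coords, bonds, cutoff=4.0):
    """Return (nodes, edges) for one molecule.

    nodes: list of element symbols, one per atom.
    edges: dict mapping an index pair (i, j), i < j, to "bond" or "spatial".
    The cutoff (in angstroms) is an illustrative value, not taken from the paper.
    """
    nodes = list(elements)
    edges = {}
    # Covalent bonds come straight from the supplied bond list.
    for i, j in bonds:
        edges[(min(i, j), max(i, j))] = "bond"
    # Any other atom pair within the cutoff distance gets a spatial edge.
    for i, j in combinations(range(len(nodes)), 2):
        if (i, j) not in edges and dist(coords[i], coords[j]) <= cutoff:
            edges[(i, j)] = "spatial"
    return nodes, edges


# Toy fragment (hypothetical): four atoms on a line, one covalent bond.
elements = ["C", "O", "N", "S"]
coords = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
nodes, edges = build_graph(elements, coords, bonds=[(0, 1)])
print(edges)
```

A graph neural network would consume such node and edge lists after converting them to numeric feature tensors; that step is omitted here because the abstract does not specify the features used.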

Citations: 0
The freedom space - a new set of commercially available molecules for hit discovery.
IF 2.8 Zone 4 Medicine Q3 CHEMISTRY, MEDICINAL Pub Date: 2024-12-01 Epub Date: 2024-08-22 DOI: 10.1002/minf.202400114
Mykola V Protopopov, Valentyna V Tararina, Fanny Bonachera, Igor M Dzyuba, Anna Kapeliukha, Serhii Hlotov, Oleksii Chuk, Gilles Marcou, Olga Klimchuk, Dragos Horvath, Erik Yeghyan, Olena Savych, Olga O Tarkhanova, Alexandre Varnek, Yurii S Moroz

The advent of high-performance virtual screening techniques allows drug designers to explore ultra-large sets of candidate compounds in search of molecules predicted to have desired properties. However, the success of such an endeavor relies heavily on the pertinence (drug-likeness and, foremost, chemical feasibility) of these candidates; otherwise, by the garbage in/garbage out principle, virtual screening will return valueless "hits". The huge popularity of the judiciously enumerated Enamine REAL Space is clear proof of the strength of this Big Data trend in drug discovery. Here we describe a new dataset of make-on-demand compounds called the Freedom Space. It follows the principles of Enamine REAL Space and contains highly feasible molecules (synthesis success rate over 75 percent). However, scaffold and chemography analysis revealed significant differences from both the REAL Space and the biologically annotated compounds in the ChEMBL database. The Freedom Space is a significant extension of the REAL Space and can be utilized for a more comprehensive exploration of the synthetically feasible chemical space in hit finding and hit-to-lead campaigns.

Citations: 0
Review of the 8th autumn school in chemoinformatics.
IF 2.8 Zone 4 Medicine Q3 CHEMISTRY, MEDICINAL Pub Date: 2024-12-01 Epub Date: 2024-10-15 DOI: 10.1002/minf.202400037
Johann Gasteiger

This paper gives an overview of the lectures and posters presented at the 8th Autumn School in Chemoinformatics, held in Nara, Japan, on 28-30 November 2023. The topics ranged from the study of chemical reactions through drug design and the use of Chemical Language Models and electronic structure informatics to the modeling of materials. In addition, a brief overview of Johann Gasteiger's 50 years of work in chemoinformatics is given, with an emphasis on the essential decisions during his scientific career.

Citations: 0
DIGEP-Pred 2.0: A web application for predicting drug-induced cell signaling and gene expression changes.
IF 2.8 Zone 4 Medicine Q3 CHEMISTRY, MEDICINAL Pub Date: 2024-12-01 Epub Date: 2024-07-09 DOI: 10.1002/minf.202400032
Sergey M Ivanov, Anastasia V Rudik, Alexey A Lagunin, Dmitry A Filimonov, Vladimir V Poroikov

The analysis of drug-induced gene expression profiles (DIGEP) is widely used to estimate the potential therapeutic and adverse drug effects as well as the molecular mechanisms of drug action. However, the corresponding experimental data is absent for many existing drugs and drug-like compounds. To solve this problem, we created the DIGEP-Pred 2.0 web application, which predicts DIGEP and potential drug targets from the structural formula of drug-like compounds. It is based on the combined use of structure-activity relationships (SARs) and network analysis. SAR models were created using PASS (Prediction of Activity Spectra for Substances) technology for data from the Comparative Toxicogenomics Database (CTD), the Connectivity Map (CMap) for the prediction of DIGEP, and PubChem and ChEMBL for the prediction of molecular mechanisms of action (MoA). Using only the structural formula of a compound, the user can obtain information on potential gene expression changes in several cell lines and drug targets, which are potential master regulators responsible for the observed DIGEP. The mean accuracy of prediction calculated by leave-one-out cross validation was 86.5 % for 13377 genes and 94.8 % for 2932 proteins (CTD data), and it was 97.9 % for 2170 MoAs. SAR models (mean accuracy 87.5 %) were also created for CMap data on the MCF7, PC3, and HL60 cell lines with different threshold values for the logarithm of fold changes: 0.5, 0.7, 1, 1.5, and 2. Additionally, the data on pathways (KEGG, Reactome), biological processes of Gene Ontology, and diseases (DisGeNet) enriched by the predicted genes, together with the estimation of target-master regulators based on OmniPath data, is also provided. The DIGEP-Pred 2.0 web application is freely available at https://www.way2drug.com/digep-pred.
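
The leave-one-out cross validation mentioned in this abstract can be sketched generically as follows. This is only an illustration of the validation scheme: the one-nearest-neighbour predictor is a hypothetical stand-in and is not the PASS algorithm, the plain fraction-correct score may differ from the accuracy measure the authors report, and the toy data are invented.

```python
def loo_accuracy(samples, labels, predict):
    """Leave-one-out cross validation: hold out each sample once,
    fit on the rest, and report the fraction predicted correctly."""
    hits = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            hits += 1
    return hits / len(samples)


def nearest_neighbour(train_x, train_y, query):
    """Hypothetical stand-in model: return the label of the closest training sample."""
    best = min(range(len(train_x)), key=lambda k: abs(train_x[k] - query))
    return train_y[best]


# Invented one-dimensional descriptor values and binary activity labels.
samples = [10, 20, 30, 34, 90, 100, 110]
labels = [0, 0, 0, 1, 1, 1, 1]
print(loo_accuracy(samples, labels, nearest_neighbour))
```

In this toy set the samples at 30 and 34 sit next to each other with opposite labels, so each is misclassified when held out and the score is 5 out of 7.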

Citations: 0
Pathway-based prediction of the therapeutic effects and mode of action of custom-made multiherbal medicines.
IF 2.8 Zone 4 Medicine Q3 CHEMISTRY, MEDICINAL Pub Date: 2024-11-01 Epub Date: 2024-10-15 DOI: 10.1002/minf.202400108
Akihiro Ezoe, Yuki Shimada, Ryusuke Sawada, Akihiro Douke, Tomokazu Shibata, Makoto Kadowaki, Yoshihiro Yamanishi

Multiherbal medicines are traditionally used as personalized medicines with custom combinations of crude drugs; however, the mechanisms of multiherbal medicines are unclear. In this study, we developed a novel pathway-based method to predict therapeutic effects and the mode of action of custom-made multiherbal medicines using machine learning. This method considers disease-related pathways as therapeutic targets and evaluates the comprehensive influence of constituent compounds on their potential target proteins in the disease-related pathways. Our proposed method enabled us to comprehensively predict new indications of 194 Kampo medicines for 87 diseases. Using Kampo-induced transcriptomic data, we demonstrated that Kampo constituent compounds stimulated the disease-related proteins and a customized Kampo formula enhanced the efficacy compared with an existing Kampo formula. The proposed method will be useful for discovering effective Kampo medicines and optimizing custom-made multiherbal medicines in practice.

Citations: 0
BIOMX-DB: A web application for the BIOFACQUIM natural product database.
IF 2.8 Zone 4 Medicine Q3 CHEMISTRY, MEDICINAL Pub Date: 2024-11-01 Epub Date: 2024-06-05 DOI: 10.1002/minf.202400060
Fernando Martínez-Urrutia, José L Medina-Franco

Natural product databases are an integral part of chemoinformatics and computer-aided drug design. Despite their pivotal role, few projects in Latin America, particularly in Mexico, provide accessible tools of this nature. Herein, we introduce BIOMX-DB, an open and freely accessible web-based database designed to address this gap. BIOMX-DB enhances the features of the existing Mexican natural product database, BIOFACQUIM, by incorporating advanced search, filtering, and download capabilities. The user-friendly interface of BIOMX-DB aims to provide an intuitive experience for researchers. BIOMX-DB is freely available at www.biomx-db.com.

Citations: 0
Chemoinformatics for corrosion science: Data-driven modeling of corrosion inhibition by organic molecules.
IF 2.8 Zone 4 Medicine Q3 CHEMISTRY, MEDICINAL Pub Date: 2024-11-01 Epub Date: 2024-10-15 DOI: 10.1002/minf.202400082
Igor Baskin, Yair Ein-Eli

This paper reviews the application of machine learning to the inhibition of corrosion by organic molecules. The methodologies considered include quantitative structure-property relationships (QSPR) and related data-driven approaches. The characteristic features of their key components are examined as applied to corrosion inhibition, including datasets, response properties, molecular descriptors, machine learning methods, and structure-property models. It is shown that the most important factors determining their choice and application features are: (1) the small or very small size of datasets, (2) the mechanism of corrosion inhibition associated with the adsorption of inhibitor molecules on the metal surface, and (3) the multifactorial conditioning and noisiness of the response property. On this basis, the application of machine learning to the inhibition of corrosion of materials based on iron, aluminum, and magnesium is considered. The main trends in the development of QSPR and related data-driven modeling of corrosion inhibition are discussed, the shortcomings and common errors are examined, and the prospects for their further development are outlined.

Citations: 0
My 50 Years with Chemoinformatics.
IF 2.8 Zone 4 Medicine Q3 CHEMISTRY, MEDICINAL Pub Date: 2024-11-01 Epub Date: 2024-10-15 DOI: 10.1002/minf.202400036
Johann Gasteiger
Citations: 0
Multi-Task ADME/PK prediction at industrial scale: leveraging large and diverse experimental datasets.
IF 2.8 Zone 4 Medicine Q3 CHEMISTRY, MEDICINAL Pub Date: 2024-10-01 Epub Date: 2024-07-08 DOI: 10.1002/minf.202400079
Moritz Walter, Jens M Borghardt, Lina Humbeck, Miha Skalic

ADME (Absorption, Distribution, Metabolism, Excretion) properties are key parameters to judge whether a drug candidate exhibits a desired pharmacokinetic (PK) profile. In this study, we tested multi-task machine learning (ML) models to predict ADME and animal PK endpoints trained on in-house data generated at Boehringer Ingelheim. Models were evaluated both at the design stage of a compound (i.e., no experimental data of test compounds available) and at the testing stage when a particular assay would be conducted (i.e., experimental data of earlier conducted assays may be available). Using realistic time-splits, we found a clear benefit in performance of multi-task graph-based neural network models over single-task models, which was even stronger when experimental data of earlier assays was available. In an attempt to explain the success of multi-task models, we found that especially endpoints with the largest numbers of data points (physicochemical endpoints, clearance in microsomes) are responsible for increased predictivity in more complex ADME and PK endpoints. In summary, our study provides insight into how data for multiple ADME/PK endpoints in a pharmaceutical company can be best leveraged to optimize predictivity of ML models.
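
The time-split evaluation mentioned in this abstract can be sketched as follows: compounds measured before a cutoff date form the training set and later ones form the test set, which mimics predicting for compounds that do not exist yet. The record fields (`compound`, `measured`, `value`), the cutoff date, and the data are hypothetical; the abstract does not describe the authors' actual splitting code.

```python
from datetime import date


def time_split(records, cutoff):
    """Split assay records into (train, test) by measurement date.

    Records measured strictly before the cutoff go to train;
    records measured on or after the cutoff go to test.
    """
    train = [r for r in records if r["measured"] < cutoff]
    test = [r for r in records if r["measured"] >= cutoff]
    return train, test


# Invented example records with hypothetical field names.
records = [
    {"compound": "cmpd-1", "measured": date(2021, 3, 1), "value": 0.8},
    {"compound": "cmpd-2", "measured": date(2022, 6, 15), "value": 1.4},
    {"compound": "cmpd-3", "measured": date(2023, 1, 10), "value": 0.2},
]
train, test = time_split(records, cutoff=date(2022, 1, 1))
print([r["compound"] for r in train], [r["compound"] for r in test])
```

Compared with a random split, a split by date avoids training on compounds that are chemically close successors of the test compounds, so it usually gives a less optimistic estimate of performance.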

Distinct binding hotspots for natural and synthetic agonists of FFA4 from in silico approaches.
IF 2.8 Zone 4 (Medicine) Q3 CHEMISTRY, MEDICINAL Pub Date: 2024-10-01 Epub Date: 2024-07-24 DOI: 10.1002/minf.202400046
Guillaume Patient, Corentin Bedart, Naim A Khan, Nicolas Renault, Amaury Farce

FFA4 has attracted growing interest since its deorphanization in 2005 and the characterization of the free fatty acid receptor family for its therapeutic potential in metabolic disorders. The expression of FFA4 (also known as GPR120) in numerous organs throughout the human body makes this receptor a highly potent target, particularly in fat sensing and diet preference. This offers an attractive approach to tackling obesity and related metabolic diseases. Recent cryo-EM structures of the receptor have provided valuable information on a potential active state, although previous studies of FFA4 presented diverging information. We performed molecular docking and molecular dynamics simulations of four agonist ligands (TUG-891, linoleic acid, α-linolenic acid, and oleic acid) based on a homology model. Our simulations, totaling 2 μs, highlighted two binding hotspots at Arg99^2.64 and Lys293 (ECL3). The results indicate that these residues are located in separate areas of the binding pocket and interact with various types of ligands, implying different potential active states of FFA4 and a highly adaptable intra-receptor binding pocket. This article proposes additional structural characteristics and mechanisms for agonist binding that complement the experimental structures.
