Molecular Informatics: Latest Articles

Modeling Carbon Basicity.
IF 2.8 | Zone 4, Medicine | Q3 CHEMISTRY, MEDICINAL | Pub Date: 2025-03-01 | DOI: 10.1002/minf.202400296
Robert Fraczkiewicz, Marvin Waldman

This work presents a predictive model of aqueous ionization constants (pKa) of protonatable carbons in certain aromatic rings. The phenomenon of carbon atoms sometimes acting as a stable and reversible base, accepting a proton in aqueous solution, is surprisingly little recognized in medicinal chemistry, although it has been known to general chemists for more than 60 years. We present the development and results of two predictive models: 1) identifying the most basic carbon in a ring, and 2) calculating the resulting microscopic pKa value. Both models were incorporated into our global (i.e., taking all ionizable groups into account) S+pKa model [1-2].
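
To make concrete what a microscopic pKa value implies for a single basic site, the following minimal sketch applies the Henderson-Hasselbalch relation to estimate the fraction of that site that is protonated at a given pH. It is not the S+pKa model and takes nothing from the paper; the function name and the example pKa of 5.0 are hypothetical.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of one basic site present as its conjugate acid (BH+).

    Henderson-Hasselbalch for a single site:
        [BH+] / ([B] + [BH+]) = 1 / (1 + 10**(pH - pKa))
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    example_pka = 5.0  # hypothetical value, chosen only for illustration
    for ph in (3.0, 5.0, 7.4):
        frac = fraction_protonated(example_pka, ph)
        print(f"pH {ph:.1f}: {frac:.3f} protonated")
```

At pH equal to the pKa the site is half protonated; two pH units below the pKa it is about 99% protonated.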

Machine Learning in Drug Development for Neurological Diseases: A Review of Blood Brain Barrier Permeability Prediction Models.
IF 2.8 | Zone 4, Medicine | Q3 CHEMISTRY, MEDICINAL | Pub Date: 2025-03-01 | DOI: 10.1002/minf.202400325
Aryon Eckleel Nabi, Pedram Pouladvand, Litian Liu, Ning Hua, Cyrus Ayubcha

The blood brain barrier (BBB) is an endothelial-derived structure that restricts the movement of certain molecules between the general somatic circulatory system and the central nervous system (CNS). While the BBB maintains homeostasis by regulating the molecular environment induced by cerebrovascular perfusion, it also presents significant challenges in developing therapeutics intended to act on CNS targets. Many drug development practices rely partly on extensive cell and animal models to predict, to an extent, whether prospective therapeutic molecules can cross the BBB. In the interest of reducing costs and improving prediction accuracy, many propose using advanced computational modeling of BBB permeability profiles that leverages empirical data. Given the scale of growth in machine learning and deep learning, we review the most recent machine learning approaches to predicting BBB permeability.

A Molecular Representation to Identify Isofunctional Molecules.
IF 2.8 | Zone 4, Medicine | Q3 CHEMISTRY, MEDICINAL | Pub Date: 2025-03-01 | DOI: 10.1002/minf.202400159
Philippe Pinel, Gwenn Guichaoua, Nicolas Devaux, Yann Gaston-Mathé, Brice Hoffmann, Véronique Stoven

The challenges of drug discovery, from hit identification to clinical development, sometimes involve addressing scaffold hopping issues in order to optimise molecular biological activity or ADME properties, or to mitigate toxicology concerns of a drug candidate. Docking is usually viewed as the method of choice for identification of isofunctional molecules, i.e., highly dissimilar molecules that share common binding modes with a protein target. However, the structure of the protein may not be suitable for docking because of low resolution, or may even be unknown. This problem is frequently encountered in the case of membrane proteins, although they constitute an important category of the druggable proteome. In such cases, ligand-based approaches offer promise but are often inadequate to handle large-step scaffold hopping, because they usually rely on molecular structure. Therefore, we propose the Interaction Fingerprints Profile (IFPP), a molecular representation that captures molecules' binding modes based on docking experiments against a panel of diverse high-quality protein structures. Evaluation on the LH benchmark demonstrates the interest of IFPP for identification of isofunctional molecules. Nevertheless, computation of IFPPs is expensive, which limits their scalability for screening very large molecular libraries. We propose to overcome this limitation by leveraging metric learning approaches, allowing fast estimation of molecules' IFPP similarities and thus providing an efficient pre-screening strategy that is applicable to very large molecular libraries. Overall, our results suggest that IFPP provides an interesting and complementary tool alongside existing methods to address challenging scaffold hopping problems effectively in drug discovery.
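
The idea of comparing molecules through their interaction profiles across a protein panel can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify how an IFPP is encoded or compared, so the encoding (a set of residue/interaction-type pairs per protein), the use of Tanimoto similarity, the averaging over the panel, and all names and labels below are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical encoding: one interaction = (residue label, interaction type).
Interaction = Tuple[str, str]
# Hypothetical profile: protein id -> interactions observed in the docked pose.
Profile = Dict[str, FrozenSet[Interaction]]


def tanimoto(a: FrozenSet[Interaction], b: FrozenSet[Interaction]) -> float:
    """Tanimoto (Jaccard) similarity of two interaction sets."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty fingerprints are identical
    return len(a & b) / len(a | b)


def profile_similarity(p: Profile, q: Profile) -> float:
    """Mean per-protein Tanimoto similarity over the proteins both profiles cover."""
    shared = sorted(set(p) & set(q))
    if not shared:
        raise ValueError("profiles share no proteins")
    return sum(tanimoto(p[k], q[k]) for k in shared) / len(shared)


if __name__ == "__main__":
    mol_a: Profile = {
        "P1": frozenset({("ASP86", "hbond"), ("PHE80", "pi_stack")}),
        "P2": frozenset({("LYS33", "ionic")}),
    }
    mol_b: Profile = {
        "P1": frozenset({("ASP86", "hbond")}),
        "P2": frozenset({("LYS33", "ionic")}),
    }
    print(profile_similarity(mol_a, mol_b))
```

Because the comparison uses only interactions, two molecules with very different structures can still score as similar, which is the property the abstract relies on for scaffold hopping.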

CoLiNN: A Tool for Fast Chemical Space Visualization of Combinatorial Libraries Without Enumeration.
IF 2.8 | Zone 4, Medicine | Q3 CHEMISTRY, MEDICINAL | Pub Date: 2025-03-01 | DOI: 10.1002/minf.202400263
Regina Pikalyova, Tagir Akhmetshin, Dragos Horvath, Alexandre Varnek

Visualization of the combinatorial library chemical space provides a comprehensive overview of available compound classes, their diversity, and physicochemical property distribution, which are key factors in drug discovery. Typically, this visualization requires time- and resource-consuming compound enumeration, standardization, descriptor calculation, and dimensionality reduction. In this study, we present the Combinatorial Library Neural Network (CoLiNN), designed to predict the projection of compounds on a 2D chemical space map using only their building blocks and reaction information, thus eliminating the need for compound enumeration. Trained on 2.5K virtual DNA-Encoded Libraries (DELs), CoLiNN demonstrated high predictive performance, accurately predicting the compound position on Generative Topographic Maps (GTMs). GTMs predicted by CoLiNN were found to be very similar to the maps built for enumerated structures. In the library comparison task, we compared the GTMs of DELs and the ChEMBL database. The similarity-based DELs/ChEMBL rankings obtained with "true" and CoLiNN-predicted GTMs were consistent. Therefore, CoLiNN has the potential to become the go-to tool for combinatorial compound library design, as it can explore the library design space more efficiently by skipping compound enumeration.

Molecular Odor Prediction Using Olfactory Receptor Information.
IF 2.8 | Zone 4, Medicine | Q3 CHEMISTRY, MEDICINAL | Pub Date: 2025-03-01 | DOI: 10.1002/minf.202400274
Yuta Wakutsu, Hiromasa Kaneko

In fragrance development, the framework development process is a bottleneck from the perspective of labor, cost, and human resource development. Odors vary greatly depending on the structure and functional groups of the molecule. Although odor has been predicted from only the structure of molecules, its practical application remains elusive. In this study, we developed a model for predicting the odor of molecules that have only small differences in structure. Focusing on the mechanism of human olfaction, we divided the mechanism into three levels and constructed three models: a classification model that predicts the presence or absence of binding between molecules and olfactory receptors, a regression model that predicts the strength of binding, and a classification model that predicts the presence or absence of odor based on the strength of binding. Olfactory receptors were used as descriptors to discriminate between similar molecular odors. Our models predicted odor differences between some similar molecules, including optical isomers.
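
The three-level structure described above (binding classification, binding-strength regression, odor classification) can be sketched as a simple cascade. This is a minimal sketch, not the authors' implementation: the three models are passed in as plain callables, the choice to assign zero strength to predicted non-binders is an assumption, and all function and receptor names are hypothetical.

```python
from typing import Callable, Dict, Sequence

BindsModel = Callable[[str, str], bool]            # level 1: does the molecule bind the receptor?
StrengthModel = Callable[[str, str], float]        # level 2: how strongly does it bind?
OdorModel = Callable[[Dict[str, float]], bool]     # level 3: odor present, given all strengths?


def receptor_profile(
    molecule: str,
    receptors: Sequence[str],
    binds: BindsModel,
    strength: StrengthModel,
) -> Dict[str, float]:
    """Build a per-receptor descriptor vector from levels 1 and 2."""
    profile: Dict[str, float] = {}
    for receptor in receptors:
        if binds(molecule, receptor):
            profile[receptor] = strength(molecule, receptor)
        else:
            profile[receptor] = 0.0  # assumption: predicted non-binders contribute zero
    return profile


def predict_odor(
    molecule: str,
    receptors: Sequence[str],
    binds: BindsModel,
    strength: StrengthModel,
    has_odor: OdorModel,
) -> bool:
    """Level 3: classify odor presence from the receptor-binding profile."""
    return has_odor(receptor_profile(molecule, receptors, binds, strength))
```

In practice each callable would wrap a trained model; keeping them separate makes it easy to inspect the intermediate receptor profile, which is the descriptor that distinguishes structurally similar molecules.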

Comparing Explanations of Molecular Machine Learning Models Generated with Different Methods for the Calculation of Shapley Values.
IF 2.8 | Zone 4, Medicine | Q3 CHEMISTRY, MEDICINAL | Pub Date: 2025-03-01 | DOI: 10.1002/minf.202500067
Alec Lamens, Jürgen Bajorath

Feature attribution methods from explainable artificial intelligence (XAI) provide explanations of machine learning models by quantifying feature importance for predictions of test instances. While features determining individual predictions have frequently been identified in machine learning applications, the consistency of feature importance-based explanations of machine learning models using different attribution methods has not been thoroughly investigated. We have systematically compared model explanations in molecular machine learning. To this end, a test system of highly accurate compound activity predictions for different targets using different machine learning methods was generated. For these predictions, explanations were computed using methodological variants of the Shapley value formalism, a popular feature attribution approach in machine learning adapted from game theory. Predictions of each model were assessed using a model-agnostic and a model-specific Shapley value-based method. The resulting feature importance distributions were characterized and compared by a global statistical analysis using diverse measures. Unexpectedly, methodological variants for Shapley value calculations yielded distinct feature importance distributions for highly accurate predictions. There was little agreement between alternative model explanations. Our findings suggest that feature importance-based explanations of machine learning predictions should include an assessment of consistency using alternative methods.
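
For readers unfamiliar with the Shapley value formalism itself, the following minimal sketch computes exact Shapley values by enumerating all feature subsets for a toy value function. It illustrates only the game-theoretic definition, not the model-agnostic or model-specific methods compared in the paper; the toy function and feature names are hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, Sequence


def exact_shapley(
    features: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley values: weighted average marginal contribution over all subsets."""
    n = len(features)
    phi: Dict[str, float] = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # k = size of the subset that excludes f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_value(present: FrozenSet[str]) -> float:
    """Hypothetical 'model output' when only the given features are present."""
    score = 0.0
    if "a" in present:
        score += 1.0
    if "b" in present:
        score += 2.0
    if "a" in present and "b" in present:
        score += 1.0  # interaction term, shared equally between a and b
    return score


if __name__ == "__main__":
    print(exact_shapley(["a", "b", "c"], toy_value))
```

Exact enumeration scales exponentially with the number of features, which is why practical methods approximate it or exploit model structure; differences in those approximations are one plausible source of the inconsistencies the abstract reports.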

An Integrated Fuzzy Neural Network and Topological Data Analysis for Molecular Graph Representation Learning and Property Forecasting.
IF 2.8 | Zone 4, Medicine | Q3 CHEMISTRY, MEDICINAL | Pub Date: 2025-03-01 | DOI: 10.1002/minf.202400335
Phu Pham

Over the past decade, graph neural networks (GNNs) have emerged as a powerful neural architecture for various graph-structured data modelling and task-driven representation learning problems. Recent studies have highlighted the remarkable capabilities of GNNs in handling complex graph representation learning tasks, achieving state-of-the-art results in node/graph classification, regression, and generation. However, most traditional GNN-based architectures, such as GCN and GraphSAGE, still face several challenges in preserving multi-scaled topological structures. These models primarily focus on capturing local neighborhood information, often failing to retain global structural features essential for graph-level representation and classification tasks. Furthermore, their expressiveness is limited when learning topological structures in complex molecular graph datasets. To overcome these limitations, we propose a novel graph neural architecture, named FTPG, that integrates a neuro-fuzzy network with a topological graph learning approach. Specifically, FTPG introduces a novel approach to molecular graph representation and property prediction by integrating multi-scaled topological graph learning with advanced neural components. The architecture employs separate graph neural learning modules to capture both local graph-based structures and global topological features. Moreover, to address feature uncertainty in the global-view representation, a multi-layered neuro-fuzzy network is incorporated into the model to enhance the robustness and expressiveness of the learned molecular graph embeddings. This combined approach leverages the strengths of multi-view and multi-modal neural learning, enabling FTPG to deliver superior performance in molecular graph tasks. Extensive experiments on real-world/benchmark molecular datasets demonstrate the effectiveness of the proposed FTPG model. It consistently outperforms state-of-the-art GNN-based baselines from different categories, including canonical local-proximity message-passing, graph transformer-based, and topology-driven approaches.

Discovery of New HER2 Inhibitors via Computational Docking, Pharmacophore Modeling, and Machine Learning.
IF 2.8 | Zone 4, Medicine | Q3 CHEMISTRY, MEDICINAL | Pub Date: 2025-02-01 | DOI: 10.1002/minf.202400336
Aseel Yasin Matrouk, Haneen Mohammad, Safa Daoud, Mutasem Omar Taha

The human epidermal growth factor receptor 2 (HER2) is a critical oncogene implicated in the development of various aggressive cancers, particularly breast cancer. Discovering novel HER2 inhibitors is crucial for expanding therapeutic options for HER2-related malignancies. In this study, we present a computational workflow that focuses on generating pharmacophores derived from docked poses of a selected list of 15 diverse, potent HER2 inhibitors, utilizing flexible docking. The resulting pharmacophores, along with other physicochemical molecular descriptors, were then evaluated in a machine learning-quantitative structure-activity relationship (ML-QSAR) analysis against 1,272 HER2 inhibitors. Several machine learning methods were assessed, and a genetic function algorithm (GFA) was employed for feature selection. Ultimately, GFA combined with Bagging and J48Graft classifiers produced the best self-consistent and predictive models. These models highlighted the significance of two pharmacophores, Hypo_1 and Hypo_2, in distinguishing potent from less active inhibitors. The successful ML-QSAR models and their associated pharmacophores were used to screen the National Cancer Institute (NCI) database for novel HER2 inhibitors. Three promising anti-HER2 leads were identified, with the top-performing lead demonstrating an experimental anti-HER2 IC50 value of 3.85 μM. Notably, the three inhibitors exhibited distinct chemical scaffolds compared to existing HER2 inhibitors, as indicated by principal component analysis.
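
Potencies such as the reported IC50 are often compared on a logarithmic scale in QSAR work. As a small worked example, and without implying that the authors used this transformation, the sketch below converts an IC50 in micromolar to pIC50 (the negative base-10 logarithm of the IC50 in mol/L). The function name is hypothetical; the only value taken from the abstract is 3.85 μM.

```python
import math


def pic50_from_ic50_um(ic50_um: float) -> float:
    """Convert an IC50 given in micromolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_um * 1e-6)


if __name__ == "__main__":
    # 3.85 uM is the IC50 reported for the top-performing lead in the abstract.
    print(f"pIC50 = {pic50_from_ic50_um(3.85):.2f}")
```

On this scale the top lead sits at roughly 5.4, and each unit increase corresponds to a tenfold gain in potency.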

MAYA (Multiple ActivitY Analyzer): An Open Access Tool to Explore Structure-Multiple Activity Relationships in the Chemical Universe.
IF 2.8 | Zone 4, Medicine | Q3 CHEMISTRY, MEDICINAL | Pub Date: 2025-02-01 | DOI: 10.1002/minf.202400306
J Israel Espinoza-Castañeda, José L Medina-Franco

Herein, we introduce MAYA (Multiple Activity Analyzer), a tool designed to automatically construct a chemical multiverse, generating multiple visualizations of the chemical spaces of a compound data set described by structural descriptors of different natures, such as Molecular ACCess Systems (MACCS) keys, extended connectivity fingerprints with different radii, molecular descriptors with pharmaceutical relevance, and bioactivity descriptors. These representations are integrated with various data visualization techniques for automated analysis focused on structure-multiple activity/property relationships, enabling analysis of various problem sets in user-friendly open-source software. The source code of MAYA is freely available on GitHub at https://github.com/IsrC11/MAYA.git.

Molecular Informatics, 2025, vol. 44, no. 2, e202400306. DOI: 10.1002/minf.202400306. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11812492/pdf/
Citations: 0
Predicting the Price of Molecules Using Their Predicted Synthetic Pathways.
IF 2.8 | Zone 4 (Medicine) | Q3 Chemistry, Medicinal | Pub Date: 2025-02-01 | DOI: 10.1002/minf.202400039
Massina Abderrahmane, Hamza Tajmouati, Vinicius Barros Ribeiro da Silva, Quentin Perron

Currently, numerous metrics allow chemists and computational chemists to refine and filter libraries of virtual molecules in order to prioritize their synthesis. Commonly used metrics and models include QSAR models, docking scores, diverse druggability metrics, and synthetic feasibility scores, to name only a few. To our knowledge, a function that estimates the price of a novel virtual molecule while taking into account the availability and price of its starting materials has not been considered before in the literature. Being able to make such a prediction could improve and accelerate decision-making related to the cost of goods. Taking advantage of recent advances in the field of Computer-Aided Synthetic Planning (CASP), we investigated whether the predicted retrosynthetic pathways of a given molecule and the prices of its associated starting materials could be good features for predicting the price of that compound. In this work, we present a deep learning model, RetroPriceNet, that predicts the price of molecules using their predicted synthetic pathways. On a holdout test set, the model achieves better performance than the state-of-the-art model. The developed approach takes into account the synthetic feasibility of molecules and the availability and prices of the starting materials.
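
To make the notion of pathway-derived features concrete, here is a minimal sketch of how a predicted route and its starting-material prices might be summarized as numeric inputs for a price model. It is not RetroPriceNet and is not taken from the paper: the `Route` class, the `route_features` function, the feature names, and the example prices are all hypothetical, and a missing price (`None`) is assumed to mean that a starting material is not purchasable.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Route:
    """Hypothetical summary of one predicted retrosynthetic route."""
    n_steps: int
    # Price per gram of each starting material; None = not purchasable.
    starting_material_prices: List[Optional[float]]


def route_features(route: Route) -> Dict[str, float]:
    """Turn a route into simple numeric features for a downstream price model."""
    prices = route.starting_material_prices
    known = [p for p in prices if p is not None]
    return {
        "n_steps": float(route.n_steps),
        "n_starting_materials": float(len(prices)),
        "n_unavailable": float(len(prices) - len(known)),
        "total_known_price": float(sum(known)),
        "max_known_price": float(max(known)) if known else 0.0,
    }


# Hypothetical example: a three-step route with one unavailable starting material.
example = Route(n_steps=3, starting_material_prices=[12.5, None, 4.0])
print(route_features(example))
```

Features of this kind capture both synthetic effort (number of steps) and material cost and availability, the two ingredients the abstract names; how the actual model encodes a pathway is not described in the abstract.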

Molecular Informatics, 2025, vol. 44, no. 2, e202400039. https://doi.org/10.1002/minf.202400039
Citations: 0