Molecular Informatics: Latest Articles, Page 2

Structural Flexibility and Shape Similarity Contribute to Exclusive Functions of Certain Atg8 Isoforms in the Autophagy Process.

IF 2.8 | Zone 4 (Medicine) | Q3 | CHEMISTRY, MEDICINAL

Molecular Informatics

Pub Date : 2025-07-01 DOI: 10.1002/minf.70004

Alexey Rayevsky, Eliah Bulgakov, Mariia Stykhylias, Sergey Ozheredov, Svetlana Spivak, Yaroslav Blume

Despite the abundance of systematically collected experimental data, the multistep process of autophagy still contains many open questions. One concerns the basis for the selectivity of interactions between certain autophagy-related protein 8 (ATG8) isoforms and their receptors/adaptors in plants during the autophagy process. By regulating phagophore initiation, expansion, and maturation, these proteins control the assembly of numerous autophagy proteins at this key docking platform. Bioinformatics analysis of human, yeast, and plant ATG8 amino acid sequences allows us to build a sequence tree of plant ATG8s, divided into three groups. We perform a structural study aimed at revealing some of the underlying reasons for the differences in the selectivity of ATG8 isoforms. A series of molecular dynamics (MD) simulations is performed to explain the stage-dependent functionality of ATG8. The conserved secondary structure and folding across all ATG8 proteins, resulting in nearly identical protein-protein interaction interfaces, make this study particularly important and interesting. Recognizing the dual role of the LC3-interacting region (LIR) in autophagosome biogenesis and recruitment of the anchored selective autophagy receptor (SAR), we perform a mobility domain analysis. To this end, the amino acid sequence associated with the LIR docking site (LDS) interface is localized and subjected to root mean square deviation (RMSD)-based clustering analysis. Starting from Atg8-targeted protein-peptide docking, we attempt to identify conformational changes in the contact region of the corresponding adaptors and receptors involved in the common biogenesis events in autophagy. For the molecular dynamics, we select three representatives that share common patterns with the other members of their groups. The resulting ATG8-peptide complexes show that different ATG8 isotypes have a significant preference for binding specific partners.
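The RMSD-based clustering mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm or cutoff the authors used, so the greedy leader scheme, the cutoff value, and the toy coordinates below are all assumptions; the frames are also assumed to be already superimposed on a common reference.

```python
import math

def rmsd(a, b):
    """RMSD between two conformations given as equal-length lists of (x, y, z).
    Assumes the frames are already superimposed on a common reference."""
    if len(a) != len(b):
        raise ValueError("conformations must have the same number of atoms")
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def leader_cluster(frames, cutoff):
    """Greedy leader clustering (an illustrative choice, not the paper's method):
    a frame joins the first cluster whose representative lies within `cutoff`,
    otherwise it starts a new cluster."""
    clusters = []  # list of (representative_index, [member_indices])
    for i, frame in enumerate(frames):
        for rep, members in clusters:
            if rmsd(frames[rep], frame) <= cutoff:
                members.append(i)
                break
        else:
            clusters.append((i, [i]))
    return clusters

# Hypothetical two-atom "interface" sampled in three frames.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)],
]
clusters = leader_cluster(frames, cutoff=0.5)
```

In a real analysis the coordinates would come from the LDS interface residues of each MD frame, and the cluster populations would indicate which interface conformations dominate.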

Volume 44, issue 7, article e202500025. Publication type: Journal Article.

Citations: 0

Network Analysis of the Organic Chemistry in Patents, Literature, and Pharmaceutical Industry.

IF 2.8 | Zone 4 (Medicine) | Q3 | CHEMISTRY, MEDICINAL

Molecular Informatics

Pub Date : 2025-07-01 DOI: 10.1002/minf.202500011

Emma Svensson, Emma Granqvist, Tomas Bastys, Christos Kannas, Mikhail Kabeshov, Samuel Genheden, Ola Engkvist, Thierry Kogej

Chemical reactions can be connected in large networks such as knowledge graphs. In this way, prior work has been able to draw meaningful conclusions about the properties and structures involved in organic chemistry reactions. However, the research has focused on public sources of organic synthesis that might lack the intricate details of the synthetic routes used in in-house drug discovery. In this work, previous analyses are expanded to also include an in-house electronic lab notebook (ELN) source, such that we can compare it to knowledge graphs that were constructed from US Patent and Trademark Office (USPTO) and Reaxys. We found that the Reaxys knowledge graph is the most interconnected and has the largest proportion of nodes belonging to the core, whereas the USPTO is much less connected and only has a small core. The ELN knowledge graph falls between these extremes in connectivity and it does not have any core. The hub molecules of ELN and USPTO are most similar, primarily represented by small, organic building blocks. We hypothesize that these differences can be attributed to the different origins of the data in the three sources. We discuss what impact this might have on synthesis prediction modelling.
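The notions of hub molecules and a network core can be made concrete with a small sketch. The abstract does not define either term precisely, so two assumptions are made here: hubs are the molecules with the highest degree, and the core is the largest strongly connected component of the reactant-to-product graph. The toy reactions and molecule names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy reactions: (reactants, products). Names are illustrative only.
reactions = [
    (["A", "B"], ["C"]),
    (["C"], ["D"]),
    (["D"], ["A"]),
    (["B", "E"], ["F"]),
]

# Directed molecule graph: one edge from every reactant to every product.
succ = defaultdict(set)
degree = defaultdict(int)
for reactants, products in reactions:
    for r in reactants:
        for p in products:
            if p not in succ[r]:
                succ[r].add(p)
                degree[r] += 1
                degree[p] += 1

nodes = sorted(degree)

def reachable(start):
    """All molecules that can be reached from `start` by following reactions."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in succ.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

reach = {n: reachable(n) for n in nodes}

def component(n):
    """Strongly connected component of n: nodes mutually reachable with n."""
    return frozenset(m for m in reach[n] if n in reach[m])

# Assumed definition of "core": the largest strongly connected component.
core = max((component(n) for n in nodes), key=len)
# Assumed definition of "hub": highest total degree, ties broken by name.
hubs = sorted(nodes, key=lambda n: (-degree[n], n))[:2]
```

Under this definition, a largest component of size one would correspond to a graph without a core. The all-pairs reachability used here is only practical for small graphs; a linear-time algorithm such as Tarjan's would be needed at the scale of a real reaction database.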

Volume 44, issue 7, article e202500011. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12273192/pdf/

Citations: 0

Neural Network Models for Prediction of Biological Activity using Molecular Dynamics Data: A Case of Photoswitchable Peptides.

IF 2.8 | Zone 4 (Medicine) | Q3 | CHEMISTRY, MEDICINAL

Molecular Informatics

Pub Date : 2025-07-01 DOI: 10.1002/minf.70001

Anton Cherednichenko, Sergii Afonin, Oleg Babii, Taras Voitsitskyi, Roman Stratiichuk, Ihor Koleiev, Volodymyr Vozniak, Nazar Shevchuk, Zakhar Ostrovsky, Semen Yesylevskyy, Alan Nafiiev, Serhii Starosyla, Anne S Ulrich, Aigars Jirgensons, Igor V Komarov

Prediction of the biological activities of chemical compounds by machine learning techniques in general, and neural networks (NNs) in particular, is usually based on the analysis of their binding to the target of interest. If such affinity data are not available, ligand-based approaches can be used, in which NN models are trained to assess the similarity of compounds to those with known biological activity. This approach only works well if the similarity between the training set and the evaluated molecules is sufficiently high. In the case of large and conformationally flexible organic compounds, the activity depends not only on chemical identity but also on the dynamics of molecular motions, which poses significant challenges to existing approaches based on static 2D and 3D molecular descriptors. A prominent example of compounds that are especially challenging for existing NN activity prediction techniques is photoswitchable macrocyclic peptides containing a diarylethene (DAE) "photoswitch". These molecules exist in two isomeric forms with remarkably different biological activities, which are interconvertible by light of different wavelengths. In this case, activity prediction models have to distinguish not only between different peptides but also between the photoisomers of the same peptide. In this work, we demonstrate that features extracted from classical molecular dynamics (MD) trajectories are superior to conventional 2D or 3D descriptor-based features when used in NN activity prediction models of DAE-containing photoswitchable peptides. Using MD-derived features, we successfully created two NN models that predict the activities of photoswitchable peptidomimetics, analogs of the natural peptidic antibiotic gramicidin S. The first model precisely predicts the cytotoxic activity of similar peptide analogs. The second model reliably predicts the differences in the biological activities of DAE photoisomers of the same peptide, even if the type of activity differs from that in the training dataset. Our results demonstrate that accounting for MD-derived dynamic features allows ligand-based NN activity prediction models to be generalized to large and conformationally flexible molecules, which were previously considered intractable for this class of models.

Volume 44, issue 7, article e70001. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12257427/pdf/

Citations: 0

Rapid Assessment of Virtually Synthesizable Chemical Structures via Support Vector Machine Models.

IF 2.8 | Zone 4 (Medicine) | Q3 | CHEMISTRY, MEDICINAL

Molecular Informatics

Pub Date : 2025-07-01 DOI: 10.1002/minf.70000

Yuto Iwasaki, Tomoyuki Miyao

Support vector machine (SVM) and support vector regression (SVR) are widely used for building quantitative structure-activity relationship models for small- and medium-sized datasets. Although SVM and SVR models can efficiently predict compound activity, evaluating billions of molecules, as is sometimes required when screening virtual molecules derived through virtual synthesis, remains challenging. Herein, we present an SVM-/SVR-based method for screening virtually synthesizable molecules based on their reactants. The proposed method employs a combination of reactant-wise kernel functions for fast evaluation without sacrificing prediction accuracy. Tested on 120 small-molecule activity datasets against 10 macromolecular targets, the proposed SVR models with data augmentation performed on par with standard SVR models using the Tanimoto kernel. As a demonstration, an exhaustive set of 6.4 × 10¹² reactant combinations was evaluated by an SVR model within 8 days on a single desktop computer, enabling large-scale screening without sampling.
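Why reactant-wise kernels can make exhaustive screening fast is easiest to see in a sketch. The abstract does not state how the kernels are combined, so this example assumes an additive combination of Tanimoto kernels, one per reactant position; with a product of kernels the same shortcut would not apply. The support vectors, coefficients, and fingerprints are made up for illustration.

```python
from itertools import product

def tanimoto(x, y):
    """Tanimoto similarity between two fingerprints stored as sets of on-bits."""
    union = len(x | y)
    return len(x & y) / union if union else 0.0

# Hypothetical trained model: support vectors are (reactant_A_fp, reactant_B_fp)
# pairs with dual coefficients; all values are made up for illustration.
support = [({1, 2, 3}, {7, 8}), ({2, 3, 4}, {8, 9}), ({5, 6}, {7, 9})]
alpha = [0.8, -0.5, 0.3]
bias = 0.1

def predict_direct(a, b):
    """Naive evaluation: one full kernel sum per reactant pair."""
    return bias + sum(
        w * (tanimoto(a, sa) + tanimoto(b, sb))
        for w, (sa, sb) in zip(alpha, support)
    )

pool_a = [{1, 2}, {2, 3, 4}, {5, 6, 1}]
pool_b = [{7, 8}, {9}]

# Because the assumed kernel is a sum over reactant positions, the prediction
# splits into one term per reactant, so each reactant is scored only once.
score_a = [sum(w * tanimoto(a, sa) for w, (sa, _) in zip(alpha, support))
           for a in pool_a]
score_b = [sum(w * tanimoto(b, sb) for w, (_, sb) in zip(alpha, support))
           for b in pool_b]

# Every combination then costs a single addition instead of a kernel sum.
fast = {(i, j): bias + score_a[i] + score_b[j]
        for i, j in product(range(len(pool_a)), range(len(pool_b)))}
```

With pools of size |A| and |B|, the kernel work drops from |A| × |B| evaluations per support vector to |A| + |B|, which is the kind of saving that makes trillions of combinations tractable.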

Citations: 0

In Silico Identification of Novel and Potent Inhibitors Against Mutant BRAF (V600E), MD Simulations, Free Energy Calculations, and Experimental Determination of Binding Affinity.

IF 2.8 | Zone 4 (Medicine) | Q3 | CHEMISTRY, MEDICINAL

Molecular Informatics

Pub Date : 2025-06-01 DOI: 10.1002/minf.202400372

Vikas Yadav, Mohammad Kashif, Zenab Kamali, Samudrala Gourinath, Naidu Subbarao

BRAF is a proto-oncogene that functions as a key signal transducer in the MAPK-ERK pathway, which regulates cell growth, division, and survival. Mutations in BRAF, particularly the V600E substitution in its kinase domain, are major drivers in melanoma and several other metastatic cancers, including breast, colorectal, NSCLC, and gastrointestinal cancers. In this study, novel inhibitors targeting the BRAF(V600E) mutant are identified using a structure-based drug design approach. Four chemical libraries (ChemDiv Kinase, ChemDiv Anticancer, NCI, and ChEMBL Kinase SARfari) are screened. Compounds from the ChemDiv Anticancer database show favorable Glide scores, comparable to that of the FDA-approved BRAF inhibitor Vemurafenib. The compounds P184-1419 and P184-1479 score -12.688 and -12.012 kcal/mol, respectively, versus -14.288 kcal/mol for Vemurafenib. Top hits are further validated using GOLD docking, X-score ranking, and interaction profiling via LigPlot. Molecular dynamics simulations, principal component analysis, and free energy calculations confirm the stability of the protein-ligand complexes. Biolayer interferometry assays reveal that P184-1419 exhibits stronger binding affinity (KD = 151 μM) than Vemurafenib (KD = 437 μM). These findings suggest that P184-1419 is a promising lead compound against BRAF(V600E), offering potential for the future development of more effective cancer therapies.
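The two reported dissociation constants can be put on the same energy scale as the docking scores with the standard relation ΔG = RT ln(KD / 1 M). The sketch below is only a worked conversion; the temperature of 298.15 K is an assumption, since the abstract does not give the assay temperature.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant,
    using dG = RT ln(KD / 1 M). The default temperature is an assumption."""
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)

dg_p184 = binding_free_energy(151e-6)  # KD reported for P184-1419
dg_vem = binding_free_energy(437e-6)   # KD reported for Vemurafenib
ddg = dg_p184 - dg_vem                 # negative: P184-1419 binds more tightly
```

The difference between the two compounds comes out at well under 1 kcal/mol, which is small compared with the spread of the Glide scores quoted above, so the measured and computed rankings need not agree.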

Volume 44, issue 5-6, article e2400372. Publication type: Journal Article.

Citations: 0

Drug Search and Design Considering Cell Specificity of Chemically Induced Gene Expression Profiles for Disease-Associated Tissues.

IF 2.8 | Zone 4 (Medicine) | Q3 | CHEMISTRY, MEDICINAL

Molecular Informatics

Pub Date : 2025-06-01 DOI: 10.1002/minf.2444

Chikashige Yamanaka, Michio Iwata, Kazuma Kaitoh, Yoshihiro Yamanishi

The use of omics data, including gene expression profiles, has recently gained increasing attention in drug discovery. Omics-based drug searches and designs are often based on the correlations between chemically induced and disease-induced gene expression profiles; however, cell specificity has not been considered. In this study, we designed a novel computational method for drug search and design using cell-specific correlations between drugs and diseases. A data completion technique allowed the characterization of cell-specific gene expression patterns in diseased cells. The proposed method was applied to search for drug candidates and generate new chemical structures for gastric cancer and atopic dermatitis. The drug search results demonstrated that compounds with diverse chemical structures were detected and were associated with the target diseases at the molecular pathway level. The drug design results also demonstrated that the newly generated compounds were reasonable in terms of the reproducibility of registered drugs. The proposed method is expected to be useful for omics-based drug discovery.
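The correlation between chemically induced and disease-induced expression profiles can be sketched as follows. The abstract does not give the scoring function, so this example assumes Pearson correlation over a shared gene list and assumes that candidates are prioritized by the most negative correlation, that is, by how strongly they reverse the disease signature. All profiles and compound names are hypothetical; in a cell-specific setting both profiles would be taken from the same cell type.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 genes)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("profiles must not be constant")
    return cov / (sx * sy)

# Hypothetical log fold changes over the same four genes, same cell type.
disease = [2.0, -1.0, 0.5, -2.0]
drugs = {
    "compound_1": [-2.0, 1.0, -0.5, 2.0],  # mirrors the disease signature
    "compound_2": [1.0, -0.5, 0.0, -1.5],  # resembles the disease signature
}

# Assumed ranking rule: most negative correlation first.
ranked = sorted(drugs, key=lambda name: pearson(drugs[name], disease))
```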

Volume 44, issue 5-6, article e2444. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12188700/pdf/

Citations: 0

Enhancing the Reliability of Integrated Consensus Strategies to Boost Docking-Based Screening Campaigns Using Publicly Available Docking Programs.

IF 2.8 | Zone 4 (Medicine) | Q3 | CHEMISTRY, MEDICINAL

Molecular Informatics

Pub Date : 2025-06-01 DOI: 10.1002/minf.2445

Valeria Scardino, M Justina Galarce, M Emilia Mignone, Claudio N Cavasotto

Docking-based virtual screening is today an established, critical component of the drug discovery pipeline. Because the performance of molecular docking has been found to depend on both the protein target and the program, consensus docking has emerged as a valuable approach to enhance the performance of high-throughput docking (HTD). We present and evaluate an integrated pose and ranking consensus approach that combines the advantages of pose consensus and exponential consensus ranking (ECR), using only publicly available docking programs (rDock, DOCK 6, AutoDock 4, PLANTS, and Vina). In a thorough analysis to determine the optimal combination of matching-pose and ECR thresholds, using a benchmarking set of 50 protein targets from diverse families and different property-matched ligand/decoy libraries, this enhanced pose/ranking consensus approach performed notably better than both the individual docking programs and ECR alone. The approach was also evaluated in HTD campaigns using larger libraries (∼1.1 million molecules) on six targets, obtaining an average improvement over ECR of about 40%. This pose/ranking consensus methodology can therefore be used with confidence in prospective HTD campaigns with freely available docking programs.
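The ranking half of the approach can be sketched with the commonly stated form of exponential consensus ranking, in which each program contributes exp(-rank/σ)/σ to a molecule's score. The value of σ, the program labels, and the rankings below are hypothetical, and the pose-consensus filter described in the abstract is not modeled here.

```python
import math

def ecr_scores(rankings, sigma):
    """Exponential consensus ranking: sum over programs of exp(-rank/sigma)/sigma.
    `rankings` maps a program label to molecule ids ordered best-first."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    scores = {}
    for ordered in rankings.values():
        for rank, mol in enumerate(ordered, start=1):
            scores[mol] = scores.get(mol, 0.0) + math.exp(-rank / sigma) / sigma
    return scores

# Hypothetical rankings from three docking programs.
rankings = {
    "program_1": ["m1", "m2", "m3", "m4"],
    "program_2": ["m2", "m1", "m4", "m3"],
    "program_3": ["m2", "m3", "m1", "m4"],
}
scores = ecr_scores(rankings, sigma=2.0)
consensus = sorted(scores, key=scores.get, reverse=True)
```

Because only ranks enter the score, programs with different scoring scales and units can be combined without normalization; σ controls how quickly the contribution of lower-ranked molecules decays.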

Volume 44, issue 5-6, article e2445. Publication type: Journal Article.

Citations: 0

Spherical GTM: A New Proposition for Visualization of Chemical Data.

IF 2.8 | Zone 4 (Medicine) | Q3 | CHEMISTRY, MEDICINAL

Molecular Informatics

Pub Date : 2025-06-01 DOI: 10.1002/minf.202500045

Farah Asgarkhanova, Marcou Gilles, Mikhail Volkov, Murielle Muzard, Richard Plantier-Royon, Caroline Rémond, Dragos Horvath, Alexandre Varnek

The Spherical Generative Topographic Mapping (SGTM) method represents an intuitive approach to visualizing chemical data. Unlike the original Generative Topographic Mapping algorithm, which uses a bounded flat Euclidean space as its manifold, our proposed modification introduces a spherical manifold to address known nonflat topology issues. In this study, we describe the mathematical formalism of this new approach and showcase its ability to visualize 2D electron density patterns of water and benzene, as well as the CosMoPoly chemical library, an enumeration of synthetically accessible molecules. Comparison of the outcomes with established references demonstrates that SGTM is a novel 3D data visualization method that offers improved accuracy in the depiction of chemical structures.


Citations: 0

Development of Machine Learning-Based Models for Mutagenicity Predictions with Applications to Non-Sugar Sweeteners.

IF 2.8 Zone 4 (Medicine) Q3 CHEMISTRY, MEDICINAL

Molecular Informatics

Pub Date : 2025-06-01 DOI: 10.1002/minf.202400357

Shilpayan Ghosh, Vinay Kumar, Kunal Roy

Artificial sweeteners, often known as non-sugar sweeteners (NSSs), have been used as food additives since World War II. However, there is concern regarding the mutagenic potential of NSSs. Every new chemical registration in the food and pharmaceutical industries requires an evaluation of mutagenic potential, which is essential for food safety. Most studies focus solely on determining the mutagenicity of NSSs through in vivo trials, which can be costly and time-consuming. To avoid the complexities of experimentation, a new approach methodology is explored: developing machine learning (ML) models for mutagenicity prediction and selecting the best models through stringent cross-validation analysis. Two random splits (50/50) of a dataset of 6881 organic compounds are used for model development. Consensus predictions for the mutagenic potential of an external set of 332 NSSs are obtained by voting among six selected models (the three best ML models, based on cross-validation, from each data-splitting strategy), with the applicability domain assessed using two different approaches. In addition, to check the reliability of the predictions, the model-derived consensus predictions are compared with the predictions generated by the k-nearest neighbor method in the Virtual models for property Evaluation of chemicals within a Global Architecture (VEGA) platform and by the consensus method in the Toxicity Estimation Software Tool (TEST) platform. Finally, based on this analysis, six compounds could be prioritized as mutagenic NSSs. The developed models are available from https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home/mutagenicity-predictor.
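The voting step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the voting rule, so simple majority voting is assumed, ties are left unresolved, and models for which a compound falls outside the applicability domain are excluded from the vote. The function name `consensus_vote` is hypothetical.

```python
from collections import Counter

def consensus_vote(predictions, in_domain):
    """Majority vote over binary model predictions (1 = mutagenic, 0 = non-mutagenic).

    predictions: one 0/1 label per model.
    in_domain:   one bool per model, True if the compound lies inside that
                 model's applicability domain (assumed to be computed elsewhere).
    Returns 1, 0, or None when no model is applicable or the vote is tied.
    """
    votes = [p for p, ok in zip(predictions, in_domain) if ok]
    if not votes:
        return None                    # outside every model's applicability domain
    counts = Counter(votes)
    if counts[1] == counts[0]:
        return None                    # tie: assumption, left unresolved here
    return 1 if counts[1] > counts[0] else 0
```

With six models, a call such as `consensus_vote([1, 1, 0, 1, 0, 1], [True] * 6)` returns 1, because four of the six applicable models predict mutagenicity.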



Citations: 0

Deep Modeling of Gain-of-Function Mutations on Androgen Receptor.

IF 2.8 Zone 4 (Medicine) Q3 CHEMISTRY, MEDICINAL

Molecular Informatics

Pub Date : 2025-04-01 DOI: 10.1002/minf.202500018

Jiaying You, Jane Foo, Nada Lallous, Artem Cherkasov

The efficiency of androgen receptor (AR) pathway inhibitors for prostate cancer (PCa) is in decline due to resistance mechanisms, including the occurrence of gain-of-function mutations in the human AR. Hence, understanding and predicting such mutations is crucial for developing effective PCa treatment strategies. Leveraging accumulated data on clinically relevant AR mutants together with recent advances in deep modeling techniques, this study aims to unveil and quantify critical AR mutation-drug relationships. By incorporating molecular descriptors for drugs and mutated gene sequences, this work represents these features as single vectors and demonstrates their effectiveness in modeling AR mutant responses to conventional antiandrogens. The developed approach achieves above 80% accuracy in predicting the gain-of-function behavior of AR mutants and can therefore potentially uncover unknown agonist/antagonist relationships among mutant-drug pairs.
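The idea of representing a mutant-drug pair as a single vector can be illustrated with a small sketch. Everything here is an assumption for illustration: the abstract does not describe the encoding, so a flattened one-hot encoding of the mutated sequence is concatenated with a list of numeric drug descriptors, and the toy sequence, descriptor values, and function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def apply_mutation(sequence, position, new_residue):
    """Return the sequence with a point substitution at a 1-based position."""
    index = position - 1
    return sequence[:index] + new_residue + sequence[index + 1:]

def encode_sequence(sequence):
    """Flattened one-hot encoding: 20 values per residue."""
    vector = []
    for residue in sequence:
        one_hot = [0.0] * len(AMINO_ACIDS)
        one_hot[AMINO_ACIDS.index(residue)] = 1.0
        vector.extend(one_hot)
    return vector

def build_feature_vector(drug_descriptors, mutant_sequence):
    """Concatenate drug descriptors and the encoded mutant into one vector."""
    return list(drug_descriptors) + encode_sequence(mutant_sequence)

# Toy example: a 4-residue sequence and 3 made-up descriptor values.
mutant = apply_mutation("MEVQ", 2, "K")
features = build_feature_vector([0.12, 3.4, 1.0], mutant)
```

A vector built this way could be passed to any standard classifier; which model the study actually uses is not stated in the abstract.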


Citations: 0