
Molecular Informatics: Latest Articles

Neural Network Models for Prediction of Biological Activity using Molecular Dynamics Data: A Case of Photoswitchable Peptides.
IF 2.8; Zone 4, Medicine; Q3, CHEMISTRY, MEDICINAL; Pub Date: 2025-07-01; DOI: 10.1002/minf.70001
Anton Cherednichenko, Sergii Afonin, Oleg Babii, Taras Voitsitskyi, Roman Stratiichuk, Ihor Koleiev, Volodymyr Vozniak, Nazar Shevchuk, Zakhar Ostrovsky, Semen Yesylevskyy, Alan Nafiiev, Serhii Starosyla, Anne S Ulrich, Aigars Jirgensons, Igor V Komarov

Prediction of the biological activity of chemical compounds by machine learning techniques in general, and by neural networks (NNs) in particular, is usually based on analysis of their binding to the target of interest. If such affinity data are not available, ligand-based approaches can be used, in which NN models are trained to assess the similarity of compounds to those with known biological activity. This approach works well only if the similarity between the training set and the evaluated molecules is sufficiently high. For large and conformationally flexible organic compounds, activity depends not only on chemical identity but also on the dynamics of molecular motion, which poses significant challenges to existing approaches based on static 2D and 3D molecular descriptors. A prominent example of compounds that are especially challenging for existing NN activity prediction techniques is photoswitchable macrocyclic peptides containing a diarylethene (DAE) "photoswitch". These molecules exist in two isomeric forms with remarkably different biological activities, which are interconvertible by light of different wavelengths. In this case, activity prediction models have to distinguish not only between different peptides but also between the photoisomers of the same peptide. In this work, we demonstrate that features extracted from classical molecular dynamics (MD) trajectories are superior to conventional 2D or 3D descriptor-based features when used in NN activity prediction models of DAE-containing photoswitchable peptides. Using MD-derived features, we created two NN models that predict the activities of photoswitchable peptidomimetics, analogs of the natural peptide antibiotic gramicidin S. The first model precisely predicts the cytotoxic activity of similar peptide analogs. The second model reliably predicts the differences in biological activity between the DAE photoisomers of the same peptide, even if the type of activity differs from that in the training dataset. Our results demonstrate that accounting for MD-derived dynamic features allows ligand-based NN activity prediction models to be generalized to large and conformationally flexible molecules, which were previously considered intractable for this class of models.
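
As a rough illustration of what "MD-derived features" could look like, the sketch below reduces a trajectory to summary statistics of one geometric observable, the radius of gyration. The abstract does not say which features the authors extracted, so the choice of observable, the unweighted (mass-free) formula, and the names `radius_of_gyration` and `trajectory_features` are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, pstdev

def radius_of_gyration(frame):
    """Unweighted radius of gyration of one frame (a list of (x, y, z) tuples).

    Assumption: atomic masses are ignored to keep the sketch short.
    """
    n = len(frame)
    cx = sum(p[0] for p in frame) / n
    cy = sum(p[1] for p in frame) / n
    cz = sum(p[2] for p in frame) / n
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in frame)
    return sqrt(squared / n)

def trajectory_features(trajectory):
    """Summarize a trajectory (a list of frames) as a small feature dictionary."""
    rg = [radius_of_gyration(frame) for frame in trajectory]
    return {"rg_mean": mean(rg), "rg_std": pstdev(rg)}

# Hypothetical two-atom, two-frame toy trajectory.
toy_trajectory = [
    [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(-2.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]
features = trajectory_features(toy_trajectory)
```

A feature vector of this kind can differ between two photoisomers of one peptide whenever their conformational ensembles differ.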

Molecular Informatics, vol. 44, no. 7, e70001. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12257427/pdf/
Citations: 0
Rapid Assessment of Virtually Synthesizable Chemical Structures via Support Vector Machine Models.
IF 2.8; Zone 4, Medicine; Q3, CHEMISTRY, MEDICINAL; Pub Date: 2025-07-01; DOI: 10.1002/minf.70000
Yuto Iwasaki, Tomoyuki Miyao

Support vector machine (SVM) and support vector regression (SVR) are widely used for building quantitative structure-activity relationship models for small- and medium-sized datasets. Although SVM and SVR models can efficiently predict compound activity, evaluating billions of molecules, as is sometimes required when screening virtual molecules derived through virtual synthesis, remains challenging. Herein, we present an SVM-/SVR-based method for screening virtually synthesizable molecules based on their reactants. The proposed method employs a combination of reactant-wise kernel functions for fast evaluation without sacrificing prediction accuracy. Tested on 120 small-molecule activity datasets against 10 macromolecular targets, the proposed SVR models with data augmentation performed on par with standard SVR models using the Tanimoto kernel. As a demonstration, an exhaustive set of 6.4 × 10^12 reactant combinations was evaluated by an SVR model within 8 days on a single desktop computer, enabling large-scale screening without sampling.
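
The speed-up from a reactant-wise kernel can be sketched as follows. If the kernel between two products is a sum of kernels between their corresponding reactants, the SVR decision function splits into one term per reactant position, and each term can be precomputed once per candidate reactant. The abstract does not specify how the reactant-wise kernels are combined, so the plain sum used here, the toy fingerprints, and the names `reactant_terms` and `direct_prediction` are assumptions for illustration only.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Convention (assumed): two empty fingerprints count as identical.
    return len(a & b) / union if union else 1.0

def reactant_terms(candidates, support_fps, dual_coefs):
    """For each candidate reactant, precompute sum_i alpha_i * k(candidate, sv_i)."""
    return [
        sum(alpha * tanimoto(fp, sv) for alpha, sv in zip(dual_coefs, support_fps))
        for fp in candidates
    ]

def direct_prediction(r1, r2, support_r1, support_r2, dual_coefs, bias):
    """Reference: evaluate the summed kernel for one product without precomputation."""
    return bias + sum(
        alpha * (tanimoto(r1, s1) + tanimoto(r2, s2))
        for alpha, s1, s2 in zip(dual_coefs, support_r1, support_r2)
    )

# Hypothetical toy model: two support products, each built from two reactants.
support_r1 = [{0, 1}, {1, 2}]
support_r2 = [{0, 3}, {2, 3}]
dual_coefs = [0.5, -0.25]
bias = 0.1

candidates_r1 = [{0, 1}, {2, 3}]
candidates_r2 = [{0, 3}, {0, 1, 2, 3}]

terms_r1 = reactant_terms(candidates_r1, support_r1, dual_coefs)
terms_r2 = reactant_terms(candidates_r2, support_r2, dual_coefs)

# Every reactant combination is then scored with additions only.
grid = [[t1 + t2 + bias for t2 in terms_r2] for t1 in terms_r1]
```

With N1 and N2 candidate reactants, this sketch evaluates the kernel N1 + N2 times per support vector instead of N1 × N2 times, which is the kind of saving that makes exhaustive enumeration plausible.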

Molecular Informatics, vol. 44, no. 7, e202500039. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12278806/pdf/
Citations: 0
In Silico Identification of Novel and Potent Inhibitors Against Mutant BRAF (V600E), MD Simulations, Free Energy Calculations, and Experimental Determination of Binding Affinity.
IF 2.8; Zone 4, Medicine; Q3, CHEMISTRY, MEDICINAL; Pub Date: 2025-06-01; DOI: 10.1002/minf.202400372
Vikas Yadav, Mohammad Kashif, Zenab Kamali, Samudrala Gourinath, Naidu Subbarao

BRAF is a proto-oncogene that functions as a key signal transducer in the MAPK-ERK pathway, which regulates cell growth, division, and survival. Mutations in BRAF, particularly the V600E substitution in its kinase domain, are major drivers in melanoma and several other metastatic cancers, including breast, colorectal, NSCLC, and gastrointestinal cancers. In this study, novel inhibitors targeting the BRAF(V600E) mutant are identified using a structure-based drug design approach. Four chemical libraries (ChemDiv Kinase, ChemDiv Anticancer, NCI, and ChEMBL Kinase SARfari) are screened. Compounds from the ChemDiv Anticancer database show better Glide scores, comparable to that of the FDA-approved BRAF inhibitor Vemurafenib. The compounds P184-1419 and P184-1479 score -12.688 and -12.012 kcal/mol, respectively, versus -14.288 kcal/mol for Vemurafenib. Top hits are further validated using GOLD docking, X-score ranking, and interaction profiling via LigPlot. Molecular dynamics simulations, principal component analysis, and free energy calculations confirm the stability of the protein-ligand complexes. Biolayer interferometry assays reveal that P184-1419 exhibits stronger binding affinity (KD = 151 μM) than Vemurafenib (KD = 437 μM). These findings suggest that P184-1419 is a promising lead compound against BRAF(V600E), offering potential for future development of more effective cancer therapies.

Molecular Informatics, vol. 44, no. 5-6, e2400372. Journal Article.
Citations: 0
Drug Search and Design Considering Cell Specificity of Chemically Induced Gene Expression Profiles for Disease-Associated Tissues.
IF 2.8; Zone 4, Medicine; Q3, CHEMISTRY, MEDICINAL; Pub Date: 2025-06-01; DOI: 10.1002/minf.2444
Chikashige Yamanaka, Michio Iwata, Kazuma Kaitoh, Yoshihiro Yamanishi

The use of omics data, including gene expression profiles, has recently gained increasing attention in drug discovery. Omics-based drug search and design are often based on correlations between chemically induced and disease-induced gene expression profiles; however, cell specificity has not been considered. In this study, we designed a novel computational method for drug search and design using cell-specific correlations between drugs and diseases. A data completion technique allowed the characterization of cell-specific gene expression patterns in diseased cells. The proposed method was applied to search for drug candidates and to generate new chemical structures for gastric cancer and atopic dermatitis. The drug search results demonstrated that compounds with diverse chemical structures were detected and were associated with the target diseases at the molecular pathway level. The drug design results also demonstrated that the newly generated compounds were reasonable in terms of the reproducibility of registered drugs. The proposed method is expected to be useful for omics-based drug discovery.
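
The correlation idea can be sketched in a few lines: a compound whose induced expression profile is anticorrelated with a disease signature measured in the same cell type is ranked as a stronger candidate. The abstract does not name the correlation measure or the ranking rule, so the use of Pearson correlation, the ascending sort, and all gene values and compound names below are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

# Hypothetical disease signature and drug-induced profiles for one cell type,
# each given as log fold-changes over the same four genes.
disease_signature = [2.0, 1.0, -1.0, -2.0]
drug_profiles = {
    "compound_A": [-2.0, -1.0, 1.0, 2.0],   # reverses the signature
    "compound_B": [2.0, 1.0, -1.0, -2.0],   # mimics the signature
    "compound_C": [1.0, -1.0, 1.0, -1.0],   # weakly related
}

# Most negative correlation first: these compounds best reverse the disease profile.
ranking = sorted(
    drug_profiles,
    key=lambda name: pearson(drug_profiles[name], disease_signature),
)
```

Making the calculation cell-specific amounts to repeating it with profiles and signatures taken from the same cell type, which is where a data completion step would be needed when some drug and cell combinations were never measured.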

Molecular Informatics, vol. 44, no. 5-6, e2444. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12188700/pdf/
Citations: 0
Enhancing the Reliability of Integrated Consensus Strategies to Boost Docking-Based Screening Campaigns Using Publicly Available Docking Programs.
IF 2.8; Zone 4, Medicine; Q3, CHEMISTRY, MEDICINAL; Pub Date: 2025-06-01; DOI: 10.1002/minf.2445
Valeria Scardino, M Justina Galarce, M Emilia Mignone, Claudio N Cavasotto

Docking-based virtual screening is today an established, critical component of the drug discovery pipeline. Because the performance of molecular docking has been found to depend on the protein target and the program, consensus docking has proved a valuable approach to enhance the performance of high-throughput docking (HTD). We present and evaluate an integrated pose and ranking consensus approach that combines the advantages of pose consensus and the exponential consensus ranking (ECR) approach, using only publicly available docking programs (rDock, DOCK 6, AutoDock 4, PLANTS, and Vina). Based on a thorough analysis performed to assess the optimal combination of matching poses and ECR thresholds, using a benchmarking set of 50 protein targets from diverse families and different property-matched ligand/decoy libraries, this enhanced pose/ranking consensus approach displayed notably better performance than the individual docking programs and ECR alone. The approach was also evaluated in HTD campaigns using larger libraries (∼1.1 million molecules) on six targets, obtaining an average improvement over ECR of about 40%. This pose/ranking consensus methodology can therefore be used with confidence in prospective HTD campaigns using freely available docking programs.
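
A minimal sketch of how a pose filter and an exponential rank consensus might be combined is given below. It assumes the form in which ECR is commonly written, score(i) = (1/σ) Σ_j exp(−r_ij/σ), where r_ij is the rank of molecule i in program j; the abstract itself gives no formula. The pose filter here simply takes a precomputed count of programs whose top poses agree, and the names `ecr_scores`, `pose_ecr_consensus`, `matching_pose_counts`, and `min_matching` are hypothetical.

```python
from math import exp

def ecr_scores(rankings, sigma):
    """Sum exp(-rank / sigma) / sigma over programs for every molecule.

    rankings: dict mapping program name -> list of molecule ids, best first.
    """
    scores = {}
    for ranked in rankings.values():
        for rank, mol in enumerate(ranked, start=1):
            scores[mol] = scores.get(mol, 0.0) + exp(-rank / sigma) / sigma
    return scores

def pose_ecr_consensus(rankings, matching_pose_counts, sigma, min_matching):
    """Keep molecules with enough agreeing poses, then order them by ECR score."""
    scores = ecr_scores(rankings, sigma)
    kept = [m for m in scores if matching_pose_counts.get(m, 0) >= min_matching]
    return sorted(kept, key=lambda m: scores[m], reverse=True)

# Hypothetical rankings from three docking programs.
rankings = {
    "program_1": ["m1", "m2", "m3"],
    "program_2": ["m2", "m1", "m3"],
    "program_3": ["m1", "m3", "m2"],
}
# Hypothetical number of programs whose top poses match for each molecule.
matching_pose_counts = {"m1": 3, "m2": 1, "m3": 2}

shortlist = pose_ecr_consensus(rankings, matching_pose_counts, sigma=2.0, min_matching=2)
```

In this toy case `m2` has the second-highest rank consensus but is dropped because only one program supports its pose, which is the kind of trade-off the two thresholds control.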

Molecular Informatics, vol. 44, no. 5-6, e2445. Journal Article.
Citations: 0
Spherical GTM: A New Proposition for Visualization of Chemical Data.
IF 2.8; Zone 4, Medicine; Q3, CHEMISTRY, MEDICINAL; Pub Date: 2025-06-01; DOI: 10.1002/minf.202500045
Farah Asgarkhanova, Marcou Gilles, Mikhail Volkov, Murielle Muzard, Richard Plantier-Royon, Caroline Rémond, Dragos Horvath, Alexandre Varnek

The Spherical Generative Topographic Mapping (SGTM) method represents an intuitive approach to visualizing chemical data. Unlike the original Generative Topographic Mapping algorithm, which uses a bounded flat Euclidean space as its manifold, our proposed modification introduces a spherical manifold to address known nonflat topology issues. In this study, we describe the mathematical formalism of this new approach and showcase its ability to visualize 2D electron density patterns of water and benzene and the CosMoPoly chemical library, an enumeration of synthetically accessible molecules. By comparing the outcomes with established references, it is demonstrated that SGTM emerges as a novel 3D data visualization method, offering improved accuracy in the depiction of chemical structures.

Molecular Informatics, vol. 44, no. 5-6, e2500045. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12186103/pdf/
Citations: 0
Development of Machine Learning-Based Models for Mutagenicity Predictions with Applications to Non-Sugar Sweeteners.
IF 2.8; Zone 4, Medicine; Q3, CHEMISTRY, MEDICINAL; Pub Date: 2025-06-01; DOI: 10.1002/minf.202400357
Shilpayan Ghosh, Vinay Kumar, Kunal Roy

Artificial sweeteners, often known as non-sugar sweeteners (NSSs), have been used as food additives since World War II. However, there is concern regarding the mutagenic potential of NSSs. Every new chemical registration in the food and pharmaceutical industries requires an evaluation of mutagenic potential, which is essential for food safety. Most studies focus solely on determining the mutagenicity of NSSs through in vivo trials, which can be burdensome in terms of the time and cost required for experimental evaluation. To avoid the complexities of experimentation, a new approach methodology is explored: machine learning (ML) models are developed for mutagenicity prediction, and the best models are selected through stringent cross-validation analysis. Two random 50/50 splits of a dataset of 6881 organic compounds are used for model development. Consensus predictions of the mutagenic potential of an external set of 332 NSSs are obtained by voting among six selected models (the three best ML models, based on cross-validation, from each data splitting strategy), with the applicability domain assessed by two different approaches. In addition, to check the reliability of the predictions, the model-derived consensus predictions are compared with those generated by the k-nearest neighbor method in the Virtual models for property Evaluation of chemicals within a Global Architecture (VEGA) platform and by the consensus method in the Toxicity Estimation Software Tool (TEST) platform. Finally, based on the analysis, six compounds could be prioritized as mutagenic NSSs in this investigation. The developed models have been made available from https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home/mutagenicity-predictor.
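
The voting step can be sketched as a majority vote over the models whose applicability domain covers the query compound. The abstract does not describe how ties or out-of-domain predictions are handled, so returning `None` in those cases, the binary 0/1 labels, and the name `consensus_vote` are assumptions for illustration.

```python
def consensus_vote(predictions, in_domain):
    """Majority vote over in-domain model predictions.

    predictions: list of 0 (non-mutagenic) or 1 (mutagenic), one per model.
    in_domain:   list of booleans, True if the compound lies inside that
                 model's applicability domain.
    Returns 1, 0, or None when no model applies or the vote is tied (assumed rule).
    """
    votes = [p for p, ok in zip(predictions, in_domain) if ok]
    if not votes:
        return None
    positives = sum(votes)
    if 2 * positives == len(votes):
        return None
    return 1 if 2 * positives > len(votes) else 0

# Hypothetical outputs of six models for one compound.
label = consensus_vote([1, 1, 1, 0, 0, 1], [True] * 6)
```

Restricting the vote to in-domain models means a compound can receive no call at all, which is preferable to a confident label from models extrapolating far from their training data.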

Molecular Informatics, vol. 44, no. 5-6, e2400357. Journal Article.
Citations: 0
Deep Modeling of Gain-of-Function Mutations on Androgen Receptor.
IF 2.8; Zone 4, Medicine; Q3, CHEMISTRY, MEDICINAL; Pub Date: 2025-04-01; DOI: 10.1002/minf.202500018
Jiaying You, Jane Foo, Nada Lallous, Artem Cherkasov

The efficiency of androgen receptor (AR) pathway inhibitors for prostate cancer (PCa) is declining due to resistance mechanisms, including gain-of-function mutations in the human AR. Hence, understanding and predicting such mutations is crucial for developing effective PCa treatment strategies. Leveraging accumulated data on clinically relevant AR mutants together with recent advances in deep modeling techniques, this study aims to unveil and quantify critical AR mutation-drug relationships. By incorporating molecular descriptors for drugs and mutated gene sequences, this work represents these features as single vectors and demonstrates their effectiveness in modeling AR mutant responses to conventional antiandrogens. The developed approach achieves above 80% accuracy in predicting the gain-of-function behavior of AR mutants and can therefore potentially uncover unknown agonist/antagonist relationships among mutant-drug pairs.

Citations: 0
Modeling Carbon Basicity.
IF 2.8 Zone 4 Medicine Q3 CHEMISTRY, MEDICINAL Pub Date: 2025-03-01 DOI: 10.1002/minf.202400296
Robert Fraczkiewicz, Marvin Waldman

This work presents a predictive model of aqueous ionization constants (pKa) of protonatable carbons in certain aromatic rings. The phenomenon of carbon atoms sometimes acting as a stable and reversible base, accepting a proton in aqueous solution, is surprisingly little recognized in medicinal chemistry, although it has been known to general chemists for the past 60+ years. We present the development and results of two predictive models: 1) identifying the most basic carbon in a ring, and 2) calculating the resulting microscopic pKa value. Both models were incorporated into our global (i.e., taking all ionizable groups into account) S+pKa model [1-2].

Citations: 0
Machine Learning in Drug Development for Neurological Diseases: A Review of Blood Brain Barrier Permeability Prediction Models.
IF 2.8 Zone 4 Medicine Q3 CHEMISTRY, MEDICINAL Pub Date: 2025-03-01 DOI: 10.1002/minf.202400325
Aryon Eckleel Nabi, Pedram Pouladvand, Litian Liu, Ning Hua, Cyrus Ayubcha

The blood brain barrier (BBB) is an endothelial-derived structure that restricts the movement of certain molecules between the general somatic circulatory system and the central nervous system (CNS). While the BBB maintains homeostasis by regulating the molecular environment induced by cerebrovascular perfusion, it also presents significant challenges in developing therapeutics intended to act on CNS targets. Many drug development practices rely partly on extensive cell and animal models to predict, to an extent, whether prospective therapeutic molecules can cross the BBB. To reduce costs and improve prediction accuracy, many propose advanced computational modeling of BBB permeability profiles that leverages empirical data. Given the rapid growth of machine learning and deep learning, we review the most recent machine learning approaches to predicting BBB permeability.

Citations: 0