Journal of Computer-Aided Molecular Design: Latest Articles

pKa prediction for small molecules: an overview of experimental, quantum, and machine learning-based approaches
IF 3.1 | Zone 3, Biology | Q3 Biochemistry & Molecular Biology | Pub Date: 2025-11-25 | DOI: 10.1007/s10822-025-00719-9
Juda Baikété, Alhadji Malloum, Jeanet Conradie

The pKa, the negative base-10 logarithm of the acid dissociation constant, is a key parameter that determines the degree of ionization of a molecule in solution. It governs several physicochemical properties, including lipophilicity, solubility, protein binding affinity, and the ability to cross biological membranes. Accurate pKa values are therefore vital for tuning the acidity and basicity of organic compounds. Accurate prediction can help improve drug design, optimize pharmaceutical formulations, analyze the behavior of pollutants in the environment, and guide the development of new materials. Traditionally, pKa determination has relied on experimental techniques. However, the recent emergence of machine learning (ML) has led to significant advances in pKa prediction. In this review, we examine various approaches for pKa prediction, with a focus on recent advances in machine learning. We discuss the performance of these models, drawing on results reported in publications related to the SAMPL Challenges and the Novartis prediction challenges. Because protein pKa prediction methods rest on different theoretical and computational frameworks, they are not covered in this review, which focuses exclusively on small organic molecules. Finally, we highlight current challenges and future directions, including hybrid models that combine quantum mechanics and machine learning, improved benchmark databases, and more universal and interpretable predictive models. We hope that this paper can provide useful guidelines for future research.
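
The link between pKa and ionization state mentioned in this abstract can be made concrete with the Henderson-Hasselbalch relation. The sketch below is not taken from the review; the function name `fraction_ionized` and the `kind` parameter are illustrative assumptions, and the model treats a single ionizable site in ideal dilute solution.

```python
def fraction_ionized(pka: float, ph: float, kind: str = "acid") -> float:
    """Fraction of molecules in the charged form at a given pH.

    kind="acid": HA <-> A- + H+   (charged form is A-)
    kind="base": BH+ <-> B + H+   (charged form is BH+; pka refers to BH+)

    Hypothetical helper for illustration; assumes one ionizable site.
    """
    if kind == "acid":
        # [A-]/[HA] = 10**(pH - pKa), so the charged fraction is r / (1 + r)
        return 1.0 / (1.0 + 10.0 ** (pka - ph))
    if kind == "base":
        # [BH+]/[B] = 10**(pKa - pH)
        return 1.0 / (1.0 + 10.0 ** (ph - pka))
    raise ValueError("kind must be 'acid' or 'base'")


# A carboxylic acid with pKa 4.76 is half ionized when pH equals its pKa.
half = fraction_ionized(4.76, 4.76, kind="acid")
```

At physiological pH 7.4 the same acid is almost fully deprotonated, which is why a pKa error of one unit can shift predicted lipophilicity and membrane permeability noticeably.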

Citations: 0
Artificial intelligence in protein-based detection and inhibition of AMR pathways
IF 3.1 | Zone 3, Biology | Q3 Biochemistry & Molecular Biology | Pub Date: 2025-11-25 | DOI: 10.1007/s10822-025-00710-4
Suchandrima Sadhukhan, Rupsa Bhattacharya, Debasmita Bhattcharya, Sudipta Sahana, Buddhadeb Pradhan, Soumya Pandit, Harjot Singh Gill, Mithul Rajeev, Moupriya Nag, Dibyajit Lahiri

Antimicrobial resistance (AMR) is a global concern that demands high-throughput, precise surveillance strategies. This review provides a comprehensive list of artificial intelligence (AI)-driven frameworks widely employed in the early detection, structural characterization, and design of novel inhibitors that block the resistance pathways critical for AMR. Deep learning algorithms, including DeepGO, DeepGOPlus, DeepGO-SE, PFresGO, DPFunc, and ProtENN, together with the graph-based architectures GraphSite and GrASP, enable precise functional annotation of resistance-associated proteins. AI-guided protein modeling with AlphaFold, RoseTTAFold, ProtGPT-2, ESMFold, and similar tools generates high-resolution 3D conformations, which are then used for molecular docking via tools like AutoDock, DeepDocking, and DeepChem and analyzed with tools like DeepDriveMD, TorchMD, and PRITHVI, which can perform real-time molecular dynamics simulations. Relevant resistance biomarkers can also be identified from mass-spectrometry profiles with the help of DeepNovo, Casanovo, or Prosit. Tools like DeepARG, HMD-ARG, and BacEffluxPred enable identification of unannotated resistance genes from metagenomic samples. Natural Language Processing (NLP) and Large Language Models (LLMs) facilitate identification of resistance determinants via literature mining, enabling regulatory network mapping and rational inhibitor design. Furthermore, AI-mediated de novo inhibitor design is achieved using Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and diffusion- and flow-matching-based frameworks, which serve as potential options for enhancing diagnostic interventions against resistant phenotypes. AI-based protein–protein interaction predictors, including DeepInteract, Pred_PPI, PLIP, DeepAIPs-Pred, DeepAIPs-SFLA, SBSM-Pro, Deep Stacked-AVPs, and pNPs-CapsNet, help in understanding how resistance proteins interact with each other, enable precise identification of AMR-modulating peptides, and support the modeling of novel antibiotics that block these interactions and disrupt resistance pathways.

Citations: 0
Calibrating the gap: a user-friendly aqueous \(\text{p}K_a\) prediction protocol for organic acids, alcohols, and amines
IF 3.1 | Zone 3, Biology | Q3 Biochemistry & Molecular Biology | Pub Date: 2025-11-25 | DOI: 10.1007/s10822-025-00708-y
Amílcar Duque-Prata, Carlos Serpa, Pedro J. S. B. Caridade

This study introduces a simple and computationally efficient protocol for estimating aqueous \(\text{p}K_a\) values for three major classes of functional groups: organic acids, alcohols, and amines. Although direct density functional theory calculations deviated notably from experimental values, class-specific linear calibration significantly improved predictive accuracy. The correlation coefficients increased from 0.67 (uncalibrated) to 0.98 (calibrated), with mean absolute errors of 0.51, 0.69, and 0.37 \(\text{p}K_a\) units for acids, alcohols, and amines, respectively. The observed class-dependent linear trends validate the chemical consistency of the approach, even in the presence of structural diversity. Correlation analysis showed that predictive errors are largely uncorrelated with standard molecular descriptors, indicating that model performance is governed predominantly by the functional group that carries the ionizable proton. By avoiding subclass distinctions and relying solely on functional group identity, the method remains simple and broadly applicable without sacrificing accuracy. Most predictions fall within \(\pm 0.75\) \(\text{p}K_a\) units, supporting the robustness of the protocol. The approach offers a practical framework for systematic estimation of aqueous \(\text{p}K_a\) and is a compelling option for routine prediction in various chemical contexts.
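
The class-specific linear calibration described in this abstract can be sketched as one ordinary least-squares line per functional class, mapping a raw computed pKa onto the experimental scale. This is only an illustration under stated assumptions: the helper names (`fit_line`, `calibrate_by_class`, `predict`) are hypothetical, the numbers in `toy` are made up for the example, and the authors' actual fitting procedure and coefficients are not given in the abstract.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate_by_class(records):
    """Fit one calibration line per functional class.

    records: iterable of (functional_class, computed_pka, experimental_pka).
    Returns {functional_class: (slope, intercept)}.
    """
    grouped = {}
    for cls, computed, experimental in records:
        xs, ys = grouped.setdefault(cls, ([], []))
        xs.append(computed)
        ys.append(experimental)
    return {cls: fit_line(xs, ys) for cls, (xs, ys) in grouped.items()}


def predict(models, cls, computed):
    """Apply the calibration line of the molecule's functional class."""
    slope, intercept = models[cls]
    return slope * computed + intercept


# Made-up values for illustration only; not data from the paper.
toy = [
    ("acid", 6.0, 4.0), ("acid", 8.0, 5.0), ("acid", 10.0, 6.0),
    ("amine", 12.0, 9.0), ("amine", 14.0, 10.0), ("amine", 16.0, 11.0),
]
models = calibrate_by_class(toy)
calibrated = predict(models, "acid", 7.0)
```

Keeping one slope and intercept per class, rather than per subclass, mirrors the abstract's point that functional group identity alone drives the correction.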

Citations: 0
Design, synthesis, pharmacological evaluation and computational modeling of 4-formyl-2-nitrophenyl quinoline-8-sulfonate derived thiosemicarbazones as antidiabetic agents
IF 3.1 | Zone 3, Biology | Q3 Biochemistry & Molecular Biology | Pub Date: 2025-11-25 | DOI: 10.1007/s10822-025-00707-z
Muhammad Tayyab, Khalid Mahmood, Khawar Abbas, Farhan Siddique, Nastaran Sadeghian, Halil Şenol, Maryam Bashir, Parham Taslimi, Abdullah K. Alanazi, Mostafa A. Ismail, Xianliang Zhao, Zahid Shafiq

A novel series of thiosemicarbazone derivatives 6(a–i), synthesized from 4-formyl-2-nitrophenyl quinoline-8-sulfonate, was evaluated for its antidiabetic potential. Among them, compound 6i (IC₅₀ = 54.51 ± 0.84 µM) displayed the most potent α-glucosidase inhibition, whereas 6e (IC₅₀ = 9.66 ± 0.14 µM) exhibited superior α-amylase inhibition, indicating their dual therapeutic potential against key carbohydrate-hydrolyzing enzymes implicated in postprandial hyperglycemia. These derivatives showed structural diversity with potent and selective inhibition profiles. Structure–activity relationship analysis revealed that electron-withdrawing substituents enhanced enzyme affinity and biological activity. In addition, molecular docking studies showed strong binding affinities for compounds 6f and 6b, with docking scores of −9.1 to −10.4 kcal/mol against the target proteins, via hydrogen bonding and π–π interactions with catalytic residues. Furthermore, in silico ADMET evaluation predicted good oral bioavailability, low toxicity, and favorable pharmacokinetic properties. Density Functional Theory (DFT) calculations supported the experimental results: the studied compounds showed low HOMO–LUMO energy gaps (2.41–3.42 eV), suggesting significant chemical reactivity and molecular stability. Overall, the in vitro and in silico studies identified compounds 6b, 6f, 6e, and 6i as promising lead molecules for developing dual-action therapeutic agents targeting hyperglycemia and oxidative damage in diabetes management.

Citations: 0
Breast cancer diagnosis from histopathological images and molecular signatures by fusing features with an explainable AI-based residual tabular network model
IF 3.1 | Zone 3, Biology | Q3 Biochemistry & Molecular Biology | Pub Date: 2025-11-25 | DOI: 10.1007/s10822-025-00709-x
S. Sam Jaikumar, S. Mary Praveena

Early breast cancer (BC) diagnosis has the potential to drastically cut BC death rates in the long term. Identifying early-stage cancer cells is the most crucial step in determining the best prognosis. Despite recent advances in AI-based methods, such as machine learning and deep learning (DL), for detecting breast cancer, current models are generally limited to simple binary classification, rely on a single source of data, and lack transparency, which limits their clinical applicability. To overcome these limitations, we propose an Explainable Artificial Intelligence (AI)-based Residual Tabular Network (ResTab Net) model that integrates histopathological images and molecular protein expression data for multimodal BC diagnosis. The proposed model uses Adaptive Tissue-Aware Gaussian Filtering (ATGF) to enhance the image, Entropy Enhanced Graph-Watershed Segmentation (EGWS) to delineate the tumor location, and Self-Adaptive Starfish Optimization (SASFO) to select features. A hybrid framework of residual convolutional blocks and dense layers performs the multiclass classification. To ensure transparency and clinical trust, the model applies SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) to illustrate the impact of molecular protein levels and image features on classification results. The proposed Explainable AI-ResTab Net model is implemented in Python. In the performance evaluation, the proposed model achieves an accuracy of 98.56%, a precision of 98.10%, a recall of 98.00%, an F1-score of 98.03%, and an Area Under the Curve (AUC) of 99.60%.
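
For readers comparing the reported figures, the sketch below shows how accuracy, precision, recall, and F1 are commonly computed for a multiclass classifier. The abstract does not say which averaging scheme the authors used; macro averaging is assumed here, and the function name `macro_metrics` and the toy labels are purely illustrative.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging (an assumption here) computes each metric per class
    and then takes the unweighted mean over classes.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(a, b):
        return a / b if b else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]
    k = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
    }


# Toy labels for illustration only; not data from the paper.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
scores = macro_metrics(y_true, y_pred)
```

With class-imbalanced data, macro and weighted averages can differ substantially, so the averaging choice matters when comparing against published numbers.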

Citations: 0
LiteBoost: a lightweight and explainable boosting model for predicting polymer density from SMILES data
IF 3.1 | Zone 3, Biology | Q3 Biochemistry & Molecular Biology | Pub Date: 2025-11-14 | DOI: 10.1007/s10822-025-00693-2
Tuan Nguyen-Sy, Hieu Do-Trung, Nam Nguyen-Hoang, Duc Toan Truong, My-Kristyna Nguyen-Thao

Accurately predicting polymer density from SMILES strings remains challenging because typical datasets are small, noisy, and chemically diverse. We introduce LiteBoost, a deliberately minimalist gradient boosting model that employs shallow, three-level symmetric trees and exposes only two tunable hyperparameters (n_estimators and learning_rate). Using a curated dataset of 613 polymers, we benchmark LiteBoost against ExtraTrees, XGBoost, LightGBM, and CatBoost, optimizing each with 100–1000 Optuna trials and evaluating performance across seven complementary metrics: R², RMSE, MAE, median AE, MAPE, maximum error, and explained variance. LiteBoost achieves an MAE of 0.031 g/cm³, an RMSE of 0.062 g/cm³, an R² of 0.81, and a MAPE of 3.03%, all within 2–3% of the best-in-class CatBoost and XGBoost scores and well within the bounds of experimental uncertainty. Crucially, it does so with orders-of-magnitude fewer hyperparameters. These results demonstrate that a streamlined boosting model can rival heavyweight ensembles in accuracy while dramatically reducing tuning effort, computational cost, and interpretability barriers. LiteBoost is thus a practical first-line surrogate model for high-throughput polymer screening and inverse-design workflows where speed, robustness, and transparency are as critical as raw predictive power.
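
The seven evaluation metrics named in this abstract have standard definitions, sketched below in plain Python. This is not the authors' evaluation code; the function name `regression_metrics` and the toy values are illustrative assumptions, and MAPE is expressed in percent and assumes no true value is zero.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """R2, RMSE, MAE, median AE, MAPE (%), max error, explained variance."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    abs_err = [abs(e) for e in errors]
    n = len(errors)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mean_err = sum(errors) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs_err) / n,
        "median_ae": statistics.median(abs_err),
        "mape": 100.0 * sum(a / abs(t) for a, t in zip(abs_err, y_true)) / n,
        "max_error": max(abs_err),
        # Unlike R2, explained variance ignores a constant bias in the errors.
        "explained_variance": 1.0 - var_err / (ss_tot / n),
    }


# Toy values for illustration only; not data from the paper.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Reporting median AE and maximum error alongside MAE and RMSE helps separate typical performance from the influence of a few outliers, which matters for small, noisy datasets like the one described.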

Citations: 0
Multi-stage variational autoencoders for hierarchical molecular generation and activity optimization
IF 3.1 | Zone 3, Biology | Q3 Biochemistry & Molecular Biology | Pub Date: 2025-11-12 | DOI: 10.1007/s10822-025-00705-1
Dileep Kumar Murala

Deep generative models can propose novel compounds with favourable properties and thus show promise for chemical design. Traditional single-stage variational autoencoders (VAEs) fall short in validity, uniqueness, and biologically meaningful distribution alignment, because it is difficult to capture both global molecular architecture and chemical properties in a single latent representation. To overcome these challenges, we propose a multi-stage VAE system that encodes and decodes molecular representations in sequence. The refined latent space retains structural integrity while improving novelty and uniqueness. Validity, originality, novelty, Fréchet ChemNet Distance (FCD), and KL divergence are used to validate the methodology on ChEMBL and polymer datasets. The bioefficacy of generated EGFR inhibitors is evaluated using computational Chemprop-based QSAR models. We introduce adaptive fine-tuning strategies for the inner layer (IL) and outer layer (OL) to improve generation accuracy; IL adaptation is best suited to active compounds. Quantitative evaluations indicate consistent gains in validity, novelty, and biological activity over strong baselines (for example, MoLeR and RationaleRL). We give MNIST tests that confirm the hierarchical training method’s stability but not its scalability beyond molecular tasks, ensuring cross-domain applicability. For generative drug discovery, hierarchical latent models with a multi-stage VAE are advised.

Citations: 0
Evaluation of protein-ligand binding interactions of alkaline phosphatase inhibitors by quantum-mechanical methods
IF 3.1 Zone 3 Biology Q3 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-11-12 DOI: 10.1007/s10822-025-00701-5
Gabriela L. Borosky

Quantum-mechanical (QM) methods were applied to compute the relative binding energies of a set of structurally similar alkaline phosphatase (AP) inhibitors, using human placental AP (PLAP) as a model AP. The theoretical binding affinities were compared with their corresponding experimental inhibitory potencies. The calculated interaction energies reproduced the experimental activity order, showing linear correlations between QM relative binding energies and experimental pIC50 values with coefficients of determination R² = 0.86–0.97. Examination of the binding interactions for the test inhibitors revealed that the AP inhibitory activity is determined by the catechol group and the benzimidazole/imidazole moieties of the ligands. The studied compounds formed protein-ligand complexes inside the active site of PLAP, suggesting they are competitive inhibitors. The present theoretical results are expected to be useful in developing new potent AP inhibitors. The employed computational approach for estimating QM protein-ligand interaction energies is proposed as a suitable drug design tool for predicting reliable QM relative binding affinities of structurally related compounds.
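
The coefficient of determination quoted in the abstract comes from a linear fit of experimental pIC50 against computed relative binding energy. The sketch below shows how such a fit and its R² can be obtained by ordinary least squares. The function name and the numbers in the usage example are hypothetical and purely illustrative; they are not data from the paper.

```python
from typing import Sequence, Tuple


def linear_fit_r2(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared). For a one-variable linear fit,
    R^2 equals the squared Pearson correlation coefficient.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical relative binding energies (kcal/mol) and pIC50 values.
    rel_energy = [0.0, -1.2, -2.5, -3.1, -4.4]
    pic50 = [5.1, 5.6, 6.4, 6.5, 7.3]
    slope, intercept, r2 = linear_fit_r2(rel_energy, pic50)
    print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.3f}")
```

A negative slope is the expected sign here, since a more negative (more favorable) binding energy should correspond to a higher pIC50.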

Citations: 0
HCV genotyping and rational computational designing of an immunogenic multiepitope vaccine against genotype 3a
IF 3.1 Zone 3 Biology Q3 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-11-10 DOI: 10.1007/s10822-025-00698-x
Kashif Iqbal Sahibzada, Rizwan Abid, Haseeb Nisar, Reham A. Abd El Rahman, Muhammad Idrees, Dong-Qing Wei, Yuansen Hu, Saima Sadaf

Pakistan currently holds the second-highest prevalence rate of Hepatitis C virus (HCV) globally. This makes it crucial to continuously monitor the circulating genotypes in the population, especially among people who inject drugs (PWIDs), as they pose a significant risk of spreading new genotypes in the population. To address this issue, we identified the circulating HCV genotypes among PWIDs and non-PWIDs through Next Generation Sequencing (NGS). Additionally, a multi-epitope vaccine was designed through an immunoinformatic approach using NGS and Sanger sequencing results. The study indicated genotype 3a as the most prevalent genotype among the 61 HCV cases tested through NGS, followed by genotype 1a. Non-allergenic and highly antigenic MHC Class-I and Class-II epitopes were retrieved from non-structural proteins. Furthermore, B-cell epitopes were retrieved from the E2 protein. The selected epitopes showed a population coverage rate of 88.26%. Based on large conformational simulation analysis from NMSims, the four best constructs suitable for vaccine design were further evaluated for their binding energies through all-atom molecular dynamics simulations and MMGBSA. One of the constructs showed a low binding energy value with MHC, indicating its potential as a vaccine candidate. However, further experimental work is required to determine its efficacy and safety profile. This research emphasizes the promise of combining multiepitope vaccine design with advanced computational methods to accelerate and improve vaccine development, thereby filling a crucial gap in the fight against rising antibiotic resistance.

Citations: 0
Structure-based identification and experimental evaluation of Oroxin A as a FYN kinase inhibitor
IF 3.1 Zone 3 Biology Q3 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-11-10 DOI: 10.1007/s10822-025-00700-6
Vipul Agarwal, Chaitany Jayprakash Raorane, Anugya Gupta, Divya Shastri, Vinit Raj, Sangkil Lee

FYN, a member of the Src family kinases (SFKs) and a non-receptor tyrosine kinase, plays a critical role in signal transduction within the nervous system and is instrumental in the activation and development of T lymphocytes. While the biological significance of FYN kinase in various cellular processes is well recognized, its potential as a therapeutic target remains largely unexplored. In this study, we investigated the potential of natural products (NPs) as preferential inhibitors of FYN kinase. A library of over 3500 NPs was screened for binding affinity with FYN kinase (PDB: 2DQ7) using XGlide docking simulations. The fourteen NPs with the highest docking scores were selected for further analysis. Their interactions with FYN kinase were evaluated through MM-GBSA calculations, and ADMET profiling was performed using SwissADME and pkCSM tools to assess pharmacokinetic properties. Molecular dynamics (MD) simulations using Desmond further confirmed the stability of FYN-NP complexes in solvent environments. Of the top fourteen NPs, only oroxin A demonstrated favorable drug-like properties and sustained stable binding to FYN kinase, as evidenced by MD simulations. Moreover, in vitro kinase inhibition assays revealed that oroxin A exhibited dose-dependent inhibition of FYN kinase. Additionally, C. elegans viability assays confirmed its low toxicity. Finally, cross-docking revealed that although oroxin A binds to multiple SFKs owing to the conserved ATP-binding pocket, it displayed stronger binding toward FYN, suggesting a binding preference for FYN. This study provides a comprehensive evaluation of NPs as potential FYN kinase inhibitors and identifies oroxin A as a natural compound with preliminary evidence of FYN inhibition, warranting further validation.

Citations: 0