
Frontiers in Bioinformatics: Latest Articles

Neurogenic locus notch homolog protein 1 (NOTCH 1) SNP informatics coupled with intrinsically disordered regions and post-translational modifications reveals the complex structural crosstalk of Lung Adenocarcinoma (LUAD).
IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-12-10 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1641521
Pearl John, C Sudandiradoss

Background: Lung adenocarcinoma (LUAD) is the predominant histological subtype of lung cancer and a major contributor to cancer mortality. It is marked by a high frequency of mutations and intricate interactions between multiple signalling pathways.

Objective: Here we explore the role of NOTCH1-associated single nucleotide polymorphisms (SNPs), intrinsically disordered regions (IDRs), and post-translational modifications (PTMs) in LUAD progression. Although NOTCH1 expression is downregulated, it has been validated as an important prognostic marker because of its complex biological roles under specific conditions.

Methods: Using in silico tools, we predicted and identified deleterious SNPs. Molecular docking and molecular dynamics simulations (MDS) were conducted to characterize these mutations.

Results: A total of 43 deleterious SNPs were found in the sequential SNP analysis, with 13 SNPs showing deleterious and damaging effects. Stabilizing SNPs such as S1464I, A1705V, and T1602I are found within the conserved and functional domains of NOTCH1. In addition, the 1660-2555 sequence region of the PEST domain was recognized as an intrinsically disordered region (IDR) with a score above 0.5. Moreover, the presence of two phosphodegrons (SCF_FBW7_1 at 2129-2136 and SCF_FBW7_2 at 2508-2515), along with PTMs such as O-linked glycosylation and phosphothreonine within the IDR, PEST, and conserved domains, suggests functional significance in LUAD progression.

Conclusion: Our research highlights the potential regulatory role of the identified SNPs, PTMs, and functional domains of NOTCH1, particularly the PEST domain and IDR, in the pathophysiology of LUAD, notably through crosstalk with EMT signalling.
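
The IDR call in the Results rests on a per-residue disorder score exceeding 0.5. A minimal sketch of that thresholding step follows, assuming the scores arrive as a plain list with one value per residue. The function names and the toy score profile are illustrative and are not taken from the paper; only the 0.5 cutoff and the residue ranges come from the abstract.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # 1-based, inclusive residue range


def disordered_regions(scores: List[float], threshold: float = 0.5,
                       min_len: int = 1) -> List[Span]:
    """Return contiguous runs of residues whose score is above the threshold."""
    regions: List[Span] = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = pos          # a new disordered run begins here
        elif start is not None:
            if pos - start >= min_len:
                regions.append((start, pos - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(scores) - start + 1 >= min_len:
        regions.append((start, len(scores)))
    return regions


def contains(region: Span, span: Span) -> bool:
    """True if span lies entirely inside region."""
    return region[0] <= span[0] and span[1] <= region[1]


# Toy profile (hypothetical values, not NOTCH1 data).
toy_scores = [0.2, 0.6, 0.7, 0.4, 0.9, 0.8, 0.55, 0.1]
print(disordered_regions(toy_scores))  # [(2, 3), (5, 7)]

# Residue ranges as reported in the abstract.
idr = (1660, 2555)
degrons = {"SCF_FBW7_1": (2129, 2136), "SCF_FBW7_2": (2508, 2515)}
for name, span in degrons.items():
    print(name, "inside IDR:", contains(idr, span))
```

Both reported phosphodegron ranges fall inside the reported IDR range, which is the positional overlap the abstract points to.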

Volume 5, article 1641521. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12727990/pdf/
Citations: 0
Identification and validation of tumor microenvironment-related therapeutic targets in gastric cancer using integrated multi-omics and molecular docking approaches.
IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-12-10 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1654326
Mohamed Kalith Oli M, Jafar Ali Ibrahim Syed Masood

Introduction: With increased drug resistance and tumor heterogeneity accounting for limited therapeutic strategies, gastric cancer remains one of the major causes of cancer-related mortality around the globe. Targeting the components of the tumor microenvironment (TME) has become a promising therapeutic strategy due to their crucial roles in cancer cell proliferation, progression, and metastasis. One of the limitations of the previously identified therapeutic targets is their limited applicability to a broader patient population.

Methods: This study aims to identify TME-related therapeutic targets using an integrated bioinformatics and molecular docking approach that draws on a larger number of datasets to cover a broader cohort of gastric cancer patients. We analyzed multiple publicly available transcriptomic datasets using Robust Rank Aggregation (RRA) meta-analysis and Weighted Gene Co-expression Network Analysis (WGCNA) to identify significant hub genes. Furthermore, protein-protein interaction (PPI) network analyses, conducted using multiple methods such as Cytohubba topology analysis and ClusterONE module analysis, refined the potential therapeutic candidates. Functional enrichment analyses were performed to identify vital genes involved in TME interactions and ECM remodeling.

Results: The enriched genes were validated for their significant dysregulation in the Cancer Genome Atlas gastric adenocarcinoma dataset (TCGA-STAD) and three independent GEO datasets to ensure differential expression across distinct cohorts. Genes with consistent dysregulation were used in survival analyses across TCGA and two GEO datasets to prioritize hub genes with prognostic significance. Finally, a targeted literature survey ensured the exclusion of previously targeted genes, and molecular docking analyses conducted using phytocompounds identified potential therapeutic leads with strong affinities for the identified targets.

Discussion: This integrated approach revealed notable, promising targets in the TME and natural compounds for developing potential personalized therapeutic strategies in gastric cancer.
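
To make the Robust Rank Aggregation step in the Methods more concrete, here is a small pure-Python sketch of the core RRA score: each gene's ranks are normalized to (0, 1], compared against the order statistics expected from uniform random ranks, and the smallest p-value is Bonferroni-corrected by the number of lists. The gene names and rankings are toy placeholders. Assigning rank 1.0 to a gene missing from a list is a simplifying assumption, and the study itself may have used a dedicated package with different defaults.

```python
from math import comb
from typing import Dict, List


def order_stat_pvalue(x: float, k: int, n: int) -> float:
    """P(k-th smallest of n Uniform(0,1) draws <= x), via the binomial tail."""
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def rho_score(norm_ranks: List[float]) -> float:
    """Minimum order-statistic p-value, Bonferroni-corrected by list count."""
    ranks = sorted(norm_ranks)
    n = len(ranks)
    best = min(order_stat_pvalue(x, k, n) for k, x in enumerate(ranks, start=1))
    return min(1.0, best * n)


def aggregate(rankings: List[List[str]]) -> Dict[str, float]:
    """Score every gene across all ranked lists; lower means more consistent."""
    genes = set().union(*rankings)
    scores = {}
    for gene in genes:
        norm = [
            # Assumption: a gene absent from a list gets the worst rank, 1.0.
            (lst.index(gene) + 1) / len(lst) if gene in lst else 1.0
            for lst in rankings
        ]
        scores[gene] = rho_score(norm)
    return dict(sorted(scores.items(), key=lambda kv: kv[1]))


# Toy ranked gene lists standing in for per-dataset differential expression.
rankings = [
    ["G1", "G2", "G3", "G4", "G5"],
    ["G1", "G3", "G2", "G5", "G4"],
    ["G1", "G2", "G4", "G3", "G5"],
]
result = aggregate(rankings)
for gene, score in result.items():
    print(f"{gene}\t{score:.3f}")
```

In this toy input, G1 is ranked first in every list and therefore receives the smallest score, which is the behaviour that lets RRA favour genes that are consistently dysregulated across datasets.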

Volume 5, article 1654326. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12727970/pdf/
Citations: 0
Uncovering human kinase substrates in nipah proteome.
IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-12-05 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1678189
Vineetha Shaji, Akash Anil, Ayisha A Jabbar, Althaf Mahin, Ahmad Rafi, Amjesh Revikumar, Sowmya Soman, Ganesh Prasad, Sneha M Pinto, Yashwanth Subbannayya, Abhithaj Jayanandan, Rajesh Raju

Nipah virus (NiV) is a zoonotic pathogen that causes recurrent outbreaks with considerable implications for public health. Viruses engage host kinases to phosphorylate viral proteins, aiding replication and host disruption. Identifying NiV phosphoproteins and their host kinases is therefore critical for understanding the mechanism of infection and developing therapeutics. We performed kinase-substrate phosphomotif analysis based on prior studies and employed computational tools to identify putative phosphosites in NiV proteins and corresponding host kinases. Redundancy analysis highlighted key kinases capable of phosphorylating multiple NiV proteins and high-potential viral substrates. Integration with human-viral protein-protein interaction data revealed human kinase substrate proteins that interact with NiV proteins, while conservation analysis assessed phosphosites across nine NiV proteins in various strains. The functional significance of the identified and predicted viral substrates and their corresponding host kinases was further validated through in silico docking and molecular dynamics (MD) simulation. Motif-based kinase-substrate analysis identified 51 human kinases predicted to target 1180 phosphorylation sites across nine NiV proteins, including key human kinases such as Eukaryotic elongation factor 2 kinase [EEF2K], Haploid germ cell-specific nuclear protein kinase [HASPIN], Mitogen-activated protein kinase 9 [MAPK9], Microtubule-associated serine/threonine-protein kinase 2 [MAST2], and Spleen tyrosine kinase [SYK], with the potential to phosphorylate multiple sites across NiV proteins. Using computational prediction tools, we identified several potential phosphorylation sites on NiV proteins, along with their corresponding candidate human kinases. 
In silico docking revealed interactions between EEF2K and both the NiV Fusion Glycoprotein and NiV Phosphoprotein (P), MAPK9 with the NiV Matrix Protein, and HASPIN with NiV RNA-dependent RNA polymerase. MD simulations of the EEF2K-NiV Fusion Glycoprotein complex confirmed the stability of this interaction. Leucine-rich repeat serine/threonine-protein kinase 2 [LRRK2], HASPIN, MAST2, and EEF2K were the human kinases predicted to phosphorylate experimentally validated sites on NiV nucleocapsid (N), P, and W proteins. Furthermore, through an extensive literature review, we investigated the therapeutic potential of targeting these kinases using known inhibitors and identified compounds that could potentially be repurposed as antiviral agents against NiV infection. Our findings indicate that EEF2K phosphorylates key NiV proteins at conserved phosphosites across variants, underscoring the pathogenic significance of kinases in NiV infection and their potential as therapeutic targets.
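
Motif-based kinase-substrate analysis of the kind described above can be pictured as a pattern scan over each protein sequence. The sketch below is only an illustration under stated assumptions: the two patterns are textbook simplifications (a proline-directed S/T-P motif and a basophilic R-x-x-S/T motif), not the motif set used in the study, and the peptide is an invented toy sequence rather than a NiV protein.

```python
import re
from typing import Dict, List, Tuple

# Illustrative patterns only. The named group "site" marks the phosphoacceptor
# residue; the lookahead lets overlapping motif instances all be reported.
MOTIFS: Dict[str, str] = {
    "proline_directed": r"(?=(?P<site>[ST])P)",
    "basophilic_RxxST": r"(?=R..(?P<site>[ST]))",
}


def scan_phosphomotifs(sequence: str,
                       motifs: Dict[str, str]) -> Dict[str, List[Tuple[int, str]]]:
    """Return, per motif, the 1-based positions and residues of candidate sites."""
    hits: Dict[str, List[Tuple[int, str]]] = {}
    for name, pattern in motifs.items():
        hits[name] = [
            (m.start("site") + 1, m.group("site"))
            for m in re.finditer(pattern, sequence)
        ]
    return hits


toy_peptide = "MSPRKASLTPEE"  # hypothetical sequence for demonstration
hits = scan_phosphomotifs(toy_peptide, MOTIFS)
for motif, sites in hits.items():
    print(motif, sites)
# proline_directed [(2, 'S'), (9, 'T')]
# basophilic_RxxST [(7, 'S')]
```

A scan like this only proposes candidate sites. Mapping each motif to specific kinases, and checking conservation across strains, are separate steps in the workflow the abstract describes.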

Volume 5, article 1678189. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12715814/pdf/
Citations: 0
Predicting GD2 expression across cancer types by the integration of pathway topology and transcriptome data.
IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-12-04 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1705930
Arsenij Ustjanzew, Federico Marini, Saskia Wagner, Arthur Wingerter, Roger Sandhoff, Jörg Faber, Claudia Paret

Background: The disialoganglioside GD2 is a key cancer therapy target due to its overexpression in several cancers and limited presence in normal tissues. However, experimental assessment is technically challenging and not routinely available. We developed a computational framework that integrates reaction activity derived from transcriptomic data with the glycosphingolipid biosynthesis pathway to predict GD2 expression.

Methods: We computed Reaction Activity Scores from transcriptomic data and weighted the reactions of a glycosphingolipid metabolic network, refining edge weights with topology-based transition probabilities to account for enzyme promiscuity. Cumulative activities of GD2-promoting and -mitigating reactions served as features in a Support Vector Machine (SVM) to model GD2-associated differences between neuroblastoma and normal tissue. SVM decision values were used as a continuous proxy for GD2 expression. We validated the predicted GD2 scores across independent datasets by comparing them with literature-reported values and flow-cytometric confirmation of a model-predicted high-GD2 tumor. Copy-number alteration (CNA) data were integrated to identify candidate genomic biomarkers of GD2-positive samples.

Results: Our SVM-based GD2 score achieved a balanced accuracy of 0.80 with a linear kernel, which was selected for its lower overfitting risk and better interpretability while matching the accuracy of more complex kernels. The model transferred reliably across six independent RNA-seq datasets and reproduced known GD2 expression patterns, outperforming a two-gene signature in capturing subtype-specific heterogeneity and avoiding overestimation in normal brain tissue. Pan-cancer analyses revealed heterogeneous GD2 expression in several cancer subtypes. Notably, we experimentally confirmed high GD2 expression in clear cell sarcoma of the kidney, consistent with model predictions. CNA analysis implicated B4GALNT1 amplification as a GD2-promoting factor in dedifferentiated liposarcoma. To facilitate adoption of our approach, we developed GD2Viz, an R package with an interactive Shiny application for score computation, visualization, and analysis of user data.

Conclusion: Our computational framework provides a robust, interpretable, biologically grounded predictor of GD2 expression, offering greater consistency and clinical interpretability than existing gene-based signatures. Importantly, with over 20 GD2-directed trials ongoing, our approach may help prioritize tumor entities with high GD2 levels, delineate candidate patient subgroups, and generate testable hypotheses in underexplored cancers, thereby supporting patient stratification and eligibility screening for clinical trials.
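
Two ideas from this abstract are easy to show in a few lines: using a linear SVM decision value as a continuous score, and summarizing classification quality with balanced accuracy. The sketch below assumes two features per sample (cumulative activity of GD2-promoting and GD2-mitigating reactions). The weights, bias, and samples are invented for illustration and are not the fitted model or data from the paper, and GD2Viz itself is an R package whose interface is not reproduced here.

```python
from typing import List, Sequence


def decision_value(features: Sequence[float], weights: Sequence[float],
                   bias: float) -> float:
    """Linear SVM decision function f(x) = w . x + b (signed margin score)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features)) + bias


def balanced_accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Mean of sensitivity and specificity for binary labels 0/1."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


# Hypothetical model parameters: (promoting weight, mitigating weight), bias.
WEIGHTS = (1.5, -1.0)
BIAS = -0.2

# Hypothetical samples: ((promoting activity, mitigating activity), true label).
samples = [
    ((2.0, 0.5), 1), ((1.6, 0.9), 1), ((0.9, 1.4), 1),
    ((0.4, 1.8), 0), ((0.7, 1.1), 0), ((1.2, 0.6), 0),
]

scores = [decision_value(x, WEIGHTS, BIAS) for x, _ in samples]
labels = [y for _, y in samples]
predictions = [1 if s > 0 else 0 for s in scores]

print([round(s, 2) for s in scores])
print("balanced accuracy:", round(balanced_accuracy(labels, predictions), 3))
```

Keeping the signed decision value rather than only the thresholded label is what lets the score act as a continuous proxy: samples far on the positive side of the margin are ranked above those near zero.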

Volume 5, article 1705930. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12711791/pdf/
Citations: 0
Correction: Integrative machine learning and bioinformatics analysis to identify cellular senescence-related genes and potential therapeutic targets in ulcerative colitis and colorectal cancer.
IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-12-03 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1750582
Tianle Xue, Yunpeng Chen, Xiaomeng Li, Zhixiang Zhou, Qiyang Chen

[This corrects the article DOI: 10.3389/fbinf.2025.1599098.].

Volume 5, article 1750582. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12709918/pdf/
Citations: 0
Subtractive genomic approach to uncover novel drug targets in Salmonella typhimurium and computational screening of food-based polyphenols as inhibitors.
IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-12-03 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1695217
Mohammed Naveez Valathoor, Subhashree Venugopal, Anand Prem Rajan

Introduction: The rise of multidrug-resistant Salmonella typhimurium is a severe public health threat that renders conventional antibiotics ineffective. This study employed a computational strategy to identify a novel drug target in S. typhimurium and screen food-based polyphenols as potential inhibitors.

Methods: A subtractive genomics approach was used to identify essential, pathogen-specific proteins. A lead target was prioritized based on its druggability, localization, and network interactions. The target's 3D structure was then modeled for molecular docking, molecular dynamics (MD) simulations, and binding free energy calculations with a polyphenol library.

Results: The screening identified UDP-N-acetylglucosamine transferase (MurG) as a promising and previously unexplored drug target. The polyphenol 6-prenylnaringenin showed a superior binding affinity for MurG compared to the antibiotic ciprofloxacin. Subsequent MD simulations and binding free energy calculations confirmed that the MurG-6-prenylnaringenin complex was significantly more stable.

Conclusion: This study validates MurG as a druggable target in S. typhimurium and identifies 6-prenylnaringenin as a potent inhibitor. With computational metrics superior to ciprofloxacin, 6-prenylnaringenin is a promising lead compound for developing new anti-Salmonella therapeutics. Future experimental validation is required to confirm these in silico findings.
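
The subtractive genomics step in the Methods amounts to successive filtering: keep proteins that are essential to the pathogen, discard those with a host homolog, then rank what remains. A minimal sketch follows, assuming each protein has already been annotated with these properties. The record fields, the druggability values, and the `toy_*` entries are hypothetical, and the flags given for MurG simply mirror the abstract's conclusion rather than reproduce the study's data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Protein:
    name: str
    essential: bool        # required for pathogen survival
    human_homolog: bool    # has a close homolog in the host proteome
    druggability: float    # hypothetical 0..1 score used only for ranking


def subtractive_filter(proteome: List[Protein]) -> List[Protein]:
    """Keep essential, pathogen-specific proteins, most druggable first."""
    kept = [p for p in proteome if p.essential and not p.human_homolog]
    return sorted(kept, key=lambda p: p.druggability, reverse=True)


# Toy annotations for illustration only.
toy_proteome = [
    Protein("MurG", essential=True, human_homolog=False, druggability=0.8),
    Protein("toy_A", essential=True, human_homolog=True, druggability=0.9),
    Protein("toy_B", essential=False, human_homolog=False, druggability=0.7),
    Protein("toy_C", essential=True, human_homolog=False, druggability=0.4),
]

candidates = subtractive_filter(toy_proteome)
print([p.name for p in candidates])  # ['MurG', 'toy_C']
```

Excluding host homologs before ranking is the design choice that matters here: a highly druggable protein such as `toy_A` is dropped because an inhibitor against it could also hit the human counterpart.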

Citations: 0
Genomics-driven drug repurposing and novel targets identification for sickle cell disease in Saudi patients.
IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-12-02 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1671626
Ali Alghubayshi, Mohammad A Alshabeeb, Dayanjan Wijesinghe, Mohammed AlAwadh, Suad Alshammari, Khalifa Alrajeh, Mona A Alkhairi, Imadul Islam, Ahmed Alaskar

Background: Sickle cell disease (SCD) is an inherited blood disorder characterized by chronic hemolysis, inflammation, and vaso-occlusive crises (VOC), leading to multiple complications and reduced life expectancy in affected individuals. Limited effective treatment options are currently available; however, recent genomic findings from underrepresented populations (Saudi Arabians) have offered new hope for predicting molecularly guided treatments. This study aimed to identify approved drugs suitable for repurposing based on their interactions with SCD-associated genetic variants and to discover novel druggable targets within genetic pathways linked to disease severity by utilizing genome-wide association study (GWAS) data from Saudi SCD patients.

Methods: Bioinformatic pipelines were used to evaluate drug-gene interactions and identify potential therapeutic targets based on GWAS data derived from the Saudi population. Approved drugs were suggested for repurposing according to their interactions with genes known to impact SCD pathophysiology, using the Drug-Gene Interaction Database (DGIdb 5.0). New drug targets were also proposed by assessing the simulated binding pockets of gene products, using 3D protein structures from the Protein Data Bank (PDB) and the AlphaFold database. Molecules with higher druggability scores, as estimated by the DoGSiteScorer tool, were predicted to have a higher success rate for new SCD treatment development.

Results: Our analysis identified 78 approved medications with potential for repurposing in SCD; this list was narrowed to 21 candidates based on safety profiles and interactions with key genetic pathways. Among these, simvastatin, allopurinol, omalizumab, canakinumab, and etanercept were suggested as the most promising agents. Furthermore, novel drug targets encoded by olfactory receptor (OR) gene clusters (OR51V1, OR52A1, OR52A5, OR51B5, and OR51S1), TRIM genes, SIDT2, and CADM3 displayed high druggability scores.

Conclusion: This study provides a robust framework for drug repurposing and novel drug discovery in SCD, particularly tailored to the Saudi population. The findings underscore the potential of leveraging genomic data to identify targeted therapies, offering a pathway to more personalized and effective treatments for SCD patients. Future clinical trials are essential to validate these findings and translate them into clinical practice.

Citations: 0
Functional and structural impacts of oncogenic missense variants on human polo-like kinase 1 protein.
IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-12-02 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1680578
Gayatri Munieswaran, Venkatraman Manickam

Introduction: Polo-like kinase 1 (PLK1), a master mitotic regulator, is frequently overexpressed in various types of cancer and is associated with poor prognosis. Missense mutations in PLK1 may compromise its structural integrity and functional interactions, contributing to tumorigenesis.

Methods: This study used a comprehensive computational pipeline to identify deleterious missense variants across multiple cancers. A total of 207 non-synonymous single nucleotide polymorphisms (nsSNPs) were retrieved from cBioPortal, and 11 high-risk variants were prioritized using functional and structural prediction tools, including SIFT, PolyPhen-2, and I-Mutant 2.0. Prognostic relevance was evaluated via Kaplan-Meier survival analysis, and functional networks were explored using STRING. The structural dynamics of the modeled mutations were analyzed through 100 ns molecular dynamics simulations.

Results: Kinase domain mutations such as L244F, R293C, and R293H and polo-box domain mutations such as A520T were found to cause deviations in structural stability, flexibility, solvent exposure, and compactness compared with the wild type. Further, PLK1 overexpression correlated with poor overall survival in many types of cancer, including breast, liver, lung, kidney, and pancreatic cancers. Protein-protein interaction analysis revealed PLK1's involvement in oncogenic pathways.

Discussion: The study highlights the structural and functional implications of oncogenic PLK1 mutations, emphasizing their role in cancer progression. Integrating predictive and dynamic exploration approaches facilitates prioritization of variants with potential clinical relevance.

Conclusion: The nsSNPs in PLK1 may perturb the conformational stability and function of the protein. Further experimental validation and the discovery of novel inhibitors may enable mutation-specific interventions in precision oncology.
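
The Kaplan-Meier survival analysis named in the Methods can be illustrated with a minimal sketch. The abstract does not say which software was used, so this is only an assumption-based toy: the function name `kaplan_meier` and its arguments are hypothetical, and no data from the study are reproduced here.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    durations: follow-up time for each subject
    events:    1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each distinct
    time where at least one event was observed.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    deaths = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)   # every subject leaves the risk set at its own time
    at_risk = len(durations)
    survival = 1.0
    curve = []

    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Subjects censored at time t still count as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve
```

Comparing expression groups would mean calling the function once per group (for example, high versus low PLK1 expression) and comparing the resulting curves. A significance test such as the log-rank test is a separate step that this sketch does not cover.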

Citations: 0
Beyond Tanimoto: a learned bioactivity similarity index enhances ligand discovery.
IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-11-28 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1695353
Gustavo Schottlender, Juan Manuel Prieto, Marcelo A Marti, Dario Fernández Do Porto

Structural similarity metrics such as the Tanimoto coefficient (TC) miss many functionally related compounds; indeed, 60% of similarly bioactive ligand pairs in the ChEMBL database show TC < 0.30, revealing a major blind spot that constrains ligand-based discovery. Our motivation is to overcome this blind spot and enable the recovery of structurally different yet functionally equivalent chemotypes that structure-based similarity fails to detect. Here, we introduce the bioactivity similarity index (BSI), a machine learning model that estimates the probability that two molecules bind the same or related protein receptors. Trained under leave-one-protein-out (LOPO) across Pfam-defined protein groups on dissimilar pairs, BSI not only outperforms TC but also surpasses modern molecular embedding baselines (ChemBERTa and contrastive language-molecule pre-training (CLAMP), using cosine similarity) across protein families. We further develop a cross-family model (BSI-Large) that, while slightly below group-specific models, generalizes better and can be fine-tuned with less data, consistently improving over models trained from scratch. In retrospective validation on new ChEMBL v35 data, BSI achieves strong early-retrieval performance (top 2% enrichment factor, EF2%), with group-specific models delivering the best enrichment, and BSI-Large remaining competitive. In a realistic virtual screening-like scenario against the target gene ADRA2B, the mean rank of the next active, given a known active, improves from 45.2 (TC) to 3.9 (BSI), with 54.9 for ChemBERTa and 28.6 for CLAMP. Altogether, BSI complements, rather than replaces, structure-based similarity and embedding-based comparisons, extending hit finding to remote chemotypes that are structurally dissimilar yet functionally equivalent. The code is available at https://github.com/gschottlender/bioactivity-similarity-index.
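
Two quantities mentioned above, the Tanimoto coefficient and the top-2% enrichment factor, can be sketched in a few lines. This is not code from the linked repository. It assumes fingerprints are represented as sets of on-bit indices and uses one common definition of the enrichment factor (hit rate in the top fraction divided by the overall hit rate). The function names, the set representation, and the choice to round the cutoff up are all assumptions made for illustration.

```python
import math


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Convention chosen here: two empty fingerprints score 0.0.
    return len(a & b) / union if union else 0.0


def enrichment_factor(scores, labels, fraction=0.02):
    """Enrichment factor at the given top fraction of a ranked list.

    scores: higher means "predicted more likely active"
    labels: 1 for known actives, 0 for inactives
    EF = (actives in top fraction / size of top fraction)
         / (all actives / all compounds)
    """
    n = len(scores)
    if n == 0 or n != len(labels):
        raise ValueError("scores and labels must be non-empty and equal length")
    total_actives = sum(labels)
    if total_actives == 0:
        return 0.0
    n_top = max(1, math.ceil(fraction * n))  # assumption: round the cutoff up
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_actives = sum(label for _, label in ranked[:n_top])
    return (top_actives / n_top) / (total_actives / n)
```

Under this definition, an EF2% of 1 means the top 2% of the ranking holds no more actives than random selection would, and larger values mean better early retrieval. The same `enrichment_factor` function can score rankings from any similarity measure, so TC, embedding cosine similarity, or a learned index could be compared on equal footing.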

Citations: 0
Higher frequency of prokaryotic low complexity regions in core and orthologous genes.
IF 3.9 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-11-27 eCollection Date: 2025-01-01 DOI: 10.3389/fbinf.2025.1673480
Vineet Saravanan, Alexander Kravetz, Fabia Ursula Battistuzzi

Prokaryotic genome evolution is shaped by mutation, gene duplication, and horizontal gene transfer, yet the interaction of these mechanisms, particularly in relation to low complexity regions (LCRs), remains poorly understood. LCRs are known to be mutation-prone and have been proposed to promote genetic innovation. However, the interaction between LCR-mediated and paralogy-mediated genetic innovation is still unclear. To clarify the interplay between these two evolutionary forces, we analyzed the distribution of LCRs in protein-coding genes from three closely related enterobacteria (Escherichia coli, Salmonella enterica, and Klebsiella pneumoniae) at both species and population levels. Using pangenomic and orthology-based approaches, we categorized genes by duplication history and conservation status and assessed LCR frequencies across these groups. We found that LCRs were consistently enriched in core and orthologous genes rather than in accessory or paralogous ones. This pattern was stable across evolutionary timescales and particularly pronounced in genes involved in cell cycle control and defense. These results suggest that, contrary to prior assumptions, LCRs may serve conserved functional roles rather than acting primarily as agents of evolutionary plasticity even at population-level timescales.
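
The abstract does not state how LCRs were detected. As a rough illustration of the general idea only, low compositional complexity can be flagged with Shannon entropy in a sliding window, in the spirit of entropy-based tools. The function names, the window length of 12, and the 2.2-bit threshold below are hypothetical choices for this sketch and are not parameters from the study.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (bits) of the residue composition of a segment."""
    n = len(segment)
    counts = Counter(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(seq, window=12, max_entropy=2.2):
    """Start indices of windows whose composition entropy is at or below
    the threshold. Both parameters are illustrative assumptions."""
    return [
        i
        for i in range(len(seq) - window + 1)
        if shannon_entropy(seq[i:i + window]) <= max_entropy
    ]
```

A per-gene LCR frequency could then be taken as the fraction of proteins in a category (core, accessory, orthologous, paralogous) with at least one flagged window, which is the kind of group-level comparison the abstract describes.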

Citations: 0