Pub Date: 2026-02-10; DOI: 10.1016/j.compbiolchem.2026.108940
Mohammed Alfaifi, Hossam Kamli
Triple-negative breast cancer (TNBC) lacks actionable targets and rapidly develops resistance; therefore, we used an integrative in silico approach to functionally characterise C11orf42 and assess its therapeutic relevance. Sequence and structural analyses revealed that C11orf42 is a ∼36.4 kDa soluble cytosolic protein composed of a conserved TED domain (residues 14–213; mean pLDDT ≈ 89) and a flexible, intrinsically disordered C-terminal region (residues ∼232–333). Intrinsic disorder supports conformational flexibility; however, specific functional roles (e.g., protein–protein interaction scaffolding or signalling) remain hypotheses that require orthogonal validation. Protein–protein interaction network analysis identified a highly enriched network (71 nodes, 432 edges; p < 1.0 × 10⁻¹⁶), implicating C11orf42 in vesicular trafficking, receptor recycling, and oncogenic signalling pathways relevant to TNBC. Structure-based druggability analysis revealed four ligandable pockets, and molecular docking identified four phytochemicals (chamaejasmin, Genetin J, isomultiflorenol, and podocarpusflavone B) with favourable binding affinities (≈ –8.9 to –9.6 kcal/mol). In 100-ns MD simulations, the full-length protein showed RMSD ∼10–12 Å due to C-terminal disorder, while the TED core (residues ∼14–213) remained stable at 2–3 Å. In silico profiling indicates that chamaejasmin is a beyond-Ro5 (rule of five), high-affinity, long-residence C11orf42 inhibitor (t½ = 118.07 h; logP = 2.87) with acceptable safety but poor solubility (logS = −6.36), limited oral bioavailability (39.98 %), and multiple drug-likeness violations, making formulation/scaffold optimisation the main barrier. Importantly, functional genomics analysis of DepMap CRISPR-Cas9 screening data shows that C11orf42 is not a pan-essential viability gene but displays a context-restricted dependency profile, consistent with a regulatory or modulatory role rather than a core survival function. Collectively, these results prioritise C11orf42 as a computationally inferred, conditionally relevant regulatory candidate for further experimental evaluation in TNBC and provide a hypothesis-generating structural, network, and functional framework for future validation.
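The contrast between the full-length RMSD (∼10–12 Å) and the TED-core RMSD (2–3 Å) comes from restricting the calculation to a subset of residues. The following is a minimal sketch of that idea, not the authors' analysis: it assumes frames are already superposed on the reference, uses one atom per residue, and the coordinates are invented toy values. The function name `rmsd` and the data layout are hypothetical.

```python
import math

def rmsd(ref, frame, residues=None):
    """RMSD (in the units of the input, e.g. Å) between two pre-aligned structures.

    ref, frame: dict mapping residue number -> (x, y, z) for one atom per residue.
    residues:   optional iterable of residue numbers to restrict the calculation to.
    Assumption: no superposition is performed here; frames must already be aligned.
    """
    keys = sorted(ref) if residues is None else [r for r in residues if r in ref]
    if not keys:
        raise ValueError("no residues selected")
    squared = sum(math.dist(ref[r], frame[r]) ** 2 for r in keys)
    return math.sqrt(squared / len(keys))

# Toy coordinates (hypothetical): four "core" residues shift by 1 Å,
# two "tail" residues shift by 20 Å.
ref = {14: (0, 0, 0), 15: (1, 0, 0), 16: (2, 0, 0), 17: (3, 0, 0),
       300: (10, 0, 0), 301: (11, 0, 0)}
frame = {14: (0, 0, 1), 15: (1, 0, 1), 16: (2, 0, 1), 17: (3, 0, 1),
         300: (10, 20, 0), 301: (11, 20, 0)}

core_rmsd = rmsd(ref, frame, residues=range(14, 214))  # core only: 1.0
full_rmsd = rmsd(ref, frame)                           # all residues: about 11.58
```

Even in this toy case, two mobile tail residues dominate the full-length value, which is why a per-domain RMSD is the more informative stability measure for a protein with a disordered terminus.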
Title: In silico characterisation of C11orf42 as a potential therapeutic target in triple-negative breast cancer
Computational Biology and Chemistry, vol. 122, Article 108940
Pub Date: 2026-02-10; DOI: 10.1016/j.compbiolchem.2026.108944
Behnam Aghajan, Mohammad Reza Ghaemi, Ali M. Mosammam, Emran Heshmati, Khosrow Khalifeh
Gene co-expression networks (GCNs) provide a powerful framework for uncovering functional gene modules and biological pathways from complex transcriptomic data. However, constructing reliable GCNs from noisy datasets often yields spurious edges and biologically implausible topologies. To address this challenge, we propose a novel multi-objective optimization approach based on the Non-dominated Sorting Genetic Algorithm II (NSGA-II) to refine edge selection in GCNs. Our pipeline integrates Variance Stabilizing Transformation (VST) for RNA-seq normalization, Spearman rank correlation for robust co-expression estimation, permutation testing to establish an initial significance threshold, and bootstrap resampling to assess edge stability. We applied this framework to two heterogeneous datasets, GSE10245 (microarray, n = 58) and GSE102349 (RNA-seq, n = 113), to simultaneously optimize multiple network properties, including sparsity, modularity, scale-free topology, and edge reproducibility. Comparative analyses against two widely used conventional methods, Weighted Gene Co-expression Network Analysis (WGCNA) and the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE), demonstrate that our approach consistently yields sparser, more modular networks that better conform to biologically expected scale-free architectures across both data types. This adaptive, optimization-driven strategy offers a robust foundation for integrative genomic studies and holds significant potential for advancing biomarker discovery and disease mechanism modeling.
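Two of the pipeline steps named above, Spearman rank correlation and a permutation-derived significance threshold, can be illustrated for a single gene pair. This is a sketch under stated assumptions rather than the authors' implementation: the abstract does not say whether the threshold is set per pair or globally, and the function names, the `alpha` level, and the toy expression vectors are all hypothetical.

```python
import random

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman rho = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_threshold(x, y, n_perm=1000, alpha=0.05, seed=0):
    """|rho| exceeded by roughly a fraction alpha of sample-shuffled pairs."""
    rng = random.Random(seed)
    shuffled = list(y)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the sample pairing between the two genes
        null.append(abs(spearman(x, shuffled)))
    null.sort()
    return null[min(n_perm - 1, int((1 - alpha) * n_perm))]

# Toy expression vectors (hypothetical): gene_b is a monotonic function of gene_a.
gene_a = list(range(1, 21))
gene_b = [v ** 3 for v in gene_a]
rho = spearman(gene_a, gene_b)
threshold = permutation_threshold(gene_a, gene_b, n_perm=200)
keep_edge = abs(rho) > threshold
```

Because Spearman works on ranks, the cubic relationship above still gives a perfect correlation, which is the robustness property the abstract appeals to.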
Title: A novel multi-objective optimization framework using NSGA-II for gene co-expression network inference
Computational Biology and Chemistry, vol. 122, Article 108944
Pub Date: 2026-02-09; DOI: 10.1016/j.compbiolchem.2026.108947
Hui Liu, Yuanting Lai
Background
R-loops are three-stranded nucleic acid structures implicated in genome instability and cancer progression. However, the prognostic significance and mechanistic role of R-loops in uterine corpus endometrial carcinoma (UCEC) remain poorly understood.
Methods
Transcriptomic, clinical, mutational, and spatial data for UCEC were obtained from The Cancer Genome Atlas (TCGA) and public databases. Multiomics analyses, including prognostic modeling, survival analyses, differential expression analyses, copy number variation (CNV) profiling, somatic mutation comparisons, single-cell transcriptomics, spatial transcriptomics, and immune-related pathway exploration, were conducted to elucidate the biological implications of R-loop genes, matrix-specific CSDE1, and the associated SPP1 pathway. In vivo and in vitro functional experiments were conducted to evaluate the role of CSDE1 in UCEC.
Results
Elevated R-loop activity was associated with advanced clinical stage, high tumor grade, and poor survival outcomes in patients with UCEC. A robust prognostic model based on R-loop genes achieved high predictive accuracy across multiple datasets. Low-risk patients had higher tumor mutation burdens and distinct mutational profiles, whereas high-risk patients had more chromosomal instability and more CNV events. CSDE1 emerged as the top predictive gene, displaying fibroblast-specific expression and copy number-driven upregulation. Single-cell and spatial transcriptomics revealed that CSDE1⁺ fibroblasts actively communicated with immune cells via the SPP1 pathway and were spatially enriched in malignant, fibroblast-dense regions. High CSDE1 expression correlated with the activation of oncogenic pathways and the suppression of multiple steps in the cancer–immunity cycle. Furthermore, CSDE1 promoted the proliferation and migration of UCEC cells in vitro and in vivo by reducing R-loop accumulation and DNA damage.
Conclusion
R-loop activity and CSDE1 expression define a clinically relevant molecular program in UCEC that integrates genomic instability, immunosuppression, and stromal remodeling. These findings provide a basis for stratified prognosis and potential therapeutic targeting in endometrial cancer, suggesting that CSDE1 may be a promising new therapeutic target for the treatment of UCEC in the future.
Title: R-loop-driven molecular subtypes reveal prognostic and immunogenomic features in uterine corpus endometrial carcinoma
Computational Biology and Chemistry, vol. 122, Article 108947
Pub Date: 2026-02-09; DOI: 10.1016/j.compbiolchem.2026.108942
Sahadevan Shrinidhi, R. Sagaya Jansi, Ameer Khusro
Transposable elements (TEs) are mobile genomic sequences that can seriously disrupt gene regulation and contribute to tumorigenesis, yet their role in non-small cell lung cancer (NSCLC) remains largely unexplored. We therefore performed an integrated transcriptomic and TE analysis to investigate TE-driven gene dysregulation in NSCLC. Hierarchical clustering of differentially expressed TEs revealed significant over-representation of LTR1A1 and HERVL18-int in the cancer samples, with notably high expression of LINE and ERV members, especially HERVL-int, L1MC5, and L1M5. Intersecting TE expression with differentially expressed genes revealed several TE-associated genes involved in cell cycle regulation, genomic stability, and tumor progression. Fusion transcript analysis highlighted unique cancer-specific events, offering insights into TE-mediated transcriptomic alterations. Molecular docking of the TE-associated proteins HMMR and PBK suggested potential interactions that may influence oncogenic pathways. Collectively, our findings uncover novel TE-driven mechanisms of gene dysregulation in NSCLC and highlight specific TEs and associated genes as potential diagnostic markers and therapeutic targets, offering a framework for future experimental studies to explore their mechanistic and clinical significance.
Title: Unraveling novel transposable elements (TEs)-driven gene dysregulation in non-small cell lung cancer (NSCLC) by integrated transcriptomic and TEs analysis
Computational Biology and Chemistry, vol. 122, Article 108942
Cellular senescence is a complex biological process characterized by several distinctive features, including cell-cycle arrest, macromolecular damage, senescence-associated secretory phenotypes (SASPs), and deregulated metabolism. Understanding these features is essential for assessing the impact of senescence on aging and disease. Extensive studies of the biochemical pathways associated with mammalian aging have identified SIRT-1, Bcl-xL, Hsp-90, MDM-2, AMPK, and mTOR as key regulatory proteins, so preserving the functions of these proteins could potentially decelerate the aging process. A previous study demonstrated that 4,4′-diapophytofluene (4,4′-DPE), a squalene analog extracted from the pentane fraction of Cocos nucifera leaves, was more effective than squalene in suppressing senescence induction in WI38 and HaCaT cells. In the present study, high-throughput virtual screening was performed to evaluate the interaction between 4,4′-DPE and the six aforementioned aging regulators and to further assess its role as a natural senotherapeutic, alongside squalene and several well-known anti-aging compounds (quercetin, curcumin, resveratrol, metformin, and fisetin). In molecular docking studies, 4,4′-DPE showed stronger binding affinity (ΔG) than quercetin, curcumin, resveratrol, and fisetin for SIRT-1, Bcl-xL, Hsp-90, MDM-2, and mTOR, but not for AMPK. MM/PBSA and free-energy landscape (FEL) analyses of 100 ns production molecular dynamics simulations also indicated that 4,4′-DPE maintained thermodynamically stable and favourable interactions with the binding pockets of five proteins, supported by persistent van der Waals and hydrophobic contacts with minimal structural deviations. Furthermore, the ADMET predictions supported 4,4′-DPE as a safe bioactive compound, positioning it as a candidate novel senotherapeutic/anti-aging agent for pharmaceuticals and dermatological products.
Title: Computational exploration of squalene analog 4,4′diapophytofluene as a potential anti-aging phytotherapeutic
Pub Date: 2026-02-08; DOI: 10.1016/j.compbiolchem.2026.108945
Madhurima Dutta, Anjan Hazra, Suparna Mandal Biswas
Computational Biology and Chemistry, vol. 122, Article 108945
This study employed an integrative computational and systems biology framework to define a diagnostic gene signature for hepatocellular carcinoma (HCC) and to explore its potential translational relevance in a hypothesis-generating manner. Differential expression analysis of transcriptomic data from 230 samples identified 2748 significantly differentially expressed genes (DEGs), including 2283 upregulated and 465 downregulated genes, with FGF4 (log2FC = 10.08) and REG1B (log2FC = 10.02) among the top hits. Four machine learning classifiers were trained using this signature and demonstrated consistently high predictive performance, with XGBoost emerging as the top-performing model (accuracy = 0.97, F1-score = 0.96, ROC-AUC = 0.981). Logistic Regression (L1) and Random Forest achieved comparable performance (ROC-AUC = 0.980 and 0.979, respectively), while SVM-linear also showed high robustness (ROC-AUC = 0.978). All models showed good calibration, with low Brier scores (<0.04) and precision consistently exceeding 0.90 across most recall thresholds, indicating strong but not perfect classification performance. SHAP-based explainability analysis was used to rank and prioritise the most influential predictors, refining the biomarker panel to 81 genes that collectively accounted for approximately 50 % of the model’s explanatory contribution, and highlighting key downregulated predictors in HCC, including GDF2, COLEC10, BMP10, LRAT, and DNASE1L3. Protein–protein interaction and functional enrichment analyses revealed five major molecular clusters and provided systems-level insights into dysregulated biological processes associated with HCC. Drug–gene interaction mining mapped 78 target proteins to clinically relevant compounds, including tolrestat, alcuronium, metyrosine, and 4-phenylbutyric acid.
Molecular docking suggested favorable binding propensities for several complexes, including alcuronium–3UON (–8.5 kcal/mol), tolrestat–1ZUA (–8.3 kcal/mol), metyrosine–2XSN (–6.7 kcal/mol), and 4-phenylbutyric acid–2NZ2 (–5.9 kcal/mol). A 100 ns molecular dynamics simulation of the tolrestat–AKR1B10 (1ZUA) complex indicated structural stability, with protein backbone RMSD stabilising at 1.5–3.0 Å, ligand RMSD at 0.6–1.4 Å, and persistent interactions involving Trp22, His110, Glu111, and Phe122. Physicochemical and pharmacokinetic profiling further prioritised tolrestat as a computationally favourable candidate (MW = 357.35, LogP = 3.64, TPSA = 81.86 Ų), exhibiting acceptable drug-likeness, high predicted gastrointestinal absorption, and low synthetic complexity (SA = 2.34), in contrast to alcuronium (MW = 666.89, SA = 7.86), which showed multiple rule violations. Collectively, this in silico study proposes a robust diagnostic gene signature for HCC and identifies tolrestat as a promising repurposing candidate that warrants experimental validation, demonstrating the utility of integrating machine learning, network biology, and molecular simulation
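The calibration and discrimination metrics quoted in this abstract (Brier score and ROC-AUC) have short definitions that are easy to check by hand. The sketch below implements both from their textbook definitions; it is not the authors' evaluation code, and the labels and probabilities are invented toy values.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 label.

    0.0 is perfect; lower is better calibrated and sharper.
    """
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def roc_auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC-AUC).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy predictions (hypothetical): 1 = tumour, 0 = normal.
labels = [1, 0, 1, 0]
probs = [0.9, 0.1, 0.8, 0.2]
toy_brier = brier_score(labels, probs)  # (0.01 + 0.01 + 0.04 + 0.04) / 4 = 0.025
toy_auc = roc_auc(labels, probs)        # every positive outranks every negative: 1.0
```

The two metrics answer different questions: ROC-AUC depends only on the ranking of samples, whereas the Brier score also penalises probabilities that are far from the observed outcome, which is why the abstract reports both.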
Title: Biomarker discovery and drug repurposing in hepatocellular carcinoma through transcriptomics, machine learning, network pharmacology, and molecular dynamics
Pub Date: 2026-02-08; DOI: 10.1016/j.compbiolchem.2026.108937
Mohammed Alfaifi, Hossam Kamli, Najeeb Ullah Khan, Ahsanullah Unar
Computational Biology and Chemistry, vol. 122, Article 108937
Urinary proteins are promising non-invasive biomarkers, but their low abundance and wide dynamic range make detection challenging. This study presents UriPred, a computational tool that integrates machine learning (ML), BLAST, and Motif-EmeRging and Classes-Identification (MERCI) to predict urinary proteins and facilitate the identification of liver cancer (LC) biomarkers. A dataset of 10588 urinary and non-urinary proteins was curated, from which two feature types were generated: 10074 compositional and 20 evolutionary features. Seven feature selection methods were applied to compositional features, and 11 ML algorithms were trained on different feature sets. Evolutionary features achieved the highest training performance (AUC 0.79, accuracy 71.99 %), whereas amino acid composition (AAC) with 20 features achieved identical validation AUC (0.74) and comparable accuracy while being computationally less expensive and consistently selected. The ML-AAC model was therefore chosen as the final model. This optimal model was integrated with BLAST and MERCI to create UriPred, which reduced false positives from 34.59 % (ML) to 3.12 % (hybrid) on the validation dataset and from 5.8 % (ML) to zero (hybrid) on an external dataset. Using UriPred, 53 LC differentially expressed protein-coding genes were predicted as urinary proteins. Protein-protein interaction analysis, AUROC evaluation (AUC > 0.80), survival analysis, and cross-verification of urine detectability with the Human Protein Atlas and Human Urine PeptideAtlas databases identified five proteins (KIF23, COL15A1, CTHRC1, MMP9, and SPP1) as potential LC biomarkers. UriPred efficiently predicts urinary proteins using AAC features and enables biomarker discovery for LC. The tool is publicly available at https://github.com/Dahrii-Paul/UriPred.
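The amino acid composition (AAC) features mentioned above reduce each protein to 20 numbers, one per standard residue. The following is a minimal sketch of that idea, not the UriPred implementation: the function name, the use of percentages rather than fractions, and the toy peptide are all assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids in a fixed order, so every protein
# maps to the same 20-column feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the percentage of each standard residue in `sequence` (20 values).

    Hypothetical helper: non-standard symbols (e.g. X, B, Z) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [100.0 * counts[aa] / total for aa in AMINO_ACIDS]


# Toy peptide (hypothetical, not taken from the study's dataset).
features = amino_acid_composition("MKTAYIAKQR")
```

A vector like `features` could then be passed to any classifier. Its fixed, small size is one plausible reason such a model is cheaper to train than one built on thousands of compositional features.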
{"title":"UriPred: Machine learning prediction of urinary proteins and identification of biomarkers for liver cancer","authors":"Dahrii Paul, Vigneshwar Suriya Prakash Sinnarasan, Rajesh Das, Md Mujibur Rahman Sheikh, Santhosh Manickannan, Amouda Venkatesan","doi":"10.1016/j.compbiolchem.2026.108946","journal":{"name":"Computational Biology and Chemistry","volume":"122","pages":"Article 108946"},"PeriodicalIF":3.1,"publicationDate":"2026-02-08","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date : 2026-02-08 DOI: 10.1016/j.compbiolchem.2026.108939
Shirui Li, Feng Jiang, Xiuyang Li
Background
Irritable Bowel Syndrome (IBS) and Major Depressive Disorder (MDD) exhibit high comorbidity, driven by dysregulation of gut-brain axis interactions. Despite evidence of shared pathophysiology, the core molecular mechanisms and therapeutic targets remain elusive, largely due to clinical heterogeneity and fragmented research approaches.
Methods
We established an integrated framework combining: (1) bidirectional epidemiological analysis using the CHARLS cohort; (2) multi-tissue transcriptomics (intestinal mucosa/prefrontal cortex) from GEO datasets using differential expression analysis, WGCNA, and machine learning (LASSO/RF/SVM-RFE); (3) PPI network reconstruction followed by multi-algorithm topological validation; (4) functional enrichment and immune deconvolution (CIBERSORTx); (5) bidirectional pharmacology (CTD-based compound screening and TCM network pharmacology); (6) molecular docking and short-term molecular dynamics (MD) simulations for binding stability assessment; (7) ADME/Tox profiling.
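Step (3) ranks proteins by their position in the interaction network. The abstract does not name the topological algorithms used, so the sketch below assumes the simplest one, degree centrality, and runs it on a hypothetical toy edge list with placeholder node names (`P1` to `P5`), not on the study's network.

```python
from collections import defaultdict


def degree_ranking(edges):
    """Rank nodes of an undirected network by degree (distinct neighbours)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Highest degree first; ties are broken by node name for reproducibility.
    return sorted(
        ((node, len(nb)) for node, nb in neighbours.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical toy network; node names are placeholders, not real proteins.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
ranking = degree_ranking(toy_edges)
```

In a multi-algorithm setting, one would presumably compute several such rankings (for example degree, betweenness, and closeness) and keep the nodes that score highly in all of them.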
Results
Epidemiological analysis indicated bidirectional IBS-MDD risk (digestive to mental: OR = 1.82, 95% CI 1.65–6.79; mental to digestive: OR = 3.34, 95% CI 1.17–2.82). Integrated transcriptomics identified MPO, LCN2, and GMPPB as core comorbidity genes, validated across cohorts and linked to neutrophil activation, iron dysregulation, and glycosylation defects. Immune profiling revealed tissue-specific dysregulation, with gut-dominated neutrophil/M2 macrophage infiltration in IBS versus brain-enriched CD8⁺ T/NK cells in MDD. Bidirectional pharmacology prioritized bisphenol A/lipopolysaccharide (pathogenic) and resveratrol/quercetin (therapeutic) as high-affinity binders to core targets (ΔG < –7.0 kcal/mol). Short-term MD simulations provided preliminary support for the binding of key therapeutic compounds to the targets GMPPB and MPO, supported by TCM herbs (e.g., Jujubae Fructus).
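The odds ratios above are reported with 95% confidence intervals. The abstract does not say how they were estimated (a cohort analysis of this kind would typically use adjusted logistic regression), so the following is only a minimal sketch of the textbook unadjusted calculation from a 2×2 table, using the Wald interval on the log-odds scale. The function name and the counts are hypothetical and are not taken from CHARLS.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.959963984540054):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (default corresponds to a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
or_, lo, hi = odds_ratio_wald_ci(20, 80, 10, 90)
```

Because the interval is built symmetrically around log(OR), the point estimate always falls inside it and equals the geometric mean of the two bounds.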
Conclusion
Our study analyzes neuro-immune-endocrine crosstalk underlying IBS-MDD comorbidity, nominating MPO/LCN2/GMPPB as diagnostic biomarkers and therapeutic targets. Environmental toxins and natural compounds offer actionable strategies for gut-brain axis modulation.
{"title":"Targeting the MPO/LCN2/GMPPB axis in IBS-depression comorbidity: Integrated multi-omics and bidirectional network pharmacology for precision diagnostics and therapeutics","authors":"Shirui Li, Feng Jiang, Xiuyang Li","doi":"10.1016/j.compbiolchem.2026.108939","journal":{"name":"Computational Biology and Chemistry","volume":"122","pages":"Article 108939"},"PeriodicalIF":3.1,"publicationDate":"2026-02-08","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date : 2026-02-06 DOI: 10.1016/j.compbiolchem.2026.108928
Debaleena Samanta, Malavika Bhattacharya
Background
The gut ecosystem is maintained through immune regulation by the intestinal microbiota, which can contribute to inflammatory diseases such as gastric cancer. Hyaluronic acid, derived from the gut microorganism Streptococcus pyogenes, directly controls the up- and down-regulation of gene sets that help to promote or inhibit gastric cancer.
Methods
The GEO database was used to identify potential hub genes related to hyaluronic acid-mediated gastric cancer. Gene expression analysis and PPI network analysis were performed through EMBL-EBI and the STRING database, respectively, under DAVID software. Gene interactions were studied with the Reactome data source, and gene networks were identified through the GeneMANIA online server. BioVenn was used to produce the Venn diagram, and GSEA was used to generate the heat map. Microbial signal transduction was identified through the MiST website, regulons and transcription factors were analysed through RegPrecise, and the MetaCyc web source was used for biosynthetic pathway analysis. TCGA was used to study cancer genomics and gene interaction pathways. KEGG pathway enrichment was performed with the ShinyGO resource. Kaplan-Meier (KM) survival plots were generated with CIBERSORTx. Genome expression analysis was performed with the GEPIA web portal. Resistome and variant identification and by-products of Streptococcus pyogenes MGAS were examined through the CARD and BV-BRC databases. Ligand-drug analysis and TCGA drug response and survival analysis were performed through the MCULE and GEPIA 3 web resources.
Results
Differential expression analysis identified up-regulated and down-regulated genes related to HMMR. Venn analysis identified 3 co-expressed genes among the HMMR, IL1B, and HAS3 genes. The global cancer heat map of HMMR showed expression intensity values ranging from a high of 0.50204 to a low of −0.58367. The cellular response related to HMMR is responsible for programmed cell death due to inactivation of the Cyclin B (Cdk1) complex mediated by Chk1/Chk2 (Cds1). Streptococcus pyogenes-mediated biological pathways, transcription factors, regulons, and genomic features of the HMMR protein were also identified. KEGG enrichment analysis showed the NF-kB signaling pathway within the hyaluronic acid-mediated network gene set. KM survival analysis was reported through the hazard ratio (HR) and p-value. Drug-target docking analysis of the ligand hyaluronic acid and the drugs 5-fluorouracil and epirubicin, together with TCGA drug survival and response analysis, was performed to inform therapeutic interventions.
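The KM survival analysis mentioned above rests on the Kaplan-Meier product-limit estimator. The sketch below shows only that estimator in plain Python. It does not compute the hazard ratio or p-value reported in the study (those would need, for example, a Cox model and a log-rank test), and the follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up duration for each patient
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths > 0:
            # Patients censored at t still count as at risk at t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy cohort: five patients, two of them censored.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
```

Comparing two such curves, for instance high versus low HMMR expression, is what a KM plot visualises. The HR and p-value then summarise how far apart the curves are.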
{"title":"Pivot gene enrichment analysis of Streptococcus pyogenes specific hyaluronic acid mediated disease prognosis on gastric cancer: Based on bioinformatics study","authors":"Debaleena Samanta, Malavika Bhattacharya","doi":"10.1016/j.compbiolchem.2026.108928","journal":{"name":"Computational Biology and Chemistry","volume":"122","pages":"Article 108928"},"PeriodicalIF":3.1,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","isOpenAccess":false}
Pub Date : 2026-02-04 DOI: 10.1016/j.compbiolchem.2026.108938
Ziyu Zhuang, Jiayi Hu, Hongbo Yu, Yu Xie
Background
Compared to non-triple-negative breast cancer (Non-TNBC), triple-negative breast cancer (TNBC) exhibits significantly poorer prognosis. Previous research has confirmed that the PI3K/AKT pathway is closely associated with prognosis in breast cancer patients. Yet, it remains unclear whether this pathway is implicated in the prognostic differences observed between TNBC and Non-TNBC.
Methods
After downloading raw transcriptomic datasets from the GEO database and removing batch effects, we performed an integrated analysis to delineate how key genes drive the poor prognosis of TNBC. Functional enrichment, machine-learning-based feature selection, immune-cell infiltration profiling, drug-sensitivity screening, single-cell RNA sequencing and spatial transcriptomics were successively applied. Molecular-docking simulations were finally conducted to evaluate the binding affinity of MYB toward bioactive compounds derived from the Taohong Siwu Decoction.
Results
Across 113 algorithm combinations, MYB played the most critical role in distinguishing TNBC from Non-TNBC. The constructed prognostic model confirmed a significant association between MYB expression and patient outcomes. Immune cell infiltration, drug sensitivity, single-cell, and spatial transcriptomic analyses revealed specific mechanisms through which MYB influences patient prognosis. Molecular docking indicated strong binding between key components of Taohong Siwu Decoction and MYB.
Conclusion
Based on multi-omics analysis, our findings indicate that the PI3K/AKT pathway is a key factor contributing to the significant prognostic disparity between TNBC and Non-TNBC. Within this pathway, the MYB gene emerges as a potential therapeutic target. This discovery provides a potential basis for future research exploring MYB as a therapeutic target for TNBC patients.
{"title":"MYB: A potential therapeutic target in triple-negative breast cancer based on the PI3K/AKT signaling pathway","authors":"Ziyu Zhuang, Jiayi Hu, Hongbo Yu, Yu Xie","doi":"10.1016/j.compbiolchem.2026.108938","journal":{"name":"Computational Biology and Chemistry","volume":"122","pages":"Article 108938"},"PeriodicalIF":3.1,"publicationDate":"2026-02-04","publicationTypes":"Journal Article","isOpenAccess":false}