Cancer Informatics: Latest Articles

THBS3 Functions as a Novel Biomarker for Prognosis and Immunotherapeutic Response in Colorectal Cancer: An Integrative Analysis and Validation of the Thrombospondin Gene Family.

IF 2.5 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2026-01-27 eCollection Date: 2026-01-01 DOI: 10.1177/11769351251412614

Tao Jiang, Sichao Zhu, Hengyi Zhou, Ningning Zhang, Long Zhang, Changwen Zou, Hu Song

Background: The THBS gene family plays key roles in various diseases; however, its specific functions in colorectal cancer (CRC) have not been systematically characterized.

Methods: Multi-omics data and online databases were used to analyze the mRNA expression levels of the THBS gene family in CRC and their correlations with clinicopathological features and survival. This analysis identified THBS3 as a potential oncogene closely linked with CRC progression. The relationship between THBS3 expression and the immune landscape was then assessed. Single-cell RNA sequencing data were used to examine THBS3 distribution in CRC subtypes. Additionally, GO, KEGG, and GSEA enrichment analyses were used to investigate the mechanisms of THBS3 in CRC. Molecular docking was used to identify anticancer compounds with high affinity for THBS3. Lastly, in vitro experiments examined THBS3's function in CRC.

Results: THBS3 was significantly upregulated in CRC and correlated with poor prognosis. Elevated THBS3 correlated with increased infiltration of M2 macrophages and regulatory T cells (Treg cells), as well as higher expression of immune checkpoint molecules, suggesting a role in shaping an immunosuppressive microenvironment. THBS3 promoted CRC cell proliferation and metastasis through activation of the PI3K-AKT and EMT pathways.

Conclusion: THBS3 facilitates the progression of CRC and may serve as a novel prognostic biomarker and therapeutic target.
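The GO and KEGG enrichment step mentioned in the Methods is commonly implemented as an over-representation test. The sketch below is a minimal illustration of that general idea using a hypergeometric tail probability; it is not the authors' pipeline, and the gene counts in the example are hypothetical values chosen only for demonstration.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, pathway: int, selected: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe: number of background genes
    pathway:  number of background genes annotated to the term
    selected: number of genes in the list being tested
    overlap:  number of selected genes that are annotated to the term
    """
    upper = min(pathway, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20000 background genes, 350 in the term,
# 200 genes co-expressed with the gene of interest, 15 of them in the term.
p_value = hypergeom_enrichment_p(20000, 350, 200, 15)
print(f"over-representation p-value: {p_value:.3g}")
```

In practice such p-values are computed for many terms at once and need multiple-testing correction, which this sketch omits.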



Citations: 0

Prospective Breast Cancer Biomarkers Identified Using miR-526b-Driven Metabolic Alterations.

IF 2.5 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2026-01-18 eCollection Date: 2026-01-01 DOI: 10.1177/11769351251408670

Braydon Nault, Mousumi Majumder

Objectives: Breast cancer is a heterogeneous disease driven by dysregulated cellular processes, including altered metabolic pathways. The oncogenic microRNA miR-526b influences several cancer hallmark phenotypes and holds promise as a plasma biomarker. Given miR-526b's role in metabolic regulation, we selected LDHA, PDHA1, ATP5A1, and TIGAR as metabolic markers that may help to identify additional biomarkers for breast cancer detection.

Methods: We analyzed mRNA expression of these 4 metabolic markers in breast cancer tissue biopsies and plasma samples from patients and disease-free controls, using publicly available datasets and RT-qPCR validation. Diagnostic performance was evaluated using univariate and multivariate logistic regression and LASSO regression modeling. The potential of combining ATP5A1 with pri-miR-526b expression to improve plasma biomarker accuracy was also assessed.

Results: Individually, none of the metabolic markers demonstrated sufficient sensitivity or specificity as plasma biomarkers. However, combining markers via logistic and LASSO regression improved classification performance. ATP5A1 showed strong biomarker potential in biopsy tissue samples but limited utility in blood plasma. The combination of ATP5A1 with pri-miR-526b significantly enhanced plasma-based diagnostic accuracy, highlighting the value of integrated biomarker panels.

Conclusions: Our study validates the potential of miR-526b-regulated metabolic genes as complementary breast cancer biomarkers. While ATP5A1 shows promise in tissue, plasma-based screening benefits from combining multiple markers, including pri-miR-526b. Further research is needed to refine plasma biomarker panels for effective early detection of breast cancer.
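The comparison of single markers against a combined panel can be illustrated with a small ROC AUC calculation. The sketch below is an assumption-laden toy: the expression values are invented, and the combined score is a plain sum of z-scored markers standing in for the fitted logistic or LASSO models described in the Methods.

```python
from statistics import mean, pstdev


def roc_auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))


def zscore(values):
    """Standardize values to mean 0 and unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical plasma measurements: 1 = patient, 0 = disease-free control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
marker_a = [2.0, 1.0, 3.0, 0.5, 1.5, 0.2, 0.8, 2.5]
marker_b = [0.5, 3.0, 1.0, 2.5, 0.4, 1.2, 2.0, 0.3]

# Fixed equal-weight combination (a stand-in for a fitted regression model).
combined = [a + b for a, b in zip(zscore(marker_a), zscore(marker_b))]

for name, scores in [("marker_a", marker_a), ("marker_b", marker_b), ("combined", combined)]:
    print(name, roc_auc(scores, labels))
```

On this toy data the combined score separates the two groups better than either marker alone, which mirrors the qualitative pattern reported in the Results but says nothing about the real effect sizes.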



Citations: 0

MGDB: A Novel Bioinformatics Quality Control Tool for Clinical Next-Generation Sequencing.

IF 2.5 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2026-01-03 eCollection Date: 2026-01-01 DOI: 10.1177/11769351251411074

Hadrien T Gayap, Philippe-Pierre Robichaud, Nicolas Crapoulet, Eric P Allain

Background and objectives: Next-generation sequencing (NGS) is transforming clinical diagnostics by enabling the detection of genetic variation with unprecedented precision. However, successful implementation of NGS workflows necessitates stringent quality control. This study introduces Molecular Genetics Dashboard (MGDB), a novel bioinformatics tool designed to enhance quality control in clinical NGS workflows.

Methods: Using the Python Dash framework for visualizations together with MySQL databases, we developed a novel tool for variant-level monitoring of clinical NGS sequencing runs. MGDB uses Docker Compose containerization for improved portability and can flexibly include or exclude samples from accumulated statistics, with notes from interpreters.

Results: MGDB facilitates variant-level run-to-run monitoring, ensuring the consistency of variant detection across sequencing cycles. The tool provides an interactive platform for visualizing and assessing variant data, identifying potential inconsistencies or outliers and improving data management and interpretation compared to traditional methods. MGDB was tested using samples sequenced with Oncomine Focus/Comprehensive Plus assays on S5 sequencers and analyzed via IonReporter software.

Conclusions: MGDB offers a robust and user-friendly solution for enhancing quality control in clinical NGS workflows, contributing to greater accuracy and reliability in variant detection. The tool is freely available on GitHub: https://github.com/acri-nb/GeneticVariantsDB.
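To make the idea of variant-level run-to-run monitoring concrete, the sketch below flags sequencing runs in which the allele frequency of a control variant drifts away from its historical values. It is purely illustrative and does not reflect MGDB's actual code: the function name, the run identifiers, the allele frequencies, and the median/MAD rule with a 3.5 cutoff are all assumptions.

```python
from statistics import median


def flag_outlier_runs(vaf_by_run, threshold=3.5):
    """Return run IDs whose variant allele frequency (VAF) is an outlier.

    Uses the modified z-score 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation across runs.
    """
    values = list(vaf_by_run.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All runs agree except possibly a few: flag anything off the median.
        return [run for run, v in vaf_by_run.items() if v != med]
    return [
        run
        for run, v in vaf_by_run.items()
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical VAFs of one control variant across six runs.
control_variant_vaf = {
    "run01": 0.48,
    "run02": 0.51,
    "run03": 0.50,
    "run04": 0.49,
    "run05": 0.31,
    "run06": 0.52,
}
print(flag_outlier_runs(control_variant_vaf))
```

A median-based rule is used here because a single failed run would inflate a mean and standard deviation; a dashboard could apply the same check per variant and surface flagged runs for review.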



Citations: 0

Clustering Analysis of Multiple Omics Data Types Identifies Cancer Patients With Consistent Survival Outcomes.

IF 2.5 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2025-12-23 eCollection Date: 2025-01-01 DOI: 10.1177/11769351251394107

Shuting Lin, Peng Qiu

Objectives: Cancer stratification is essential for accurate prognosis and personalized treatment selection. While many existing approaches integrate multiple omics data types to identify cancer subtypes, it remains unclear how clustering results from individual omics layers compare in their ability to capture survival-related patient clusters. This study aims to examine patient clusters separately defined by different omics data types and to explore the consistency of these clusters as well as their associations with survival outcomes.

Methods: In this study, we conducted clustering analysis on miRNA expression, gene expression, and DNA methylation data across 20 cancer types in TCGA. We employed a standard clustering pipeline similar to the widely used Seurat clustering pipeline in scRNA-seq analysis. We performed survival analysis to assess whether the resulting patient clusters exhibit significantly different survival outcomes.

Results: We observed significant survival differences among patient clusters in 11 cancer types. Notably, in 6 of these 11 cancer types, the survival differences among patient clusters were significant in multiple omics data types. For each of these 6 cancer types, we compared the consistency of patient clusters across different omics data types. Interestingly, in each cancer type, we noticed one set of patients who consistently clustered together irrespective of the omics data type, and these patients exhibited either the most favorable or the most unfavorable survival outcomes. This observation suggests that patients with the most extreme survival outcomes show distinct molecular patterns across multiple omics layers and can be captured by clustering analysis in multiple omics data types. To interpret these findings, we identified differentially expressed molecular features. Using established miRNA-target relationships, gene-gene interactions, and gene-CpG relationships, we constructed networks specific to each cancer type based on the differentially expressed features. These networks revealed several molecular modules associated with patient survival outcomes, such as the miR-200c-3p/ZEB2 axis in bladder cancer, the regulatory role of miR-98 in breast cancer, and the association of miR-21 with its target gene APC in kidney renal cell carcinoma.

Conclusion: These findings suggest that omics-specific clustering can identify robust survival-related patient clusters and uncover molecular features that may contribute to differential survival outcomes.
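Comparing survival between patient clusters usually starts from a Kaplan-Meier estimate per cluster. The following is a minimal sketch of that estimator under the assumption of right-censored data; the follow-up times and event flags are hypothetical, and a real analysis would add a formal test such as the log-rank test.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient
    events: 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        censored = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            censored += 1 - data[i][1]
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= deaths + censored
    return curve


# Hypothetical follow-up (months) for two patient clusters.
cluster_a = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
cluster_b = kaplan_meier([6, 9, 12, 12, 15], [0, 1, 0, 0, 1])
print("cluster A:", cluster_a)
print("cluster B:", cluster_b)
```

Censored patients leave the risk set without lowering the curve, which is why the estimator only steps down at death times.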



Citations: 0

Robust Cancer Biomarker Identification From Matched Transcriptomic Data Via Bootstrapped Regularized Conditional Logistic Regression.

IF 2.5 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2025-12-16 eCollection Date: 2025-01-01 DOI: 10.1177/11769351251404255

Jie-Huei Wang, Zih-Han Wu, Hui-Chen Lu, Tzung-Ying Guo

Objectives: With the increasing application of high-throughput transcriptomic data in cancer research, identifying reliable cancer biomarkers in high-dimensional settings remains a major challenge. This study aims to systematically evaluate various regularized conditional logistic regression (CLR) methods under a matched case-control (MCC) design, focusing on their performance in variable selection, parameter estimation, and predictive accuracy. Special emphasis is placed on the importance of the matching design in reducing confounding effects and improving model interpretability.

Methods: We utilize RNA-seq data from The Cancer Genome Atlas (TCGA), specifically datasets for liver, thyroid, and lung cancers, which include paired tumor and adjacent normal tissue samples. We apply 4 regularized CLR methods implemented in the R packages "clogitL1," "pclogit," "clogitLasso," and "penalizedclr" to analyze over 20 000 gene expression features. We compare these methods on gene selection stability, predictive accuracy, and interpretability. Additionally, we employ a bootstrap resampling framework to estimate gene selection probabilities, which serve as a measure of gene importance.

Results: Our results show that incorporating the MCC design significantly enhances feature selection performance by mitigating confounding noise. The regularized CLR models successfully identify several well-established cancer-related genes with high selection consistency and statistical significance. In contrast, models that ignore the matched design tend to miss critical biomarkers or produce excessive false positives, leading to potentially misleading interpretations.

Conclusions: This study highlights the value of integrating a matched case-control design with regularized CLR methods for the analysis of high-dimensional transcriptomic data. The proposed analytical framework offers improved accuracy, robustness, and biological relevance, providing a practical and scalable approach for cancer genomics research. It also supports the advancement of precision medicine and translational applications.
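The bootstrap selection probabilities described in the Methods can be sketched as follows. Matched pairs are resampled as whole units so that the pairing is preserved, and each gene's selection frequency across resamples is recorded. Note the loud assumption: the selector here is a simple paired t-like ranking used as a stand-in for the regularized CLR fits, and the data are simulated.

```python
import random
from statistics import mean, stdev


def select_genes(pair_diffs, top_k):
    """Stand-in selector (NOT penalized CLR): keep the top_k genes ranked by
    |mean paired difference| / standard deviation."""
    n_genes = len(pair_diffs[0])
    stats = []
    for g in range(n_genes):
        d = [pair[g] for pair in pair_diffs]
        sd = stdev(d)
        stats.append(abs(mean(d)) / sd if sd > 0 else 0.0)
    ranked = sorted(range(n_genes), key=lambda g: stats[g], reverse=True)
    return set(ranked[:top_k])


def bootstrap_selection_prob(pair_diffs, top_k=2, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each gene is selected.

    pair_diffs[i][g] is tumor minus matched-normal expression of gene g
    for pair i; whole pairs are resampled with replacement.
    """
    rng = random.Random(seed)
    n_pairs, n_genes = len(pair_diffs), len(pair_diffs[0])
    counts = [0] * n_genes
    for _ in range(n_boot):
        sample = [pair_diffs[rng.randrange(n_pairs)] for _ in range(n_pairs)]
        for g in select_genes(sample, top_k):
            counts[g] += 1
    return [c / n_boot for c in counts]


# Simulated data: 30 matched pairs, 6 genes, only gene 0 carries a true shift.
data_rng = random.Random(42)
pair_diffs = [
    [data_rng.gauss(2.0 if g == 0 else 0.0, 1.0) for g in range(6)]
    for _ in range(30)
]
probs = bootstrap_selection_prob(pair_diffs)
print([round(p, 2) for p in probs])
```

Genes selected in nearly every resample are stable candidates, while genes that appear only occasionally are more likely to reflect sampling noise.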



Citations: 0

Integrative Analysis of eQTL Genes Reveals Key Biomarkers and Mechanisms for Early Diagnosis of Pancreatic Ductal Adenocarcinoma.

IF 2.5 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2025-12-16 eCollection Date: 2025-01-01 DOI: 10.1177/11769351251400465

Xuebo Wang, Xusheng Zhang, Shicai Liang, Jialong Wang, Yannan Xie, Jiawei Wang, Bendong Chen

Background: Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal malignancy with a dismal 5-year survival rate, largely due to the absence of reliable biomarkers for early detection. The molecular mechanisms underpinning PDAC pathogenesis remain incompletely understood, highlighting the urgent need for novel diagnostic strategies.

Objective: This study aimed to integrate eQTL-driven Mendelian randomization (MR) with transcriptomic and genome-wide association data to identify causal PDAC-associated genes and construct a diagnostic nomogram based on 5 hub genes (CTSC, SMYD3, MFGE8, IGFBP7, POC1B) for early detection of PDAC.

Methods: Transcriptomic data from GSE62165 and GSE25471 were retrieved from the Gene Expression Omnibus (GEO) and processed for differential expression using LIMMA and GEO2R, followed by batch correction and weighted gene co-expression network analysis (WGCNA). Summary-level eQTL statistics were obtained from OpenGWAS, and GWAS data included over 5000 PDAC cases. MR analysis was performed using inverse variance weighted (IVW) as the primary approach, supplemented with MR-Egger, weighted median, weighted mode, and MR-PRESSO. Instrument strength, pleiotropy, and heterogeneity were assessed via F-statistics, Egger intercept, and Cochran's Q test. Candidate genes were filtered using a consensus approach combining random forest (RF), support vector machine-recursive feature elimination (SVM-RFE), and Lasso regression. Diagnostic performance was evaluated via ROC curves, C-index, calibration plots, and decision curve analysis. Mechanistic insights were derived from KEGG and GO enrichment analyses, as well as protein-protein interaction (PPI) network analyses.

Results: Five eQTL-associated hub genes (CTSC, SMYD3, MFGE8, IGFBP7, and POC1B) were identified as causally linked to PDAC via robust MR analysis with minimal evidence of pleiotropy or heterogeneity. These genes demonstrated high diagnostic potential (AUC > 0.85, P < .001). A diagnostic nomogram incorporating these genes achieved strong predictive performance (C-index = 0.92) with favorable clinical decision curve results. Functional enrichment and PPI analyses implicated these genes, particularly CTSC, in modulating the ITGAV/ITGB3-PI3K-Akt signaling axis, contributing to PDAC cell cycle regulation and apoptosis resistance.

Conclusions: This study presents a multi-omics, MR-informed framework for identifying eQTL-regulated biomarkers of PDAC. The identified hub genes offer promising avenues for early detection, while the mechanistic mapping of the PI3K-Akt pathway provides translational insights. These findings warrant further validation in clinical and experimental settings and hold potential to reshape PDAC diagnostic strategies.

Pancreatic ductal adenocarcinoma (PDAC) remains a formidable clinical challenge because of its aggressive nature and the lack of effective biomarkers for early diagnosis. To address this, we used Mendelian randomization (MR) to integrate transcriptomic data, genome-wide association studies (GWAS), and expression quantitative trait locus (eQTL) information in order to identify genes causally associated with PDAC risk. Differentially expressed genes were identified in two GEO datasets (GSE62165, GSE25471) and prioritized using weighted gene co-expression network analysis (WGCNA). MR analysis using IVW, MR-Egger, weighted median, and MR-PRESSO identified 5 hub genes (CTSC, SMYD3, MFGE8, IGFBP7, and POC1B) as significant causal contributors to PDAC. These genes were incorporated into a diagnostic model built with machine learning methods (random forest, SVM-RFE, Lasso), which showed strong classification performance (AUC > 0.85) and good calibration (C-index = 0.92). Functional enrichment and protein interaction analyses indicated that CTSC regulates the ECM-integrin-PI3K-Akt signaling pathway and is involved in tumor cell proliferation and survival. The findings establish a multi-omics-based biomarker panel with strong diagnostic utility and mechanistic relevance, providing a potential framework for future translational validation in clinical cohorts.
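The inverse variance weighted (IVW) estimate and the instrument F-statistics named in the Methods can be written down compactly. The sketch below shows the standard fixed-effect IVW formula from summary statistics and the common (beta/SE)^2 approximation for instrument strength; the SNP effect sizes are hypothetical, and the sketch omits the sensitivity analyses (MR-Egger, weighted median, MR-PRESSO) the study also used.

```python
from math import sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error.

    Equivalent to a weighted regression of SNP-outcome effects on
    SNP-exposure effects through the origin, with weights 1 / se_outcome^2.
    """
    weights = [1 / se ** 2 for se in se_outcome]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    den = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    return num / den, sqrt(1 / den)


def f_statistics(beta_exposure, se_exposure):
    """Approximate per-instrument F-statistic, (beta / se)^2."""
    return [(b / s) ** 2 for b, s in zip(beta_exposure, se_exposure)]


# Hypothetical summary statistics for three eQTL instruments of one gene.
beta_x = [0.10, 0.20, 0.15]    # SNP effect on gene expression
se_x = [0.01, 0.02, 0.03]
beta_y = [0.05, 0.10, 0.075]   # SNP effect on disease risk (log odds)
se_y = [0.02, 0.03, 0.025]

estimate, std_err = ivw_estimate(beta_x, beta_y, se_y)
print(f"IVW estimate: {estimate:.3f} (SE {std_err:.3f})")
print("F-statistics:", [round(f, 1) for f in f_statistics(beta_x, se_x)])
```

A widely used rule of thumb treats instruments with F below about 10 as weak; whether the study applied exactly that cutoff is not stated in the abstract.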

Cancer Informatics, volume 24, article 11769351251400465. Published 2025-12-16. DOI: 10.1177/11769351251400465. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12709030/pdf/

Citations: 0

Prediction and Feature Selection of Mastectomy-Related Post Traumatic Stress Disorder (PTSD) Using Machine Learning Among Breast Cancer Patients in Bangladesh.

IF 2.5 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2025-12-12 eCollection Date: 2025-01-01 DOI: 10.1177/11769351251401330

Syed Billal Hossain, Md Mizanoor Rahman, Kapashia Binte Giash, Md Hazrat Ali, Mst Asma Akter, A B M Alauddin Chowdhury

Background: Post-mastectomy PTSD is a serious mental health issue, but it has not been studied enough, particularly in low-resource settings like Bangladesh. This study aimed to predict PTSD among breast cancer survivors using machine learning (ML) models and identify significant predictors through the Boruta algorithm, a feature selection tool, offering scalable solutions for early detection and intervention.

Methods: A cross-sectional study of 138 post-mastectomy breast cancer patients was conducted across 3 hospitals in Bangladesh. Data on sociodemographic characteristics, health history, social experience, and treatment were collected using validated tools, including the PTSD Checklist for DSM-5 (PCL-5). The Boruta algorithm identified key predictors, and 10 ML models were evaluated for PTSD prediction using metrics such as accuracy, sensitivity, specificity, and AUC.
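The evaluation metrics named in the Methods can be made concrete with a small sketch. The code below is an illustration only: the labels, scores, and the 0.5 decision threshold are invented toy values, not data or settings from the study, and the function names are hypothetical. It computes accuracy, sensitivity, and specificity from a confusion matrix, and AUC as the fraction of positive/negative pairs in which the positive case receives the higher score (ties counted as one half).

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded 1 = PTSD, 0 = no PTSD."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration (not taken from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason both kinds of metric are usually reported together.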

Results: Random Forest (RF) outperformed other models (accuracy: 88.9%, AUC: 0.914). Significant predictors included education, monthly income, and changes in family behaviour. Factors like marital status, having chronic diseases, and hormone therapy were not statistically significant. PTSD prevalence was 34.1%, with urban residents and younger patients facing higher risks.

Conclusion: ML models, particularly RF, demonstrated strong predictive performance and identified critical PTSD predictors. These findings highlight the potential for cost-effective PTSD screening in resource-constrained settings. Future research should focus on broader validation and longitudinal studies to refine predictive models.

Volume 24, article 11769351251401330. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12701936/pdf/

Citations: 0

Pan-Cancer Analysis of the Prognostic and Immunological Role of ECT2: A Promising Target for Survival and Immunotherapy.

IF 2.5 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2025-11-29 eCollection Date: 2025-01-01 DOI: 10.1177/11769351251396242

Lulu Wang, Hua Jin, Xiaowei Liu, Hanzhi Zhang

Objectives: The aim of this study is to investigate the role of epithelial cell transforming sequence 2 (ECT2) as a pan-cancer biomarker and to assess its potential as an immune-related target for cancer immunotherapy.

Methods: We conducted a comprehensive analysis of ECT2 expression across 44 tumor types using large-scale transcriptomic datasets from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) project. Pan-cancer Cox regression analyses were performed to evaluate the correlation between ECT2 expression and patient survival outcomes. Functional assays, including ECT2 knockdown via shRNA in the HepG2 hepatocellular carcinoma (HCC) cell line, were employed to investigate its mechanistic role. Transcriptomic profiling and pathway analyses were also conducted to explore the impact of ECT2 on cell proliferation and the tumor immune microenvironment.
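To illustrate what a Cox regression of survival on a single expression variable estimates, the sketch below fits a univariate Cox model by Newton-Raphson on the partial likelihood. It is a minimal teaching example under stated assumptions: the survival times, event indicators, and the binary high/low expression coding are invented toy values, the function name is hypothetical, and the abstract does not say which software or covariate coding the authors used.

```python
import math
from typing import Sequence


def fit_univariate_cox(
    times: Sequence[float],
    events: Sequence[int],
    x: Sequence[float],
    max_iter: int = 50,
    tol: float = 1e-9,
) -> float:
    """Estimate the log hazard ratio for one covariate.

    Maximizes the Cox partial likelihood with Newton-Raphson steps.
    Defining the risk set as all subjects with time >= the event time
    corresponds to the Breslow treatment of tied event times.
    """
    beta = 0.0
    n = len(times)
    for _ in range(max_iter):
        gradient = 0.0
        information = 0.0
        for i in range(n):
            if not events[i]:
                continue  # censored subjects contribute only through risk sets
            # Risk set: everyone still under observation at this event time.
            risk = [(math.exp(beta * x[j]), x[j]) for j in range(n) if times[j] >= times[i]]
            s0 = sum(w for w, _ in risk)
            s1 = sum(w * xj for w, xj in risk)
            s2 = sum(w * xj * xj for w, xj in risk)
            mean_x = s1 / s0
            gradient += x[i] - mean_x               # score contribution
            information += s2 / s0 - mean_x ** 2    # weighted variance of x in the risk set
        if information == 0.0:
            break  # covariate has no variation within any risk set
        step = gradient / information
        beta += step
        if abs(step) < tol:
            break
    return beta


# Toy cohort, invented for illustration: x = 1 marks "high expression".
times = [5, 8, 12, 14, 20, 22, 30, 35, 40, 50]
events = [1, 1, 1, 0, 1, 1, 1, 0, 1, 0]  # 1 = death observed, 0 = censored
x = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

beta = fit_univariate_cox(times, events, x)
hazard_ratio = math.exp(beta)
print(f"log HR = {beta:.3f}, HR = {hazard_ratio:.2f}")
```

A hazard ratio above 1 means the high-expression group has a higher instantaneous risk of the event, which is the sense in which elevated expression is "associated with worse survival". In practice a maintained survival-analysis package would be used instead of hand-written optimization.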

Results: ECT2 was found to be significantly upregulated in 31 tumor types. Elevated ECT2 expression was consistently associated with worse overall survival (OS), disease-specific survival (DSS), disease-free interval (DFI), and progression-free interval (PFI) across multiple cancer subtypes. Functional assays revealed that ECT2 knockdown significantly reduced HepG2 cell viability and impaired cell cycle progression, with downregulation of Cyclin D1. Transcriptomic analysis of ECT2-depleted cells indicated enriched gene sets related to cell proliferation and mitotic regulation. Additionally, ECT2 expression was significantly correlated with immune features, including immune cell infiltration, immune checkpoint gene expression, tumor mutational burden (TMB), and microsatellite instability (MSI).

Conclusion: ECT2 is identified as a potential pan-cancer prognostic biomarker with dual roles in tumor initiation and progression, as well as in modulating the tumor immune microenvironment. Our findings suggest that ECT2 may serve as a promising therapeutic target in cancer immunotherapy, warranting further investigation into its immune-regulatory and oncogenic functions.

Volume 24, article 11769351251396242. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12665020/pdf/

Citations: 0

An Integrated Analysis of HAVCR1 with a Focus on Immunological and Prognostic Roles in Breast Cancer.

IF 2.5 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2025-11-28 eCollection Date: 2025-01-01 DOI: 10.1177/11769351251393148

Wen Sun, Weiya Zhang, Jianyi Zhao, Mingyi Sang, Qixuan Feng, Wenbin Zhou, Yue Sun

Background: Breast cancer remains a predominant malignancy and a leading cause of oncologic mortality among women globally. The discovery of novel biomarkers is crucial for improving therapeutic outcomes.

Methods: We conducted a comprehensive analysis of the immunological and prognostic significance of hepatitis A virus cellular receptor 1 (HAVCR1) in breast cancer using publicly available datasets.

Results: HAVCR1 expression was markedly downregulated in breast cancer tissues. Significantly, lower expression levels of HAVCR1 in pre-treatment tumor samples were associated with poorer prognosis among pan-cancer patients undergoing immunotherapy, and a higher incidence of metastasis was observed in the breast cancer subgroup. Subtype-specific DEG analyses further indicated that distinct patterns of immune infiltration may underlie this association. Moreover, gene set enrichment analysis (GSEA) highlighted the immunological relevance of HAVCR1, particularly its involvement in T cell activation within the TNBC subtype. Clinically, elevated levels of HAVCR1 expression in pre-treatment T cells were indicative of a more favorable response to PD-1 blockade therapy compared to those with diminished expression.

Conclusion: The expression of HAVCR1 exhibits a strong correlation with immune infiltration and holds potential as a prognostic biomarker for breast cancer, offering predictive insight into the efficacy of immunotherapeutic interventions.

Volume 24, article 11769351251393148. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12663051/pdf/

Citations: 0

Unsupervised Random Forest Identifies Important Genetic Prognostic Factors for Breast Cancer Survival Time.

IF 2.5 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2025-11-28 eCollection Date: 2025-01-01 DOI: 10.1177/11769351251393146

Benjamin Goldberg, Eric Nels Pederson, Zhengqing Ouyang

Objective: Breast cancer is one of the most prominent and deadly diseases in the world, and its prognosis varies widely based on the expression of certain genes. Identification of these genes is important for developing and interpreting clinical prognostic tests as well as furthering our understanding of breast cancer biology. We expand on prior efforts in the field toward identifying prognostic genes, by integrating powerful statistical methods.

Methods: To this end, we use an unsupervised random forest model, which allows for robust learning of non-linear gene expression/survival relationships and the ability to identify the most important genes affecting both positive and negative breast cancer prognosis. In total, 1,518 participants were considered from the METABRIC dataset, using 20,387 mRNA expression level variables and 23 clinical variables including HER2 mutation status. The top 250 & bottom 250 expressing genes and 6 clinical features were selected for the unsupervised random forest model.
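The abstract does not specify which unsupervised random forest formulation was used. One common formulation, usually attributed to Breiman and Cutler, trains a forest to tell the observed samples apart from a synthetic contrast set in which each feature has been shuffled independently; the forest's proximities and variable importances then describe the dependence structure of the real data. The sketch below shows only that data-preparation step, as an assumption about the general technique rather than the authors' pipeline. The function names and the tiny expression matrix are hypothetical.

```python
import random
from typing import Sequence


def make_synthetic_contrast(real_rows: Sequence[Sequence[float]], seed: int = 0) -> list[list[float]]:
    """Shuffle every column independently.

    Each feature keeps its marginal distribution, but the dependence
    between features (e.g. co-expressed genes) is destroyed.
    """
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*real_rows)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]


def build_training_set(
    real_rows: Sequence[Sequence[float]], seed: int = 0
) -> tuple[list[list[float]], list[int]]:
    """Stack real rows (label 1) on top of synthetic rows (label 0)."""
    synthetic = make_synthetic_contrast(real_rows, seed)
    features = [list(row) for row in real_rows] + synthetic
    labels = [1] * len(real_rows) + [0] * len(synthetic)
    return features, labels


# Toy expression matrix, invented for illustration: 4 samples x 3 genes.
expression = [
    [2.1, 0.3, 5.0],
    [1.8, 0.4, 4.7],
    [0.2, 3.9, 1.1],
    [0.1, 4.2, 0.9],
]

features, labels = build_training_set(expression, seed=42)
print(len(features), labels)
```

The resulting `features` and `labels` would then be passed to any ordinary random forest classifier. Features that help separate real from shuffled samples are those involved in genuine joint structure, which is what makes the approach usable without outcome labels.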

Results: Our research corroborates previous discoveries of 27 important prognostic genes while also identifying 3 genes as potentially novel prognostic factors. Based on gene ontology analysis, we additionally show that these genes have plausible connections to breast cancer biology that should be experimentally investigated.

Conclusions: Here, we demonstrate the utility of the unsupervised random forest model over K-means clustering for identifying important genes in breast cancer.

Volume 24, article 11769351251393146. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12663042/pdf/

Citations: 0