Cancer Informatics: Latest Articles (Page 4)

Cathepsin L in Lung Adenocarcinoma: Prognostic Significance and Immunotherapy Response Through a Multi Omics Perspective.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2024-12-16 eCollection Date: 2024-01-01 DOI: 10.1177/11769351241307492

Jianming Lu, Jiaqi Liang, Gang Xiao, Zitao He, Guifang Yu, Le Zhang, Chao Cai, Gao Yi, Jianjiang Xie

Objectives: Lung adenocarcinoma (LUAD), a predominant form of lung cancer, is characterized by a high rate of metastasis and recurrence, leading to a poor prognosis for LUAD patients. This study aimed to identify and rigorously validate a highly precise biomarker, Cathepsin L (CTSL), for the prognostic prediction of lung adenocarcinoma.

Methods: We employed a multicenter and omics-based approach, analyzing RNA sequencing data and mutation information from public databases such as The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO). The DepMap portal with Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR/Cas9) technology was used to assess the functional impact of CTSL. Immunohistochemistry (IHC) was conducted on a local cohort to validate the prognostic significance of CTSL at the protein expression level.

Results: Our findings revealed a significant correlation between elevated CTSL expression and advanced disease stage in LUAD patients. Kaplan-Meier survival analysis and Cox regression modeling revealed that high CTSL expression is associated with poor overall survival. The in vitro studies corroborated these findings, revealing notable suppression of tumor proliferation following CTSL knockout in cell lines, particularly in LUAD. Functional enrichment revealed that CTSL activated pathways associated with tumor progression, such as angiogenesis and Transforming growth factor beta (TGF-beta) signaling, and inhibited pathways such as apoptosis and DNA repair. Mutation analysis revealed distinct variations in the CTSL expression groups.
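The Kaplan-Meier step mentioned in the Results can be illustrated with a short product-limit estimator. This is a minimal sketch under stated assumptions: the function name `kaplan_meier` and both toy cohorts are hypothetical and are not taken from the study, and the abstract does not say which software the authors used.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time per patient (e.g. months)
    events : 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    surv = 1.0
    curve = []
    for t in sorted(deaths):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical toy cohorts (months, event flag); NOT data from the study.
high = kaplan_meier([5, 8, 12, 12, 20], [1, 1, 1, 0, 1])
low = kaplan_meier([10, 15, 24, 30, 36], [1, 0, 1, 0, 0])

for label, curve in (("high CTSL (toy)", high), ("low CTSL (toy)", low)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Comparing the two curves at a common time point is the visual comparison a Kaplan-Meier plot makes; a formal comparison would need a log-rank test or a Cox model, which this sketch omits.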

Conclusion: This study highlights the crucial role of CTSL as a prognostic biomarker in LUAD. This combined multicenter and omics-based analysis provides comprehensive insights into the biological role of CTSL, supporting its potential as a target for therapeutic intervention and a marker for prognosis in patients with LUAD.

Citations: 0

Utilizing an In-silico Approach to Pinpoint Potential Biomarkers for Enhanced Early Detection of Colorectal Cancer.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2024-12-16 eCollection Date: 2024-01-01 DOI: 10.1177/11769351241307163

Alireza Gharebaghi, Saeid Afshar, Leili Tapak, Hossein Ranjbar, Massoud Saidijam, Irina Dinu

Objectives: Colorectal cancer (CRC) is a prevalent disease characterized by significant dysregulation of gene expression. Non-invasive tests that utilize microRNAs (miRNAs) have shown promise for early CRC detection. This study aims to determine the association between miRNAs and key genes in CRC.

Methods: Two datasets (GSE106817 and GSE23878) were extracted from the NCBI Gene Expression Omnibus database. Penalized logistic regression (PLR) and artificial neural networks (ANN) were used to identify relevant miRNAs and evaluate the classification accuracy of the selected miRNAs. The findings were validated through bipartite miRNA-mRNA interactions.
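To make the penalized logistic regression step concrete, the sketch below fits an L1-penalized (lasso) logistic model by proximal gradient descent and shows how the penalty drives an uninformative feature to exactly zero. Several things here are assumptions: the abstract does not say which penalty the authors used, the function names `soft_threshold` and `fit_l1_logistic` are hypothetical, and the two-feature toy matrix stands in for real miRNA expression values.

```python
import math


def soft_threshold(x, lam):
    """Proximal operator of the L1 penalty: shrink x toward zero by lam."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def fit_l1_logistic(X, y, lam=0.1, lr=0.1, n_iter=1000):
    """Minimise mean log-loss + lam * ||w||_1 (intercept b is not penalised)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        # Gradient step on the smooth loss, then shrinkage for the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Hypothetical toy data: column 0 separates the classes, column 1 is noise.
X = [[2.0, 0.05], [1.5, -0.10], [1.8, 0.10],
     [-2.0, 0.10], [-1.7, -0.05], [-1.9, -0.10]]
y = [1, 1, 1, 0, 0, 0]

w, b = fit_l1_logistic(X, y)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("weights:", w, "selected feature indices:", selected)
```

Features whose weight survives the shrinkage are the ones "selected"; in practice the penalty strength `lam` would be tuned by cross-validation rather than fixed as it is here.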

Results: Our analysis identified 3 miRNAs: miR-1228, miR-6765-5p, and miR-6787-5p, achieving a total accuracy of over 90%. Based on the results of the mRNA-miRNA interaction network, CDK1 and MAD2L1 were identified as target genes of miR-6787-5p.

Conclusions: Our results suggest that the identified miRNAs and target genes could serve as non-invasive biomarkers for diagnosing colorectal cancer, pending laboratory confirmation.

Citations: 0

Detecting the Tumor Prognostic Factors From the YTH Domain Family Through Integrative Pan-Cancer Analysis.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2024-11-16 eCollection Date: 2024-01-01 DOI: 10.1177/11769351241300030

Chong-Ying Zhu, Qi-Wei Yang, Xin-Yue Mu, Yan-Yu Zhai, Wen-Yan Zhao, Zuo-Jing Yin

Objectives: Emerging evidence suggests that N6-methyladenosine (m⁶A) methylation plays a critical role in cancers through various mechanisms. This work aims to reveal the essential role of m⁶A methylation "readers" in the regulation of cancer prognosis at the pan-cancer level.

Methods: Herein, we focused on one special protein family of the "readers" of m⁶A methylation, YT521-B homology (YTH) domain family genes, which were observed to be frequently dysregulated in tumor tissues and closely associated with cancer prognosis. Then, a comprehensive analysis of modulation in cancer prognosis was conducted by integrating RNA sequencing (RNAseq) datasets of YTH family genes and clinical information at the pan-cancer level.

Results: YTH family genes were significantly differentially expressed in most cancers, with expression notably increased in gastrointestinal cancers and decreased in endocrine and urologic cancers. In addition, they were associated with overall survival (OS) and disease-specific survival (DSS) to varying extents, especially in lower grade glioma (LGG), thyroid cancer (THCA), liver hepatocellular carcinoma (LIHC) and kidney clear cell carcinoma (KIRC), as were some "writers" (METTL3, METTL14, WTAP) and "erasers" (FTO, ALKBH5). Further survival analysis showed that YTH family genes, specifically the YTHScore constructed by combining 5 YTH family genes, as well as the RWEScore calculated by combining genes from the "readers", "writers" and "erasers", could clearly distinguish tumor prognosis in 4 representative cancers. As expected, YTHScore provided a prognostic classification comparable to that of RWEScore. Finally, analysis of immune signatures and clinical characteristics implied that innate immune activity, age at diagnosis, clinical stage, Tumor-Node-Metastasis (TNM) stage and immune type might play specific roles in modulating tumor prognosis.

Conclusions: The study demonstrated that YTH family genes have the potential to predict tumor prognosis, with the YTHScore showing an ability to predict tumor prognosis equal to that of RWEScore, thus providing insights into prognostic biomarkers and therapeutic targets at the pan-cancer level.

Citations: 0

Unveiling Recurrence Patterns: Analyzing Predictive Risk Factors for Breast Cancer Recurrence after Surgery.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2024-11-08 eCollection Date: 2024-01-01 DOI: 10.1177/11769351241297633

Monireh Shahmoradi, Ahmad Fazilat, Mostafa Ghaderi-Zefrehei, Arash Ardalan, Ali Bigdeli, Nahid Nafissi, Ebrahim Babaei, Mahsa Rahmani

Objectives: Breast cancer (BC) stands as the second-leading cause of female-specific cancer-related fatalities globally, necessitating comprehensive research to address its critical aspects. This study aimed to explore the time intervals between surgery and disease recurrence in BC patients and their survival utilizing various parametric and semi-parametric models.

Methods: Data collected from 2010 to 2021 at a BC center in Tehran, Iran, were examined; of 2246 datasets, 171 cases met the criteria for analysis. Model fit was assessed with the Akaike Information Criterion (AIC), which indicated the logistic distribution as the best-fitting one among concurrent and independent variable models.
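The AIC comparison described in the Methods rests on the formula AIC = 2k - 2 ln(L), where k is the number of fitted parameters and L the maximized likelihood; the model with the lowest AIC is preferred. The sketch below applies it to two simple parametric survival models with right censoring. It is only an illustration: the helper names, the grid search over the Weibull shape, and the toy follow-up times are all assumptions, and the study's own models and data are not reproduced here.

```python
import math


def aic(log_lik, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_lik


def exponential_loglik(times, events):
    """Maximised log-likelihood of an exponential model with right censoring."""
    d = sum(events)                 # number of observed events
    rate = d / sum(times)           # closed-form MLE of the hazard rate
    return d * math.log(rate) - rate * sum(times)


def weibull_loglik(times, events):
    """Maximised Weibull log-likelihood via a grid search over the shape k."""
    d = sum(events)
    log_t_events = sum(math.log(t) for t, e in zip(times, events) if e)
    best = -math.inf
    for k in (i / 20 for i in range(10, 81)):          # shapes 0.5 .. 4.0
        scale_k = sum(t ** k for t in times) / d        # MLE of scale**k given k
        ll = d * math.log(k) - d * math.log(scale_k) + (k - 1) * log_t_events - d
        best = max(best, ll)
    return best


# Hypothetical months to recurrence (event=1) or censoring (event=0).
times = [6, 13, 20, 27, 34, 41, 48, 55]
events = [1, 1, 0, 1, 1, 0, 1, 1]

ll_exp = exponential_loglik(times, events)
ll_weib = weibull_loglik(times, events)
print("AIC exponential:", round(aic(ll_exp, 1), 2))
print("AIC Weibull:    ", round(aic(ll_weib, 2), 2))
```

Because the Weibull model contains the exponential as the special case k = 1, its log-likelihood can never be lower; the 2k term is what lets the simpler model win when the extra parameter buys little.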

Results: The Cox proportional hazards regression model consistently demonstrated superior fit, characterized by the lowest AIC values. The mean age at diagnosis was 50.39 years, with a standard deviation of 11.13. Typical survival time was estimated at 53.44 months (95% confidence interval: 51.41-55.48 months). The 1-year survival rate was 0.92 (95% CI: 0.89-0.94). Notably, patient age at cancer diagnosis, progesterone receptor (PR), tumor grade, and tumor stage were found by the Cox model to be statistically significant (P < .05) risk factors for predicting BC recurrence after surgery in Iran.

Conclusions: Our findings underscore the importance of further exploration and consideration of the identified risk factors in BC research and treatment strategies.

Citations: 0

Understanding the Biological Basis of Polygenic Risk Scores and Disparities in Prostate Cancer: A Comprehensive Genomic Analysis.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2024-10-21 eCollection Date: 2024-01-01 DOI: 10.1177/11769351241276319

Wensheng Zhang, Kun Zhang

Objectives: For prostate cancer (PCa), hundreds of risk variants have been identified. It remains unknown whether the polygenic risk score (PRS) that combines the effects of these variants is also a sufficiently informative metric with relevance to the molecular mechanisms of carcinogenesis in the prostate. We aimed to understand the biological basis of PRS and of racial disparities in this cancer.

Methods: We performed a comprehensive analysis of data generated by, or deposited in, several genomic and/or transcriptomic projects and databases, including GTEx, TCGA, 1000 Genomes, GEO and dbGaP. PRS was constructed from 260 PCa risk variants that were identified by a recent trans-ancestry meta-analysis and contained in the GTEx dataset. The dosages of risk variants and the multi-ancestry effects on PCa incidence estimated by the meta-analysis were used in calculating individual PRS values.
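The PRS calculation described in the Methods is commonly written as a weighted sum: each variant's risk-allele dosage (0 to 2) is multiplied by its estimated effect size and the products are added up. The sketch below shows that standard form. It assumes, without confirmation from the abstract, that the authors used this unnormalized sum; the variant identifiers, effect sizes, and dosages are hypothetical placeholders.

```python
def polygenic_risk_score(dosages, effects):
    """Weighted sum of risk-allele dosages.

    dosages : {variant_id: dosage in [0, 2]} for one individual
    effects : {variant_id: per-allele effect size, e.g. a log odds ratio}
    """
    return sum(beta * dosages[variant] for variant, beta in effects.items())


# Hypothetical effect sizes for three placeholder variants (the study used 260).
effects = {"var_A": 0.10, "var_B": 0.25, "var_C": -0.05}

# Hypothetical genotype dosages for two individuals.
cohort = {
    "sample_1": {"var_A": 2, "var_B": 1, "var_C": 0},
    "sample_2": {"var_A": 0, "var_B": 0, "var_C": 2},
}

scores = {sid: polygenic_risk_score(d, effects) for sid, d in cohort.items()}
print(scores)
```

Individuals can then be ranked or split into high-PRS and low-PRS groups by these scores, which is the kind of grouping the Results refer to.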

Results: Our analyses yielded the following novel results. (1) In normal prostate samples from healthy European Americans (EAs), the expression levels of 540 genes (termed PRS genes) were associated with the PRS (P < .01). (2) The ubiquitin-proteasome system was more active in the prostates of high-PRS individuals than in those of low-PRS individuals. (3) Nine PRS genes play roles in the parts of the PI3K-Akt/RAS-MAPK/mTOR signaling pathways that are relevant to cancer progression and frequently hit by somatic mutations in PCa. (4) The expression profiles of the most significant PRS genes in tumor samples were capable of predicting malignant PCa relapse after prostatectomy. (5) The transcriptomic differences between African American and EA samples were incompatible with the patterns of the aforementioned associations between PRS and gene expression levels.

Conclusions: This study provided unique insights into the relationship between PRS and the molecular mechanisms of carcinogenesis in the prostate. The new findings, alongside the moderate but significant heritability of PCa susceptibility contributed by the risk variants, suggest both the aptness and the limitations of PRS for explaining PCa and its disparities.

Citations: 0

Machine Learning for Dynamic Prognostication of Patients With Hepatocellular Carcinoma Using Time-Series Data: Survival Path Versus Dynamic-DeepHit HCC Model.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2024-10-16 eCollection Date: 2024-01-01 DOI: 10.1177/11769351241289719

Lujun Shen, Yiquan Jiang, Tao Zhang, Fei Cao, Liangru Ke, Chen Li, Gulijiayina Nuerhashi, Wang Li, Peihong Wu, Chaofeng Li, Qi Zeng, Weijun Fan

Objectives: Patients with intermediate or advanced hepatocellular carcinoma (HCC) require repeated disease monitoring, prognosis assessment and treatment planning. In 2018, a novel machine learning methodology, "survival path" (SP), was developed to facilitate dynamic prognosis prediction and treatment planning. One year later, a deep learning approach called Dynamic DeepHit was developed. The performance of these two state-of-the-art models in dynamic prognostication has not been compared.

Methods: We trained and tested the SP and Dynamic DeepHit models in a large cohort of 2511 HCC patients using time-series data. The time-series data were converted into data of time slices, with an interval of three months. The time-dependent c-index for OS at given prediction time (t = 1, 6, 12, 18 months) and evaluation time (∆t = 3, 6, 9, 12, 18, 24, 36, 48 months) were compared.
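The evaluation scheme in the Methods, a concordance index computed at a prediction time t over an evaluation window ∆t, can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it uses a single static risk score per patient (real dynamic models recompute risk at each prediction time), applies no censoring weights, and every name and number in it is hypothetical.

```python
def time_slice(month, width=3):
    """Index of the 3-month slice that a given follow-up month falls into."""
    return int(month // width)


def windowed_c_index(times, events, risks, t, dt):
    """Concordance for events occurring in the window (t, t + dt].

    A pair (i, j) is comparable when patient i has an observed event inside
    the window and patient j is still event-free at that moment; it is
    concordant when patient i was assigned the higher risk.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i] or not (t < times[i] <= t + dt):
            continue
        for j in range(len(times)):
            if times[j] > times[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5   # ties count as half
    return concordant / comparable if comparable else float("nan")


# Hypothetical cohort: months to death or censoring, event flag, model risk.
times = [4, 9, 14, 20, 30]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.3, 0.4, 0.5, 0.1]

c = windowed_c_index(times, events, risks, t=6, dt=18)
print("slice of month 14:", time_slice(14), "| c-index(t=6, dt=18):", c)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; repeating the calculation over a grid of t and ∆t values gives the kind of table the study compares between models.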

Results: The comparison between the SP model and the Dynamic DeepHit-HCC model showed that the latter performed significantly better at the time of initial admission. The time-dependent c-index of the Dynamic DeepHit-HCC model gradually decreased as time extended (from 0.756 to 0.639 in the training set; from 0.787 to 0.661 in the internal testing set; from 0.725 to 0.668 in the multicenter testing set), while the time-dependent c-index of the SP model showed an increasing trend (from 0.665 to 0.748 in the training set; from 0.608 to 0.743 in the internal testing set; from 0.643 to 0.720 in the multicenter testing set). When the prediction time was 6 months or later after initial treatment, the SP model outperformed the Dynamic DeepHit model at late evaluation times (∆t > 12 months).

Conclusions: This research highlighted the distinct strengths of the two models. The SP model had an advantage in long-term prediction, while the Dynamic DeepHit-HCC model had an advantage in prediction at near time points. Models should be selected carefully to suit each scenario.

Citations: 0

Advancements and Challenges in the Image-Based Diagnosis of Lung and Colon Cancer: A Comprehensive Review.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2024-10-16 eCollection Date: 2024-01-01 DOI: 10.1177/11769351241290608

Pragati Patharia, Prabira Kumar Sethy, Aziz Nanthaamornphong

Image-based diagnosis has become a crucial tool in the identification and management of various cancers, particularly lung and colon cancer. This review delves into the latest advancements and ongoing challenges in the field, with a focus on deep learning, machine learning, and image processing techniques applied to X-rays, CT scans, and histopathological images. Significant progress has been made in imaging technologies like computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET), which, when combined with machine learning and artificial intelligence (AI) methodologies, have greatly enhanced the accuracy of cancer detection and characterization. These advances have enabled early detection, more precise tumor localization, personalized treatment plans, and overall improved patient outcomes. However, despite these improvements, challenges persist. Variability in image interpretation, the lack of standardized diagnostic protocols, unequal access to advanced imaging technologies, and concerns over data privacy and security within AI-based systems remain major obstacles. Furthermore, integrating imaging data with broader clinical information is crucial to achieving a more comprehensive approach to cancer diagnosis and treatment. This review provides valuable insights into the recent developments and challenges in image-based diagnosis for lung and colon cancers, underscoring both the remarkable progress and the hurdles that still need to be overcome to optimize cancer care.


Citations: 0

Prediction of Treatment Recommendations Via Ensemble Machine Learning Algorithms for Non-Small Cell Lung Cancer Patients in Personalized Medicine.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2024-10-14 eCollection Date: 2024-01-01 DOI: 10.1177/11769351241272397

Hojin Moon, Lauren Tran, Andrew Lee, Taeksoo Kwon, Minho Lee

Objectives: The primary goal of this research is to develop treatment-related genomic predictive markers for non-small cell lung cancer (NSCLC) by integrating various machine learning algorithms that recommend near-optimal individualized chemotherapy for each patient, in an effort to maximize efficacy or minimize treatment-related toxicity. This research can contribute toward developing more refined, accurate, and effective therapy that accounts for specific patient needs.

Methods: To accomplish our research goal, we implement ensemble learning algorithms: bagging with regularized Cox regression models, and nonparametric tree-based models via Random Survival Forests. A comprehensive meta-database of lung cancer patients was compiled from the NCBI Gene Expression Omnibus data repository to capture and utilize complex genomic patterns that can predict treatment outcomes more accurately.

Results: The novel prediction algorithm developed here demonstrates the ability to support complex clinical decision-making in the treatment of NSCLC. It effectively addresses patient heterogeneity, offering refined and personalized predictions that improve the precision of chemotherapy regimens prescribed to eligible patients.

Conclusion: This research should contribute to a substantial advancement in cancer treatment by improving the accuracy and efficacy of chemotherapy for a targeted group of patients who need the right treatment. The integration of complex machine learning techniques with genomic data holds substantial potential to transform current cancer treatment paradigms by providing robust support in clinical decision-making.
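The Methods above mention bagging (bootstrap aggregating) as one of the ensemble strategies. The following Python sketch shows only the generic bagging mechanism: resample the training data with replacement, fit one base learner per resample, and average the predictions. The base learner here is a deliberately trivial stand-in (a least-squares line through the origin), not the regularized Cox regression or Random Survival Forests used in the study. All names and the `(feature, outcome)` data layout are hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]      # (feature, outcome); hypothetical layout
Model = Callable[[float], float]  # maps a feature value to a prediction


def fit_slope_through_origin(data: Sequence[Sample]) -> Model:
    """Toy base learner: least-squares line through the origin."""
    num = sum(x * y for x, y in data)
    den = sum(x * x for x, _ in data)
    slope = num / den if den else 0.0
    return lambda x: slope * x


def bagged_models(
    data: Sequence[Sample],
    fit: Callable[[Sequence[Sample]], Model],
    n_bags: int = 25,
    seed: int = 0,
) -> List[Model]:
    """Fit one base learner per bootstrap resample of the training data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_bags):
        # Bootstrap: draw len(data) samples with replacement.
        resample = [rng.choice(data) for _ in data]
        models.append(fit(resample))
    return models


def bagged_predict(models: Sequence[Model], x: float) -> float:
    """Aggregate by averaging the individual model predictions."""
    return mean(m(x) for m in models)
```

In a survival setting, the `fit` argument would be replaced by a survival model and the averaged quantity would be a risk score or survival curve. Averaging over resamples mainly reduces the variance of unstable base learners.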


Citations: 0

Multicategory Survival Outcomes Classification via Overlapping Group Screening Process Based on Multinomial Logistic Regression Model With Application to TCGA Transcriptomic Data.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2024-10-08 eCollection Date: 2024-01-01 DOI: 10.1177/11769351241286710

Jie-Huei Wang, Po-Lin Hou, Yi-Hau Chen

Objectives: In classifying the multicategory survival outcomes of cancer patients, it is crucial to identify biomarkers that affect specific outcome categories. The classification of multicategory survival outcomes from transcriptomic data has been thoroughly investigated in computational biology. Nevertheless, several challenges must be addressed, including the ultra-high-dimensional feature space, feature contamination, and data imbalance, all of which contribute to the instability of the diagnostic model. Furthermore, although most methods achieve accurate predictive performance for binary classification with high-dimensional transcriptomic data, their extension to multi-class classification is not straightforward.

Methods: We employ the One-versus-One strategy to transform multi-class classification into multiple binary classifications, and utilize the overlapping group screening procedure with binary logistic regression to include pathway information for identifying important genes and gene-gene interactions for multicategory survival outcomes.

Results: A series of simulation studies are conducted to compare the classification accuracy of our proposed approach with some existing machine learning methods. In practical data applications, we utilize the random oversampling procedure to tackle class imbalance issues. We then apply the proposed method to analyze transcriptomic data from various cancers in The Cancer Genome Atlas, such as kidney renal papillary cell carcinoma, lung adenocarcinoma, and head and neck squamous cell carcinoma. Our aim is to establish an accurate microarray-based multicategory cancer diagnosis model. The numerical results illustrate that the new proposal effectively enhances cancer diagnosis compared to approaches that neglect pathway information.

Conclusions: We showcase the effectiveness of the proposed method in terms of class prediction accuracy through evaluations on simulated synthetic datasets as well as real dataset applications. We also identified the cancer-related gene-gene interaction biomarkers and reported the corresponding network structure. According to the identified major genes and gene-gene interactions, we can predict for each patient the probabilities that he/she belongs to each of the survival outcome classes.
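Two generic steps named in this abstract, random oversampling for class imbalance and the One-versus-One decomposition of a multi-class problem, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' pipeline. The binary learner is a toy nearest-centroid rule standing in for their overlapping group screening with binary logistic regression, and every function name below is hypothetical.

```python
import random
from collections import Counter
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, ...]
BinaryModel = Callable[[Point], int]  # returns one of the two labels it was fit on


def random_oversample(X: Sequence[Point], y: Sequence[int], seed: int = 0):
    """Duplicate randomly chosen minority samples until all classes are equal in size."""
    rng = random.Random(seed)
    by_class: Dict[int, List[Point]] = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(pts) for pts in by_class.values())
    X_out: List[Point] = []
    y_out: List[int] = []
    for label, pts in by_class.items():
        extra = [rng.choice(pts) for _ in range(target - len(pts))]
        for p in pts + extra:
            X_out.append(p)
            y_out.append(label)
    return X_out, y_out


def fit_nearest_centroid(X, y, a: int, b: int) -> BinaryModel:
    """Toy binary learner separating class a from class b (stand-in only)."""
    def centroid(label: int) -> Point:
        pts = [xi for xi, yi in zip(X, y) if yi == label]
        return tuple(sum(c) / len(pts) for c in zip(*pts))

    ca, cb = centroid(a), centroid(b)

    def dist2(p: Point, c: Point) -> float:
        return sum((u - v) ** 2 for u, v in zip(p, c))

    return lambda p: a if dist2(p, ca) <= dist2(p, cb) else b


def fit_one_vs_one(X, y) -> List[BinaryModel]:
    """Fit one binary model for every unordered pair of classes."""
    return [fit_nearest_centroid(X, y, a, b)
            for a, b in combinations(sorted(set(y)), 2)]


def predict_one_vs_one(models: Sequence[BinaryModel], p: Point) -> int:
    """Each pairwise model casts one vote; the most-voted class wins."""
    votes = Counter(m(p) for m in models)
    return votes.most_common(1)[0][0]
```

With K outcome classes, the decomposition fits K(K-1)/2 pairwise models. Oversampling is applied to the training data only, so that the held-out data keep their original class proportions.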


Citations: 0

Innovations in Artificial Intelligence-Driven Breast Cancer Survival Prediction: A Narrative Review.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Cancer Informatics

Pub Date : 2024-09-29 eCollection Date: 2024-01-01 DOI: 10.1177/11769351241272389

Mehwish Mooghal, Saad Nasir, Aiman Arif, Wajiha Khan, Yasmin Abdul Rashid, Lubna M Vohra

This narrative review explores the burgeoning field of Artificial Intelligence (AI)-driven Breast Cancer (BC) survival prediction, emphasizing the transformative impact on patient care. From machine learning to deep neural networks, diverse models demonstrate the potential to refine prognosis accuracy and tailor treatment strategies. The literature underscores the need for clinician integration and addresses challenges of model generalizability and ethical considerations. Crucially, AI's promise extends to Low- and Middle-Income Countries (LMICs), presenting an opportunity to bridge healthcare disparities. Collaborative efforts in research, technology transfer, and education are essential to empower healthcare professionals in LMICs. As we navigate this frontier, AI emerges not only as a technological advancement but as a guiding light toward personalized, accessible BC care, marking a significant stride in the global fight against this formidable disease.

Citations: 0