[This corrects the article DOI: 10.1177/11769351231177269.].
A decision support system (DSS) that classifies various cancers helps clinicians and researchers make better decisions, aiding early cancer diagnosis and reducing the chance of misdiagnosis. This work therefore aimed to design a classification model that accurately predicts 5 different cancer types, represented by 20 cancer exomes, using the mutations identified from whole exome cancer analysis. Initially, a basic model was designed using supervised machine learning classification algorithms, namely K-nearest neighbor (KNN), support vector machine (SVM), decision tree, naïve Bayes and random forest (RF), among which decision tree and random forest performed better in terms of preliminary model accuracy. However, output predictions were incorrect because of low training scores. Thus, 16 essential features were selected for model improvement using 2 approaches. All imbalanced datasets were balanced using SMOTE. In the first approach, all features from the 20 cancer exome datasets were used for training, and models were built using decision tree and random forest. On the balanced datasets, the decision tree model showed an accuracy of 77%, while the RF model improved accuracy to 82%, with all 5 cancer types predicted correctly. The area under the curve was closer to 1 for the RF model than for the decision tree model. In the second approach, 15 datasets were used for training and 5 for testing. However, only 2 cancer types were predicted correctly. To validate the RF model, the Matthews correlation coefficient (MCC) was computed. For the first approach, the MCC on the test set and under cross-validation was 0.7796 and 0.9356, respectively. Likewise, for the second approach, the MCC was 0.9365, corroborating the accuracy of the designed model. The model was deployed as a web application using Streamlit for easy use. This study presents insights that facilitate cancer classification.
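As an illustration of the MCC used above to assess the RF model, the following minimal sketch computes the multiclass form of the coefficient directly from true and predicted labels. The function name `multiclass_mcc` and the cancer-type labels are hypothetical and chosen for illustration only; the abstract does not state which implementation the authors used.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from two label sequences.

    Illustrative helper (not the authors' code). Returns 0.0 when the
    denominator is zero, e.g. when only one class is ever predicted.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                   # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))   # correctly classified
    true_counts = Counter(y_true)                     # t_k: occurrences of class k
    pred_counts = Counter(y_pred)                     # p_k: predictions of class k
    labels = set(true_counts) | set(pred_counts)
    numerator = c * s - sum(true_counts[k] * pred_counts[k] for k in labels)
    spread_true = s * s - sum(v * v for v in true_counts.values())
    spread_pred = s * s - sum(v * v for v in pred_counts.values())
    denominator = sqrt(spread_true * spread_pred)
    return numerator / denominator if denominator else 0.0


# Hypothetical labels for three cancer types (placeholders, not study data).
truth = ["lung", "lung", "breast", "breast", "colon", "colon"]
guess = ["lung", "breast", "breast", "breast", "colon", "colon"]
print(round(multiclass_mcc(truth, guess), 4))
```

Unlike accuracy, the MCC accounts for how predictions are distributed across all classes, which is one reason it is often preferred when class sizes differ.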
The present study was the first comprehensive investigation of genetic mutation and expression levels of the p53 signaling genes in cutaneous melanoma through various genetic databases providing large datasets. The mutation frequency of p53 and its signaling genes was higher than expected, with TP53 followed by CDKN2A being the most mutated gene in cutaneous melanoma. Furthermore, the expression analysis showed that TP53, MDM2, CDKN2A, and TP53BP1 were overexpressed, while MDM4 and CDKN2B were under-expressed in cutaneous melanoma. Overall, TCGA data revealed that, among the p53 signaling proteins, CDKN2A expression was significantly higher in both sun-exposed and non-sun-exposed healthy tissues than in melanoma. Likewise, MDM4 and TP53BP1 expression was markedly greater in non-sun-exposed healthy tissues compared to other groups. However, CDKN2B expression was higher in the sun-exposed healthy tissues than in other tissues. In addition, various genes were expressed significantly differently between males and females. CDKN2A was highly expressed in the SK-MEL-30 skin cancer cell line, whereas immune cell type expression analysis revealed that MDM4 was highly expressed in naïve B-cells. Furthermore, all six genes were significantly overexpressed in tumor tissues from extremely overweight or obese patients compared to healthy tissues. MDM2 expression and tumor stage were closely related. There were differences in gene expression across patient age groups and positive nodal status. TP53 showed a positive correlation with B cells, MDM2 with CD8+ T cells, macrophages and neutrophils, and MDM4 with neutrophils. CDKN2A/B had a non-significant correlation with all six types of immune cells. However, TP53BP1 was positively correlated with all five types of immune cells except B cells. Only TP53, MDM2, and CDKN2A had a role in cutaneous melanoma-specific tumor immunity. TP53 and all of its regulating genes may be predictive of prognosis.
The results of the present study need to be validated through future screening, in vivo, and in vitro studies.
Epidemiologic evidence for the association of cholesterol and breast cancer is inconsistent. Several factors may contribute to this inconsistency, including limited sample sizes, confounding effects of antihyperlipidemic treatment, age, and body mass index, and the assumption that the association follows a simple linear function. Here, we aimed to address these factors by combining visualization and quantification in a large-scale contemporary electronic health record database (the All of Us Research Program). We find clear visual and quantitative evidence that breast cancer is strongly, positively, and near-linearly associated with total cholesterol and low-density lipoprotein cholesterol, but not associated with triglycerides. The association of breast cancer with high-density lipoprotein cholesterol was non-linear and age dependent. Standardized odds ratios were 2.12 (95% confidence interval 1.9-2.48), P = 5.6 × 10⁻³¹ for total cholesterol; 1.99 (1.75-2.26), P = 2.6 × 10⁻²⁶ for low-density lipoprotein cholesterol; 1.69 (1.3-2.2), P = 9.0 × 10⁻⁵ for high-density lipoprotein cholesterol at age < 56; and 0.65 (0.55-0.78), P = 1.2 × 10⁻⁶ for high-density lipoprotein cholesterol at age ⩾ 56. Including lipid levels measured after antihyperlipidemic treatment in the analysis results in erroneous associations. We demonstrate that using logistic regression without inspecting risk variable linearity and accounting for confounding effects may lead to inconsistent results.
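The abstract does not say how the standardized odds ratios were derived. One common convention, assumed here, is to report the odds ratio for a one-standard-deviation increase in the predictor. The sketch below converts a logistic regression coefficient and its standard error into such an odds ratio with a Wald confidence interval. The function name and all numeric inputs are hypothetical.

```python
from math import exp


def standardized_odds_ratio(beta, se, sd, z=1.96):
    """Odds ratio per one-SD increase in a predictor, with a Wald interval.

    beta: logistic regression coefficient per raw unit (e.g. per mg/dL)
    se:   standard error of beta
    sd:   standard deviation of the predictor in the study sample
    z:    normal quantile for the interval (1.96 gives roughly 95%)
    Assumes the per-SD convention; the source does not confirm its method.
    """
    point = exp(beta * sd)
    lower = exp((beta - z * se) * sd)
    upper = exp((beta + z * se) * sd)
    return point, lower, upper


# Hypothetical inputs, not values taken from the study.
odds_ratio, low, high = standardized_odds_ratio(beta=0.02, se=0.002, sd=40.0)
print(f"OR per SD = {odds_ratio:.2f} ({low:.2f}-{high:.2f})")
```

Such an odds ratio summarizes the association only if it is close to linear on the log-odds scale, which is why the abstract stresses inspecting linearity before relying on logistic regression.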
Introduction: In the era of big data, gene-set pathway analyses derived from multi-omics are exceptionally powerful. When preparing and analyzing high-dimensional multi-omics data, the installation process and programming skills required to use existing tools can be challenging, especially for those who are not familiar with coding. In addition, implementation with high performance computing solutions is required to run these tools efficiently.
Methods: We introduce an automatic multi-omics pathway workflow, a point-and-click graphical user interface to Multivariate Single Sample Gene Set Analysis (MOGSA), hosted on the Cancer Genomics Cloud by Seven Bridges Genomics. This workflow combines different tools to perform data preparation for each given data type, dimensionality reduction, and MOGSA pathway analysis. The omics data include copy number alteration, transcriptomics, proteomics and phosphoproteomics data. We also provide an additional workflow that downloads data from The Cancer Genome Atlas and the Clinical Proteomic Tumor Analysis Consortium and preprocesses these data for use in this multi-omics pathway workflow.
Results: The main outputs of this workflow are the distinct pathways for subgroups of interest provided by users, which are displayed in heatmaps if identified. In addition to this, graphs and tables are provided to users for reviewing.
Conclusion: Multi-omics Pathway Workflow requires no coding experience. Users can bring their own data or download and preprocess public datasets from The Cancer Genome Atlas and Clinical Proteomic Tumor Analysis Consortium using our additional workflow based on the samples of interest. Distinct overactivated or deactivated pathways for groups of interest can be found. This useful information is important in effective therapeutic targeting.
Abnormal miRNA expression has been shown to be directly linked to hepatocellular carcinoma (HCC) initiation and progression. This study was designed to detect possible prognostic, diagnostic, and/or therapeutic miRNAs for HCC using computational analysis of miRNA expression. Methods: A meta-analysis of miRNA expression datasets was performed using the YM500v2 server to compare miRNA expression in normal and cancerous liver tissues. The most significantly differentially regulated miRNAs in our study underwent target gene analysis using the mirWalk tool to obtain their validated and predicted targets. The combinatorial target prediction tool miRror Suite was used to obtain the commonly regulated target genes. Functional enrichment analysis was performed on the resulting targets using the DAVID tool. A network was constructed based on interactions among microRNAs, their targets, and transcription factors. Hub nodes and gatekeepers were identified using network topological analysis. Further, we performed survival analysis on patient data based on low and high expression of the identified hub and gatekeeper nodes, stratifying patients into low and high survival probability groups. Results: Using the meta-analysis option in the YM500v2 server, 34 miRNAs were found to be significantly differentially regulated (P-value ⩽ .05); 5 miRNAs were down-regulated while 29 were up-regulated. The validated and predicted target genes for each miRNA, as well as the combinatorially predicted targets, were obtained. DAVID enrichment analysis identified several important cellular functions that are directly related to the main cancer hallmarks. Among these functions are focal adhesion, cell cycle, PI3K-Akt signaling, insulin signaling, and the Ras and MAPK signaling pathways. Several hub genes and gatekeepers were found that could serve as potential drug targets for hepatocellular carcinoma. POU2F1 and PPARA showed a significant difference between low and high survival probabilities (P-value ⩽ .05) in HCC patients.
Our study sheds light on important biomarker miRNAs for hepatocellular carcinoma along with their target genes and their regulated functions.
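To make the hub-identification step in the Methods more concrete, the sketch below ranks nodes of an interaction network by degree, meaning the number of distinct neighbors. This is only one possible topological criterion, and it is an assumption: the abstract does not state which measure defined hubs or gatekeepers. The function name and node names are hypothetical placeholders, not interactions reported by the study.

```python
from collections import defaultdict


def degree_hubs(edges, top_n=3):
    """Return the top_n nodes by degree as (node, degree) pairs.

    edges: iterable of (node_a, node_b) pairs from an undirected network.
    Self-loops are ignored and repeated edges are counted once.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    ranked = sorted(neighbors, key=lambda node: (-len(neighbors[node]), node))
    return [(node, len(neighbors[node])) for node in ranked[:top_n]]


# Placeholder miRNA / target / transcription-factor interactions.
example_edges = [
    ("miR-x", "GENE_A"),
    ("miR-x", "GENE_B"),
    ("miR-y", "GENE_A"),
    ("TF_1", "GENE_A"),
]
print(degree_hubs(example_edges, top_n=2))
```

Other centrality measures, such as betweenness, can give a different ranking, so the choice of criterion should be reported alongside any list of hubs.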
Background: Breast cancer (BC) has been reported as one of the most common cancers diagnosed in females throughout the world. Survival rate of BC patients is affected by metastasis. So, exploring its underlying mechanisms and identifying related biomarkers to monitor BC relapse/recurrence using new statistical methods is essential. This study investigated the high-dimensional gene-expression profiles of BC patients using penalized additive hazards regression models.
Methods: A publicly available dataset on time to metastasis in BC patients (GSE2034) was used. It contains expression profiles of 22 283 genes for 286 BC patients. Penalized additive hazards regression models with different penalties, including LASSO, SCAD, SICA, MCP and Elastic net, were used to identify metastasis-related genes.
Results: Five regression models with penalties were applied in the additive hazards model and jointly found 9 genes including SNU13, CLINT1, MAPK9, ABCC5, NKX3-1, NCOR2, COL2A1, and ZNF219. Patients were labeled as high or low risk according to the median of the prognostic index calculated using the regression coefficients of the penalized additive hazards model. A significant difference was detected between the survival curves of the two groups. The selected genes were examined using validation data and were significantly associated with the hazard of metastasis.
Conclusion: This study showed that MAPK9, NKX3-1, NCOR1, ABCC5, and CD44 are potential predictors of recurrence and metastasis in breast cancer and can be considered candidates for further research into tumorigenesis, invasion, metastasis, and epithelial-mesenchymal transition of breast cancer.
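The risk grouping described in the Results can be sketched as follows, assuming the prognostic index is the coefficient-weighted sum of a patient's expression values and that patients strictly above the median are labeled high risk. The abstract does not say how ties at the median were handled. All gene names, coefficients and expression values below are hypothetical placeholders.

```python
from statistics import median


def prognostic_index(expression, coefficients):
    """Weighted sum of expression values for the selected genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(patients, coefficients):
    """Label each patient 'high' or 'low' risk relative to the median index.

    patients: dict mapping patient id -> {gene: expression value}
    coefficients: dict mapping gene -> regression coefficient
    Returns (groups, cutoff); a score equal to the cutoff is labeled 'low'.
    """
    scores = {pid: prognostic_index(expr, coefficients)
              for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if score > cutoff else "low")
              for pid, score in scores.items()}
    return groups, cutoff


# Placeholder coefficients and expression profiles (not study data).
coefs = {"gene1": 1.0, "gene2": -0.5}
cohort = {
    "p1": {"gene1": 2.0, "gene2": 0.0},
    "p2": {"gene1": 0.0, "gene2": 2.0},
    "p3": {"gene1": 1.0, "gene2": 1.0},
    "p4": {"gene1": 3.0, "gene2": 2.0},
}
groups, cutoff = split_by_median(cohort, coefs)
print(groups, cutoff)
```

The resulting two groups are what would then be compared with survival curves, as reported in the Results.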
Osteosarcoma (OS) is the most common primary cancer in the skeletal system, characterized by a high incidence of lung metastasis, local recurrence and death. Systemic treatment of this aggressive cancer has not improved significantly since the introduction of chemotherapy regimens, underscoring a critical need for new treatment strategies. TRAIL receptors have long been proposed to be therapeutic targets for cancer treatment, but their role in osteosarcoma remains unclear. In this study, we investigated the expression profile of four TRAIL receptors in human OS cells using total RNA-seq and single-cell RNA-seq (scRNA-seq). The results revealed that TNFRSF10B and TNFRSF10D but not TNFRSF10A and TNFRSF10C are differentially expressed in human OS cells as compared to normal cells. At the single cell level by scRNA-seq analyses, TNFRSF10B, TNFRSF10D, TNFRSF10A and TNFRSF10C are most abundantly expressed in endothelial cells of OS tissues among nine distinct cell clusters. Notably, in osteoblastic OS cells, TNFRSF10B is most abundantly expressed, followed by TNFRSF10D, TNFRSF10A and TNFRSF10C. Similarly, in an OS cell line U2-OS using RNA-seq, TNFRSF10B is most abundantly expressed, followed by TNFRSF10D, TNFRSF10A and TNFRSF10C. According to the TARGET online database, poor patient outcomes were associated with low expression of TNFRSF10C. These results could provide a new perspective to design novel therapeutic targets of TRAIL receptors for the diagnosis, prognosis and treatment of OS and other cancers.
Host immunogenetics play a critical role in the human immune response to melanoma, influencing both melanoma prevalence and immunotherapy outcomes. Beneficial outcomes that stimulate T cell response hinge on the binding affinity and immunogenicity of human leukocyte antigen (HLA) with melanoma antigen epitopes. Here, we use an in silico approach to characterize the binding affinity and immunogenicity of 69 HLA Class I alleles to epitopes of 11 known melanoma antigens. The findings document a significant proportion of positively immunogenic epitope-allele combinations, with the highest proportions of positive immunogenicity found for the Q13072/BAGE1 melanoma antigen and alleles of the HLA B and C genes. The findings are discussed in terms of a personalized precision HLA-mediated adjunct to immune checkpoint blockade immunotherapy to maximize tumor elimination.