Identification of IGFBP3 and LGALS1 as potential secreted biomarkers for clear cell renal cell carcinoma based on bioinformatics analysis and machine learning.
Authors: Wunchana Seubwai, Sakkarn Sangkhamanon, Xuhong Zhang
Journal: Advances in Clinical and Experimental Medicine
Published: 2025-01-14
DOI: https://doi.org/10.17219/acem/194036
Abstract
Background: Clear cell renal cell carcinoma (ccRCC) is the most common subtype of renal cell carcinoma (RCC). Due to the lack of symptoms until advanced stages, early diagnosis of ccRCC is challenging. Therefore, the identification of novel secreted biomarkers for the early detection of ccRCC is urgently needed.
Objectives: This study aimed to identify novel secreted biomarkers for diagnosing ccRCC using bioinformatics and machine learning techniques based on transcriptomics data.
Material and methods: Differentially expressed genes (DEGs) in ccRCC compared to normal kidney tissues were identified using 3 transcriptomics datasets (GSE53757, GSE40435 and GSE11151) from the Gene Expression Omnibus (GEO). Potential secreted biomarkers were identified among the DEGs common to all 3 datasets using a list of human secretome proteins from The Human Protein Atlas. The recursive feature elimination (RFE) technique was used to determine the optimal number of features for building classification machine learning models. The expression levels and clinical associations of candidate biomarkers identified with RFE were validated using transcriptomics data from The Cancer Genome Atlas (TCGA). Classification models were then developed based on the expression levels of these candidate biomarkers. The performance of the models was evaluated using accuracy and other evaluation metrics, confusion matrices, and receiver operating characteristic (ROC) curves with the area under the curve (AUC).
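The filtering step described in the methods (keep genes that are DEGs in every dataset, then keep only those on a secretome list) can be sketched with plain set operations. This is a minimal illustration, not the authors' pipeline: the function name `common_secreted_degs` is hypothetical, and the gene lists are toy values invented for the example (the `GENE_*` names are placeholders, not study results).

```python
def common_secreted_degs(deg_sets, secretome):
    """Return genes that are DEGs in every dataset and are on the secretome list."""
    # Genes differentially expressed in all datasets.
    common = set.intersection(*map(set, deg_sets))
    # Restrict to genes annotated as secreted; sort for a stable order.
    return sorted(common & set(secretome))


# Toy inputs, invented for illustration only.
degs_by_dataset = {
    "GSE53757": {"IGFBP3", "LGALS1", "GENE_A", "GENE_B"},
    "GSE40435": {"IGFBP3", "LGALS1", "GENE_A", "GENE_C"},
    "GSE11151": {"IGFBP3", "LGALS1", "GENE_A", "GENE_D"},
}
secretome = {"IGFBP3", "LGALS1", "GENE_C", "GENE_D"}

candidates = common_secreted_degs(degs_by_dataset.values(), secretome)
print(candidates)  # ['IGFBP3', 'LGALS1']
```

In this toy case `GENE_A` is common to all three datasets but is dropped because it is not on the secretome list, while `GENE_C` and `GENE_D` are secreted but not common DEGs.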
Results: We identified 44 DEGs that encode potential secreted proteins from 274 common DEGs found across all datasets. Among these, insulin-like growth factor binding protein 3 (IGFBP3) and lectin, galactoside-binding, soluble, 1 (LGALS1) were selected for further analysis using the RFE technique. Both IGFBP3 and LGALS1 showed significant upregulation in ccRCC tissues compared to normal tissues in the GEO and TCGA datasets. Survival analysis indicated that patients with higher expression levels of these genes had shorter overall survival (OS) and disease-free survival (DFS). Decision tree and random forest models based on IGFBP3 and LGALS1 levels achieved an accuracy of 98.04% and an AUC of 0.98.
Conclusions: This study identified IGFBP3 and LGALS1 as promising novel secreted biomarkers for ccRCC diagnosis.
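For readers unfamiliar with the evaluation metrics named in the abstract (accuracy, confusion matrix, ROC AUC), the following is a small, library-free sketch of how they are computed for a binary tumor-versus-normal classifier. The labels and scores are toy values invented for illustration and do not reproduce the study's data or results; the helper names are hypothetical.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels 0 (normal) and 1 (tumor)."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores, invented for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
y_pred = [int(s >= 0.5) for s in scores]  # threshold at 0.5

print(confusion_matrix(y_true, y_pred))  # (3, 1, 1, 3)
print(accuracy(y_true, y_pred))          # 0.75
print(roc_auc(y_true, scores))           # 0.9375
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.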
About the journal:
Advances in Clinical and Experimental Medicine has been published by the Wroclaw Medical University since 1992. Establishing the medical journal was the idea of Prof. Bogumił Halawa, Chair of the Department of Cardiology, and was fully supported by the Rector of Wroclaw Medical University, Prof. Zbigniew Knapik. Prof. Halawa was also the first editor-in-chief, from 1992 to 1997. The journal, then entitled "Postępy Medycyny Klinicznej i Doświadczalnej", appeared quarterly.
Prof. Leszek Paradowski was editor-in-chief from 1997 to 1999. In 1998 he initiated changes to the profile and cover design of the journal, which were accepted by the Editorial Board. The title was changed to Advances in Clinical and Experimental Medicine. Articles in English were welcomed. A number of outstanding representatives of medical science from Poland and abroad were invited to participate in the newly established International Editorial Staff.
Prof. Antonina Harłozińska-Szmyrka was editor-in-chief from 2000 to 2005, followed once again by Prof. Leszek Paradowski in 2006-2007 and by Prof. Maria Podolak-Dawidziak from 2008 to 2016. Since 2017, the editor-in-chief has been Prof. Maciej Bagłaj.
Since July 2005, original papers have been published only in English. Case reports are no longer accepted. The manuscripts are reviewed by two independent reviewers and a statistical reviewer, and English texts are proofread by a native speaker.
The journal has been indexed in several databases: Scopus, Ulrich's™ International Periodicals Directory, Index Copernicus and, since 2007, the Thomson Reuters databases Science Citation Index Expanded and Journal Citation Reports/Science Edition.
In 2010 the journal obtained an Impact Factor, which is now 1.179. Articles published in the journal are worth 15 points among Polish journals according to the Polish Committee for Scientific Research and 169.43 points according to the Index Copernicus.
Since November 7, 2012, Advances in Clinical and Experimental Medicine has been indexed and included in the National Library of Medicine's MEDLINE database. English abstracts printed in the journal are included and searchable using PubMed: http://www.ncbi.nlm.nih.gov/pubmed.