Identification of immune-associated biomarkers of diabetes nephropathy tubulointerstitial injury based on machine learning: a bioinformatics multi-chip integrated analysis.
Lin Wang, Jiaming Su, Zhongjie Liu, Shaowei Ding, Yaotan Li, Baoluo Hou, Yuxin Hu, Zhaoxi Dong, Jingyi Tang, Hongfang Liu, Weijing Liu
BioData Mining, volume 17, issue 1, article 20. Published 2024-07-01. DOI: 10.1186/s13040-024-00369-x (https://doi.org/10.1186/s13040-024-00369-x). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11218417/pdf/
Abstract
Background: Diabetic nephropathy (DN) is a major microvascular complication of diabetes and has become the leading cause of end-stage renal disease worldwide. A considerable number of DN patients progress irreversibly to end-stage renal disease because the disease cannot be diagnosed early. Reliable biomarkers that support early diagnosis and treatment are therefore needed. The migration of immune cells to the kidney is considered a key step in the progression of DN-related vascular injury, so markers of this process may be particularly useful for the early diagnosis of DN and for predicting its progression.
Methods: Gene chip data were retrieved from the GEO database using the search term 'diabetic nephropathy'. The 'limma' software package was used to identify differentially expressed genes (DEGs) between DN and control samples. Gene set enrichment analysis (GSEA) was performed using gene sets obtained from the Molecular Signatures Database (MSigDB). The R package 'WGCNA' was used to identify gene modules associated with tubulointerstitial injury in DN, and the module genes were intersected with immune-related DEGs to identify target genes. Gene ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed on the differentially expressed genes using the 'clusterProfiler' package in R. Three methods, least absolute shrinkage and selection operator (LASSO), support vector machine recursive feature elimination (SVM-RFE) and random forest (RF), were used to select immune-related biomarkers for diagnosis. A tubulointerstitial dataset was retrieved from the Nephroseq database to serve as an external validation dataset. Unsupervised clustering analysis of the expression levels of the immune-related biomarkers was performed using the 'ConsensusClusterPlus' R package. Urine was collected from patients who visited Dongzhimen Hospital of Beijing University of Chinese Medicine from September 2021 to March 2023, and ELISA was used to measure the expression levels of the immune-related biomarkers in urine. Pearson correlation analysis was used to assess the relationship between immune-related biomarker expression and renal function in DN patients.
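The gene-set overlap steps in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the gene names and set memberships are hypothetical placeholders, the helper `intersect_gene_sets` is introduced only for illustration, and the abstract does not state how the outputs of LASSO, SVM-RFE and RF were combined, so taking their intersection is an assumption.

```python
# Toy illustration only: gene names and memberships below are hypothetical.

def intersect_gene_sets(*gene_sets):
    """Return the sorted genes that appear in every supplied collection."""
    if not gene_sets:
        return []
    common = set(gene_sets[0])
    for genes in gene_sets[1:]:
        common &= set(genes)
    return sorted(common)

# Step 1: genes from injury-associated WGCNA modules are overlapped with
# the immune-related DEGs to obtain target genes.
wgcna_module_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"}
immune_related_degs = {"GENE_B", "GENE_C", "GENE_D", "GENE_F"}
target_genes = intersect_gene_sets(wgcna_module_genes, immune_related_degs)

# Step 2 (assumption): keep only genes chosen by all three selectors.
lasso_selected = {"GENE_B", "GENE_C"}
svm_rfe_selected = {"GENE_B", "GENE_C", "GENE_D"}
random_forest_selected = {"GENE_B", "GENE_C", "GENE_D"}
candidate_biomarkers = intersect_gene_sets(
    lasso_selected, svm_rfe_selected, random_forest_selected
)

print(target_genes)
print(candidate_biomarkers)
```

Requiring agreement across all selectors is a conservative choice; a looser alternative would keep genes chosen by at least two of the three methods.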
Results: Four microarray datasets from the GEO database were included in the analysis: GSE30122, GSE47185, GSE99340 and GSE104954. These datasets comprised 63 DN patients and 55 healthy controls. A total of 9415 genes were detected in the combined dataset. We found 153 differentially expressed immune-related genes, of which 112 were up-regulated and 41 were down-regulated, and 119 overlapping genes were identified as target genes. GO analysis showed that the target genes were involved in various biological processes, including leukocyte-mediated immunity. KEGG analysis showed that they were mainly involved in phagosome formation in Staphylococcus aureus infection. Among the 119 overlapping genes, the machine learning methods identified AGR2, CCR2, CEBPD, CISH, CX3CR1, DEFB1 and FSTL1 as potential tubulointerstitial immune-related biomarkers. External validation suggested that these markers showed diagnostic efficacy in distinguishing DN patients from healthy controls. In the clinical samples, urinary expression of AGR2, CX3CR1 and FSTL1 in DN patients was negatively correlated with GFR; urinary expression of CX3CR1 and FSTL1 was positively correlated with serum creatinine, whereas urinary expression of DEFB1 was negatively correlated with serum creatinine. In addition, urinary expression of CX3CR1 was positively correlated with proteinuria, whereas urinary expression of DEFB1 was negatively correlated with proteinuria. Finally, DN patients were divided by level of proteinuria into a nephrotic-range proteinuria group (n = 24) and a sub-nephrotic proteinuria group. Urinary AGR2, CCR2 and DEFB1 differed significantly between the two groups by unpaired t test (P < 0.05).
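The two statistics behind the clinical results, the Pearson correlation coefficient and the unpaired t statistic, can be sketched in plain Python as follows. All numbers are made-up toy values, not study data. The abstract does not say whether equal variances were assumed, so the pooled-variance (Student) form used here is an assumption; the P value is not computed because it requires the t distribution.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def unpaired_t(group_a, group_b):
    """Student's unpaired t statistic (pooled variance) and its degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    mean_a = sum(group_a) / na
    mean_b = sum(group_b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in group_a)
    ss_b = sum((v - mean_b) ** 2 for v in group_b)
    dof = na + nb - 2
    pooled_var = (ss_a + ss_b) / dof
    t_stat = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t_stat, dof

# Hypothetical urinary marker levels and GFR values (toy data).
marker_level = [1.0, 2.0, 3.0, 4.0, 5.0]
gfr = [90.0, 80.0, 70.0, 60.0, 50.0]
r = pearson_r(marker_level, gfr)

# Hypothetical marker levels in two proteinuria groups (toy data).
t_stat, dof = unpaired_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])

print(round(r, 3), round(t_stat, 3), dof)
```

A negative `r` corresponds to the kind of inverse relationship reported between some urinary markers and GFR; the sign of `t_stat` only reflects which group is passed first.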
Conclusions: Our study provides new insights into the role of immune-related biomarkers in DN tubulointerstitial injury and identifies potential targets for the early diagnosis and treatment of DN patients. Seven genes (AGR2, CCR2, CEBPD, CISH, CX3CR1, DEFB1 and FSTL1) are promising sensitive biomarkers and may affect the progression of DN by regulating the immune inflammatory response. However, further comprehensive studies are needed to fully understand their exact molecular mechanisms and functional pathways in DN.
About the journal:
BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data.
Topical areas include, but are not limited to:
- Development, evaluation, and application of novel data mining and machine learning algorithms.
- Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
- Open-source software for the application of data mining and machine learning algorithms.
- Design, development and integration of databases, software and web services for the storage, management, retrieval, and analysis of data from large-scale studies.
- Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.