Research and analysis of differential gene expression in CD34 hematopoietic stem cells in myelodysplastic syndromes
Authors: Min-Xiao Wang, Chang-Sheng Liao, Xue-Qin Wei, Yu-Qin Xie, Peng-Fei Han, Yan-Hui Yu
Journal: PLoS ONE, volume 20, issue 3, e0315408 (published 2025-03-12)
DOI: https://doi.org/10.1371/journal.pone.0315408
Impact factor: 2.9; JCR: Q1 (Multidisciplinary Sciences)
Citations: 0
Abstract
Objective: This study uses bioinformatics analysis to identify differentially expressed genes (DEGs) in CD34+ hematopoietic stem cells (HSCs) from patients with myelodysplastic syndromes (MDS), with the ultimate goal of uncovering the potential molecular mechanisms underlying the pathogenesis of MDS. The findings are expected to provide novel insights into clinical treatment strategies for MDS.
Methods: We downloaded three datasets (GSE81173, GSE4619, and GSE58831) from the public Gene Expression Omnibus (GEO) database as training sets and selected GSE19429 as the validation set. To ensure consistency and comparability, we standardized the training sets and removed batch effects with the ComBat algorithm, integrating them into a unified gene expression dataset. We then conducted differential expression analysis to identify genes whose expression levels changed significantly across disease states. To improve prediction accuracy, we trained six common predictive models on the filtered differential gene expression dataset. After comprehensive evaluation, we selected three algorithms as our core predictive models: Lasso regression, random forest, and support vector machine (SVM). To pinpoint genes closely related to disease characteristics, we applied these three machine learning methods and took the intersection of their results, yielding a more robust list of disease-associated genes. We then analyzed these key genes in depth in the training set and validated the results independently using the GSE19429 dataset. Finally, we performed differential analysis of gene groups, co-expression analysis, and enrichment analysis to explore the mechanisms by which these genes contribute to disease initiation and progression. Through these analyses, we aim to provide new insights and foundations for disease diagnosis and treatment. A figure in the article illustrates the data preprocessing and analysis workflow of this study.
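The intersection step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the helper `consensus_genes`, its `min_votes` parameter, and the placeholder gene names are all assumptions introduced here, and the per-model gene lists stand in for whatever each fitted model would actually select.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional


def consensus_genes(
    selections: Mapping[str, Iterable[str]],
    min_votes: Optional[int] = None,
) -> list:
    """Return genes chosen by at least `min_votes` selectors.

    With the default (None), a gene must be chosen by every selector,
    which is the plain set intersection.
    """
    if not selections:
        return []
    required = len(selections) if min_votes is None else min_votes
    # Deduplicate within each selector so a gene gets at most one vote per model.
    votes = Counter(
        gene for genes in selections.values() for gene in set(genes)
    )
    return sorted(gene for gene, count in votes.items() if count >= required)


# Hypothetical selections: placeholder names, not results from the study.
selected = {
    "lasso": ["GENE_A", "GENE_B", "GENE_C"],
    "random_forest": ["GENE_B", "GENE_C", "GENE_D"],
    "svm": ["GENE_C", "GENE_B", "GENE_E"],
}

print(consensus_genes(selected))  # ['GENE_B', 'GENE_C']
```

Requiring agreement from all three models trades sensitivity for robustness: a gene that only one model favors is dropped, which is consistent with the stated aim of a more robust gene list.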
Results: Gene expression patterns in CD34+ HSCs from patients with MDS differed significantly from those of the control group (individuals without MDS). Specifically, two key genes, IRF4 and ELANE, were notably downregulated in CD34+ HSCs of MDS patients, suggesting that their reduced expression plays a role in the pathological process of MDS.
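As a rough illustration of how downregulation of a single gene can be quantified, the sketch below computes a log2 fold change and a Welch t statistic between two groups. The abstract does not name the statistical method used, so this is a swapped-in, simplified approach; the function names and the expression values are made up for illustration and are not data from the study.

```python
import math
from statistics import mean, variance
from typing import Sequence


def log2_fold_change(case: Sequence[float], control: Sequence[float]) -> float:
    """Difference of group means, assuming inputs are already on a log2 scale."""
    return mean(case) - mean(control)


def welch_t(case: Sequence[float], control: Sequence[float]) -> float:
    """Welch's t statistic (unequal variances) for case versus control."""
    standard_error = math.sqrt(
        variance(case) / len(case) + variance(control) / len(control)
    )
    return (mean(case) - mean(control)) / standard_error


# Made-up log2 expression values for one hypothetical gene.
mds_samples = [5.0, 5.5, 4.5, 5.0]
control_samples = [7.0, 7.5, 6.5, 7.0]

lfc = log2_fold_change(mds_samples, control_samples)
t_stat = welch_t(mds_samples, control_samples)
print(lfc)  # -2.0
```

A negative log2 fold change means lower expression in the case group. In a genome-wide analysis the per-gene p-values would also need multiple-testing correction, which this sketch omits.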
Conclusion: This study sheds light on the potential molecular mechanisms underlying MDS, with a particular focus on the pivotal roles of IRF4 and ELANE as key pathogenic genes. Our findings provide a novel perspective for understanding the complexity of MDS and exploring therapeutic strategies. They may also guide the development of precise and effective treatments, such as targeted interventions directed against these genes.
About the journal:
PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open-access—freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage