Research and analysis of differential gene expression in CD34 hematopoietic stem cells in myelodysplastic syndromes.

IF 2.6; Zone 3 (multidisciplinary journals); Q1 MULTIDISCIPLINARY SCIENCES; PLoS ONE; Pub Date: 2025-03-12; eCollection Date: 2025-01-01; DOI: 10.1371/journal.pone.0315408
Min-Xiao Wang, Chang-Sheng Liao, Xue-Qin Wei, Yu-Qin Xie, Peng-Fei Han, Yan-Hui Yu
PLoS ONE 20(3): e0315408. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11902259/pdf/
Citations: 0

Abstract

Objective: This study uses bioinformatics analysis to identify differentially expressed genes (DEGs) in CD34+ hematopoietic stem cells (HSCs) from patients with myelodysplastic syndromes (MDS), with the ultimate goal of uncovering the potential molecular mechanisms underlying the pathogenesis of MDS. The findings are expected to provide novel insights into clinical treatment strategies for MDS.

Methods: We downloaded three datasets (GSE81173, GSE4619, and GSE58831) from the public Gene Expression Omnibus (GEO) database as training sets and selected GSE19429 as the validation set. To ensure consistency and comparability, we standardized the training sets and removed batch effects using the ComBat algorithm, integrating them into a unified gene expression dataset. We then conducted differential expression analysis to identify genes whose expression levels changed significantly across disease states. To enhance prediction accuracy, we trained six common predictive models on the filtered differential gene expression dataset. After comprehensive evaluation, we selected three algorithms as our core predictive models: Lasso regression, random forest, and support vector machine (SVM). To pinpoint genes closely related to disease characteristics more precisely, we applied these three machine learning methods and took the intersection of their results, yielding a more robust list of disease-associated genes. We analyzed these key genes in depth in the training set and validated the results independently using the GSE19429 dataset. We also performed differential analysis of gene groups, co-expression analysis, and enrichment analysis to examine the mechanisms by which these genes act in disease initiation and progression. Through these analyses, we aim to provide new insights and foundations for disease diagnosis and treatment. A figure in the article illustrates the data preprocessing and analysis workflow of this study.
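The intersection step in the Methods can be illustrated with a minimal sketch. It assumes each model has already produced its own list of selected genes. The function name `consensus_genes`, the dictionary layout, and the `GENE_*` identifiers are hypothetical placeholders, not the authors' code or results.

```python
from functools import reduce


def consensus_genes(selections):
    """Return the genes selected by every model, sorted for stable output.

    `selections` maps a model name to the list of genes that model kept.
    """
    if not selections:
        return []
    gene_sets = [set(genes) for genes in selections.values()]
    # Intersect all per-model sets; a gene survives only if every model chose it.
    return sorted(reduce(set.intersection, gene_sets))


# Placeholder selections for illustration only (not data from the study).
selections = {
    "lasso": ["GENE_A", "GENE_B", "GENE_C"],
    "random_forest": ["GENE_B", "GENE_C", "GENE_D"],
    "svm": ["GENE_C", "GENE_B", "GENE_E"],
}

print(consensus_genes(selections))  # ['GENE_B', 'GENE_C']
```

Requiring agreement across all three models trades sensitivity for robustness: a gene that only one method favors is dropped, which is why the resulting list is usually short.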

Results: Differential expression analysis of CD34+ HSCs showed that gene expression patterns in patients with MDS differed significantly from those in the control group (individuals without MDS). Specifically, two key genes, IRF4 and ELANE, were notably downregulated in CD34+ HSCs of MDS patients, suggesting that their reduced expression plays a role in the pathological process of MDS.
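As a rough illustration of what "downregulated" means numerically, the sketch below computes a log2 fold change between patient and control samples for a single gene. It assumes linear-scale expression values, a pseudocount of 1, and a cutoff of |log2FC| >= 1. The abstract states none of these, so the function names, parameters, and toy values are all hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(case_values, control_values, pseudocount=1.0):
    """log2 of the ratio of mean expression (case over control).

    Assumes linear-scale values; the pseudocount avoids division by zero.
    """
    case_mean = mean(case_values)
    control_mean = mean(control_values)
    return math.log2((case_mean + pseudocount) / (control_mean + pseudocount))


def direction(log2fc, threshold=1.0):
    """Label a gene as 'down', 'up', or 'unchanged' for an assumed cutoff."""
    if log2fc <= -threshold:
        return "down"
    if log2fc >= threshold:
        return "up"
    return "unchanged"


# Toy values for illustration only (not data from the study).
mds_samples = [3.0, 3.0]
control_samples = [15.0, 15.0]

fc = log2_fold_change(mds_samples, control_samples)
print(fc, direction(fc))  # -2.0 down
```

A negative log2 fold change means lower mean expression in the MDS group than in controls. Microarray data are often stored on a log2 scale already, in which case the fold change would be a difference of group means rather than a ratio.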

Conclusion: This study sheds light on the potential molecular mechanisms underlying MDS, with a particular focus on the pivotal roles of IRF4 and ELANE as key pathogenic genes. Our findings provide a novel perspective for understanding the complexity of MDS and exploring therapeutic strategies. They may also guide the development of precise and effective treatments, such as targeted interventions directed against these genes.

Source journal
PLoS ONE (category: Biology)
CiteScore: 6.20
Self-citation rate: 5.40%
Articles published: 14242
Review time: 3.7 months
Journal description: PLOS ONE is an international, peer-reviewed, open-access, online publication. It welcomes reports on primary research from any scientific discipline. It provides: open access (freely accessible online, authors retain copyright); fast publication times; peer review by expert, practicing researchers; post-publication tools to indicate quality and impact; community-based dialogue on articles; worldwide media coverage.