Identification of EGR1 as a Key Diagnostic Biomarker in Metabolic Dysfunction-Associated Steatotic Liver Disease (MASLD) Through Machine Learning and Immune Analysis.
Xuanlin Wu, Tao Pan, Zhihao Fang, Titi Hui, Xiaoxiao Yu, Changxu Liu, Zihao Guo, Chang Liu
Journal of Inflammation Research, vol. 18, pp. 1639-1656 (eCollection). Published 2025-02-04. DOI: 10.2147/JIR.S499396 (https://doi.org/10.2147/JIR.S499396)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11806694/pdf/
Abstract
Background: Metabolic dysfunction-associated steatotic liver disease (MASLD) is a common chronic liver condition worldwide, and its rising incidence poses significant health risks. Despite this, the detailed mechanisms underlying its onset and progression remain poorly understood. This study aims to identify effective diagnostic biomarkers for MASLD using microarray data combined with machine learning techniques, which should also aid further understanding of MASLD pathogenesis.
Methods: We collected six datasets from the Gene Expression Omnibus (GEO) database, using five as training sets and one as a validation set. We employed three machine learning methods, namely LASSO, SVM, and Random Forest (RF), to identify hub genes associated with MASLD. These genes were further validated using the external dataset GSE164760. Functional enrichment analysis, immune infiltration analysis, and immune function analysis were also conducted. A TF-miRNA-mRNA network was constructed, and single-cell RNA sequencing was used to determine the distribution of key genes within key cell clusters. Finally, the expression of the key genes was validated in the palmitic acid-induced AML-12 cell line and the MCD mouse model.
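The abstract names three selectors but does not state how their outputs were combined. A common convention in studies of this kind is to keep only the genes chosen by every method, and the minimal sketch below assumes that rule. The function name `consensus_genes` and the gene labels are hypothetical placeholders, not results from the paper.

```python
def consensus_genes(*selections):
    """Return the genes chosen by every selector, sorted for stable output."""
    if not selections:
        return []
    common = set(selections[0])
    for chosen in selections[1:]:
        common &= set(chosen)  # keep only genes also present in this selection
    return sorted(common)


# Hypothetical selector outputs (placeholder labels, not data from the study).
lasso_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
svm_genes = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}
rf_genes = {"GENE_A", "GENE_C", "GENE_D", "GENE_F"}

hub_genes = consensus_genes(lasso_genes, svm_genes, rf_genes)
print(hub_genes)  # ['GENE_C', 'GENE_D']
```

Taking the intersection is deliberately conservative: a gene survives only if all three methods agree on it, which trades sensitivity for a shorter, more robust candidate list.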
Results: Through differentially expressed gene (DEG) analysis and machine learning techniques, we identified 10 hub genes. Among these, EGR1 was screened as the key gene and validated using an external dataset, with an area under the curve (AUC) of 0.882. Enrichment and immune infiltration analyses indicated that EGR1 is involved in multiple pathways in the pathogenesis and progression of MASLD and that it correlates significantly with various immune cells. Furthermore, cellular experiments and animal model validation confirmed that the expression trends of EGR1 are highly consistent with our analytical findings.
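For readers less familiar with the AUC reported above, the following sketch shows one standard way to compute it for a single gene: the fraction of (case, control) pairs in which the case has the higher expression value, with ties counted as one half. The function name `roc_auc` and the toy values are illustrative assumptions, not data from the study, and the sketch assumes higher expression indicates disease.

```python
def roc_auc(scores, labels):
    """AUC as the share of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy expression values (hypothetical); label 1 = case, 0 = control.
expression = [5, 2, 7, 4, 6, 1, 3, 6]
is_case = [1, 0, 1, 0, 1, 0, 1, 0]

print(roc_auc(expression, is_case))  # 0.78125
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. If a marker is lower in disease, the same function returns a value below 0.5, and the scores would be negated before use.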
Conclusion: Our research confirms EGR1 as a key gene in MASLD, providing novel insights into the disease's pathogenesis and pointing to new therapeutic targets for its treatment.
About the journal:
An international, peer-reviewed, open access, online journal that welcomes laboratory and clinical findings on the molecular basis, cell biology and pharmacology of inflammation.