Bao Li, Yang Shen, Songbo Liu, Hong Yuan, Ming Liu, Haokun Li, Tonghe Zhang, Shuyuan Du, Xinwei Liu
Frontiers in Molecular Biosciences, volume 11, article 1376793. Published 2024-10-17. DOI: https://doi.org/10.3389/fmolb.2024.1376793. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11524973/pdf/
Citations: 0
Abstract
Identification of immune microenvironment subtypes and clinical risk biomarkers for osteoarthritis based on a machine learning model.
Background: Osteoarthritis (OA) is a degenerative disease with a high incidence worldwide. Most affected patients show no obvious symptoms or imaging findings until OA has progressed to irreversible destruction of articular cartilage and bone. Developing new diagnostic biomarkers that reflect articular cartilage injury is therefore crucial for the early diagnosis of OA. This study aims to explore biomarkers related to the immune microenvironment of OA, providing a new research direction for the early diagnosis of OA and the identification of its risk factors.
Methods: We screened and downloaded relevant datasets from the Gene Expression Omnibus (GEO) database. Immune microenvironment-related differentially expressed genes (Imr-DEGs) were identified by combining the ImmPort data set with weighted gene co-expression network analysis (WGCNA). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) functional enrichment analyses were conducted to characterize the Imr-DEGs. A random forest model was constructed to select the characteristic genes of OA, their diagnostic value was assessed with receiver operating characteristic (ROC) curves, and external datasets were used to verify diagnostic ability. Immune subtypes of OA were identified by unsupervised clustering, and the functions of these subtypes were analyzed by gene set variation analysis (GSVA). The Drug-Gene Interaction Database was used to explore the relationship between characteristic genes and drugs.
Results: Single-sample gene set enrichment analysis (ssGSEA) showed that 16 of the 28 immune cell subsets in the dataset differed significantly between the OA and normal groups. WGCNA identified 26 Imr-DEGs, and functional enrichment showed that they were related to the immune response. The random forest model yielded nine characteristic genes: BLNK (AUC = 0.809), CCL18 (AUC = 0.692), CD74 (AUC = 0.794), CSF1R (AUC = 0.835), RAC2 (AUC = 0.792), INSR (AUC = 0.765), IL11 (AUC = 0.662), IL18 (AUC = 0.699), and TLR7 (AUC = 0.807). A nomogram was constructed to predict the occurrence and development of OA, and the calibration curve confirmed the accuracy of these nine genes in OA diagnosis.
Conclusion: This study identified characteristic genes related to the immune microenvironment in OA, providing new insight into the risk factors of OA.
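As a reading aid for the per-gene AUC values in the Results, the following is a minimal Python sketch of how the area under a ROC curve can be computed for a single gene. It is not the authors' pipeline. The function name `roc_auc`, the expression values, and the group labels are all hypothetical and chosen only for illustration. The sketch uses the rank-based definition: the AUC is the fraction of (OA, normal) sample pairs in which the OA sample has the higher expression value, with ties counted as half.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the ROC AUC of `scores` for binary `labels` (1 = OA, 0 = normal).

    Uses the pairwise (Mann-Whitney) definition: the probability that a
    randomly chosen positive sample scores higher than a randomly chosen
    negative sample, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample from each group")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression values for one gene across eight samples
# (illustrative numbers only, not data from the study).
expression = [2.1, 3.4, 3.9, 4.2, 1.8, 3.6, 2.0, 2.2]
is_oa = [0, 1, 1, 1, 0, 0, 1, 0]

auc = roc_auc(expression, is_oa)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 means the gene does not separate the two groups at all, and 1.0 means perfect separation. A gene that is lower in OA than in normal samples gives a value below 0.5 under this definition, so its sign convention would have to be flipped before comparing it with the values reported above.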
About the journal:
Much of contemporary investigation in the life sciences is devoted to the molecular-scale understanding of the relationships between genes and the environment — in particular, dynamic alterations in the levels, modifications, and interactions of cellular effectors, including proteins. Frontiers in Molecular Biosciences offers an international publication platform for basic as well as applied research; we encourage contributions spanning both established and emerging areas of biology. To this end, the journal draws from empirical disciplines such as structural biology, enzymology, biochemistry, and biophysics, capitalizing as well on the technological advancements that have enabled metabolomics and proteomics measurements in massively parallel throughput, and the development of robust and innovative computational biology strategies. We also recognize influences from medicine and technology, welcoming studies in molecular genetics, molecular diagnostics and therapeutics, and nanotechnology.
Our ultimate objective is the comprehensive illustration of the molecular mechanisms regulating proteins, nucleic acids, carbohydrates, lipids, and small metabolites in organisms across all branches of life.
In addition to interesting new findings, techniques, and applications, Frontiers in Molecular Biosciences will consider new testable hypotheses to inspire different perspectives and stimulate scientific dialogue. The integration of in silico, in vitro, and in vivo approaches will benefit endeavors across all domains of the life sciences.