hdWGCNA and Cellular Communication Identify Active NK Cell Subtypes in Alzheimer's Disease and Screen for Diagnostic Markers through Machine Learning
Guobin Song, Haoyang Wu, Haiqing Chen, Shengke Zhang, Qingwen Hu, Haotian Lai, Claire Fuller, Guanhu Yang, Hao Chi
Current Alzheimer Research, 2024, pp. 120-140
DOI: 10.2174/0115672050314171240527064514 (https://doi.org/10.2174/0115672050314171240527064514)
Abstract
Background: Alzheimer's disease (AD) is a complex and severe neurodegenerative disorder that poses a significant challenge to global health. Its hallmark pathological features include the deposition of β-amyloid plaques and the formation of neurofibrillary tangles. An early and accurate biomarker model for AD diagnosis, built with machine learning and bioinformatics analysis, is therefore needed.
Methods: Single-cell data analysis was used to identify cell subtypes that differed significantly between the disease and control groups. After NK cells were identified, hdWGCNA and cell-cell communication analysis were performed to pinpoint the NK cell subset with the strongest communication effects. Three machine learning algorithms (LASSO, Random Forest, and SVM-RFE) were then used jointly to screen the module genes of this NK cell subset for those highly associated with AD. A logistic regression diagnostic model was built from the resulting feature genes, and a protein-protein interaction (PPI) network of the model genes was constructed. Unsupervised cluster analysis was then used to classify AD subtypes based on the model genes, followed by analysis of immune infiltration in the different subtypes. Finally, Spearman correlation analysis was used to explore the correlation between the model genes and immune cells, as well as inflammatory factors.
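The joint screening step can be illustrated with a minimal sketch. The abstract does not say how the three algorithms were combined or which software was used, so everything here is an assumption: scikit-learn stands in for the original tooling, `LassoCV` stands in for the LASSO step, "jointly" is read as taking the intersection of the three selections, and `joint_screen`, `top_k`, and the synthetic matrix are hypothetical names and data, not the study's.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def joint_screen(X, y, top_k=5, seed=0):
    """Return sorted indices of features chosen by all three selectors.

    Hypothetical helper: the intersection rule and top_k are assumptions.
    """
    X = StandardScaler().fit_transform(X)

    # LASSO: keep features whose coefficient is not shrunk to zero.
    lasso = LassoCV(cv=5, random_state=seed).fit(X, y)
    lasso_idx = set(np.flatnonzero(lasso.coef_))

    # Random Forest: keep the top_k features by impurity-based importance.
    rf = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X, y)
    rf_idx = set(np.argsort(rf.feature_importances_)[::-1][:top_k])

    # SVM-RFE: recursively drop the weakest feature of a linear SVM.
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=top_k).fit(X, y)
    rfe_idx = set(np.flatnonzero(rfe.support_))

    return sorted(int(i) for i in lasso_idx & rf_idx & rfe_idx)


# Synthetic stand-in for an expression matrix (samples x genes); not real data.
X, y = make_classification(n_samples=200, n_features=20, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)
selected = joint_screen(X, y)

# Fit a logistic regression on the jointly selected features, if any survive.
if selected:
    model = LogisticRegression().fit(X[:, selected], y)
    train_acc = model.score(X[:, selected], y)
```

Note that `train_acc` is a training-set accuracy on toy data; a real diagnostic model would need held-out or external validation, which this sketch does not attempt.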
Results: We identified three genes (RPLP2, RPSA, and RPL18A) that are highly associated with AD. The nomogram based on these genes provides practical assistance in diagnosing AD and predicting patient outcomes. The interacting genes identified through the PPI network are linked to ribosome metabolism and the COVID-19 pathway. Based on the expression of the module genes, unsupervised cluster analysis revealed three distinct AD subtypes. Subtype C3 is particularly noteworthy: it is characterized by high expression, which correlates with immune cell infiltration and elevated levels of inflammatory factors. This suggests that the immune environment in AD patients is closely tied to heightened expression of the model genes.
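The Spearman correlation analysis used to relate model genes to immune cells can be sketched as follows. This is only an illustration under assumptions: `scipy.stats.spearmanr` stands in for whatever software the authors used, and the two arrays are made-up toy values, not measurements from the study.

```python
import numpy as np
from scipy.stats import spearmanr

# Toy values (hypothetical): one gene's expression and one immune-cell
# infiltration score across five samples.
gene_expression = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
immune_score = np.array([1.0, 4.0, 9.0, 16.0, 25.0])

# Spearman's rho is the Pearson correlation of the ranks, so any strictly
# increasing relationship (here, squaring) gives rho = 1.
rho, p_value = spearmanr(gene_expression, immune_score)
```

Because the coefficient depends only on ranks, it captures monotonic associations without assuming linearity, which suits expression and infiltration scores that may be on different scales.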
Conclusion: This study established a diagnostic model for AD patients and examined the pivotal role of the model genes in shaping the immune environment of individuals with AD. These findings offer insights into early AD diagnosis and patient management strategies.