Integrating machine learning with Mendelian randomization to unveil causal gene networks in glioblastoma multiforme.

IF 2.8 | Zone 4 (Medicine) | Q3 ENDOCRINOLOGY & METABOLISM | Discover. Oncology | Pub Date: 2025-01-13 | DOI: 10.1007/s12672-025-01792-0
Lixin Du, Pan Wang, Xiaoting Qiu, Zhigang Li, Jianlan Ma, Pengfei Chen
Discover. Oncology 16(1): 38. Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11730047/pdf/
Citations: 0

Integrating machine learning with mendelian randomization for unveiling causal gene networks in glioblastoma multiforme.

Background: Glioblastoma multiforme (GBM) is a highly aggressive brain cancer with poor prognosis and limited treatment options. Despite advances in understanding its molecular mechanisms, effective therapeutic strategies remain elusive due to the tumor's genetic complexity and heterogeneity.

Methods: This study employed a comprehensive analysis approach integrating 113 machine learning algorithms with Mendelian Randomization (MR) analysis to investigate the molecular underpinnings of GBM. Five publicly available gene expression datasets were analyzed to identify differentially expressed genes (DEGs) associated with GBM. Weighted Gene Co-expression Network Analysis (WGCNA) was used to identify GBM-related gene modules. Further, gene set enrichment and variation analyses were conducted to explore the biological pathways involved. The machine learning models were evaluated using Receiver Operating Characteristic (ROC) curves and confusion matrices to assess their predictive accuracy, with the best-performing model validated across external datasets. MR analysis was performed to establish causal relationships between genetically predicted gene expression levels and GBM outcomes.
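The evaluation step described in the Methods (ROC curves and confusion matrices) can be illustrated with a minimal sketch. The abstract does not give the authors' code, data, or decision threshold, so the function names, the toy labels and scores, and the 0.5 cutoff below are all assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive sample scores higher than a randomly chosen negative one
    (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def confusion_matrix(labels, scores, threshold=0.5):
    """Count true/false positives and negatives at a fixed score threshold.
    The 0.5 default is a hypothetical choice, not taken from the paper."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            counts["tp"] += 1
        elif predicted == 1 and y == 0:
            counts["fp"] += 1
        elif predicted == 0 and y == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(counts):
    """Fraction of samples classified correctly."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


# Toy data (hypothetical): 1 = tumor sample, 0 = normal sample.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

auc = roc_auc(toy_labels, toy_scores)
cm = confusion_matrix(toy_labels, toy_scores)
acc = accuracy(cm)
print(auc, cm, acc)
```

The rank-based AUC used here is equivalent to the area under the empirical ROC curve and avoids any external dependency; in practice a library routine would normally be used instead.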

Results: The study identified 286 DEGs between GBM and adjacent normal tissues across five datasets. WGCNA highlighted the yellow module as the most relevant to GBM, containing key genes such as KLHL3, FOXO4, and MAP1A. Of the 113 machine learning models tested, Ridge regression achieved the highest area under the curve (AUC) of 0.92, demonstrating robust predictive accuracy. Validation using external datasets confirmed the model's reliability, with a classification accuracy of 89.5% in the training set and 85.3% in the validation sets. MR analysis provided strong evidence of a causal relationship between the expression levels of the identified genes and GBM risk.
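The abstract does not say which MR estimator was used. As a hedged illustration of how an MR causal estimate is typically formed from summary statistics, the sketch below implements the standard fixed-effect inverse-variance weighted (IVW) estimator. The function name and the per-variant effect sizes are hypothetical and are not taken from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    beta_exposure: per-variant effects on the exposure (e.g. gene expression)
    beta_outcome:  per-variant effects on the outcome (e.g. disease risk)
    se_outcome:    standard errors of the outcome effects
    Returns (causal_estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    if denominator == 0:
        raise ValueError("instruments have no effect on the exposure")
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summary statistics for three genetic variants.
# Each per-variant Wald ratio (beta_outcome / beta_exposure) equals 0.5.
toy_bx = [0.1, 0.2, 0.4]
toy_by = [0.05, 0.1, 0.2]
toy_se = [0.01, 0.01, 0.01]

estimate, std_error = ivw_estimate(toy_bx, toy_by, toy_se)
print(estimate, std_error)
```

Because every toy variant has the same Wald ratio, the pooled estimate equals that ratio; with real data the variants disagree, and sensitivity analyses for pleiotropy are usually run alongside IVW.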

Conclusions: This study demonstrates the power of combining machine learning and Mendelian Randomization to uncover novel genetic markers for GBM. The identified genes offer promising potential as biomarkers for GBM diagnosis and therapy, providing new avenues for personalized treatment strategies.

Source journal: Discover. Oncology (Medicine: Endocrinology, Diabetes and Metabolism)
CiteScore: 2.40
Self-citation rate: 9.10%
Articles published: 122
Review time: 5 weeks