Explainable and visualizable machine learning models to predict biochemical recurrence of prostate cancer

Wenhao Lu, Lin Zhao, Shenfan Wang, Huiyong Zhang, Kangxian Jiang, Jin Ji, Shaohua Chen, Chengbang Wang, Chunmeng Wei, Rongbin Zhou, Zuheng Wang, Xiao Li, Fubo Wang, Xuedong Wei, Wenlei Hou
Clinical and Translational Oncology, published 2024-04-11. DOI: 10.1007/s12094-024-03480-x (https://doi.org/10.1007/s12094-024-03480-x)


Abstract

Purpose

Machine learning (ML) models have shown excellent performance in prognosis prediction. However, their black-box nature has limited their clinical application. Here, we aimed to establish explainable and visualizable ML models to predict biochemical recurrence (BCR) of prostate cancer (PCa).

Materials and methods

A total of 647 PCa patients were retrospectively evaluated. BCR-related clinical parameters were identified using LASSO regression. The cohort was then split into training and validation datasets at a ratio of 0.75:0.25, and the BCR-related features were entered into Cox regression and five ML algorithms to construct BCR prediction models. The clinical utility of each model was evaluated by its concordance index (C-index) and by decision curve analysis (DCA). In addition, Shapley Additive Explanation (SHAP) values were used to explain the features in the models.
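The split and the C-index mentioned above can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the authors' pipeline: the function names `split_cohort` and `concordance_index`, the random seed, the floor rounding of the training size, and the five-patient example are all assumptions. The C-index shown is Harrell's estimator, with the simplification that pairs with tied follow-up times are skipped.

```python
import random
from itertools import combinations


def split_cohort(patient_ids, train_fraction=0.75, seed=0):
    """Shuffle patient IDs and split them into training and validation lists.

    The seed and the floor rounding are assumptions made for this sketch.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair is comparable when the patient with the shorter follow-up had an
    observed event. The pair is concordant when that patient also has the
    higher predicted risk; tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Simplification: skip tied times and pairs whose earlier time is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical five-patient example (months to BCR or censoring, event flag, risk).
example_times = [5, 10, 12, 20, 25]
example_events = [1, 1, 0, 1, 0]
example_risk = [0.9, 0.4, 0.6, 0.5, 0.1]
print(concordance_index(example_times, example_events, example_risk))  # 0.75

train_ids, valid_ids = split_cohort(range(647))
print(len(train_ids), len(valid_ids))  # 485 162 under the floor-rounding assumption
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the models here are compared. The 485/162 counts follow only from the rounding assumed in this sketch; the abstract does not state the actual group sizes.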

Results

We identified 11 BCR-related features using LASSO regression and then established five ML-based models, namely random survival forest (RSF), survival support vector machine (SSVM), survival tree (sTree), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBoost), together with a Cox regression model. The C-index values for RSF, SSVM, sTree, GBDT, XGBoost, and Cox regression were 0.846 (95% CI 0.796–0.894), 0.774 (95% CI 0.712–0.834), 0.757 (95% CI 0.694–0.818), 0.820 (95% CI 0.765–0.869), 0.793 (95% CI 0.735–0.852), and 0.807 (95% CI 0.753–0.858), respectively. The DCA showed that the RSF model had significant advantages over all the other models. Regarding interpretability, the SHAP values demonstrated the tangible contribution of each feature in the RSF model.
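To make the idea of a per-feature SHAP contribution concrete, the following sketch computes exact Shapley values for a tiny toy model by enumerating all feature subsets. Everything in it is hypothetical: the feature names `f1`, `f2`, `f3`, the `toy_risk` function, and the all-zero baseline are illustrative stand-ins, not the paper's 11 features or its RSF model. It also replaces absent features with a single baseline value, a simplification of the background-data averaging that SHAP libraries typically perform.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x`, relative to `baseline`.

    Features inside a coalition take the patient's value; all others take the
    baseline value. The cost grows as 2**n, so this only suits toy examples.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        inputs = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(inputs)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        phi[name] = total
    return phi


def toy_risk(inputs):
    """Hypothetical risk score with one interaction term (not the paper's model)."""
    return 2.0 * inputs["f1"] + 1.0 * inputs["f2"] + 0.5 * inputs["f1"] * inputs["f3"]


patient = {"f1": 1.0, "f2": 2.0, "f3": 4.0}
baseline = {"f1": 0.0, "f2": 0.0, "f3": 0.0}
phi = shapley_values(toy_risk, patient, baseline)
for feature, contribution in phi.items():
    print(feature, round(contribution, 6))
```

In this toy case the contributions come out near 3, 2, and 1 for `f1`, `f2`, and `f3`: the interaction term is shared equally between `f1` and `f3`, and the three values sum to the difference between the patient's score and the baseline score. That additivity is what lets SHAP plots attribute a prediction to individual features.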

Conclusions

Our scoring system provides a reference for identifying BCR and a framework for making personalized therapeutic decisions for PCa.
