Development and external validation of a machine learning-based model to predict postoperative recurrence in patients with duodenal adenocarcinoma: a multicenter, retrospective cohort study.
Authors: Xu Liu, Qifeng Xiao, Zongting Gu, Xin Wu, Chunhui Yuan, Xiaolong Tang, Fanbin Meng, Dong Wang, Ren Lang, Kaiqing Guo, Xiaodong Tian, Yu Zhang, Enhong Zhao, Zhenzhou Wu, Jingyong Xu, Ying Xing, Feng Cao, Chengfeng Wang, Jianwei Zhang
Journal: BMC Medicine, volume 23, issue 1, article 98 (published 2025-02-21)
DOI: https://doi.org/10.1186/s12916-025-03912-7
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11846245/pdf/
Citations: 0
Abstract
Background: Duodenal adenocarcinoma (DA) has a high recurrence rate, making the prediction of recurrence after surgery critically important.
Methods: We aimed to develop a machine learning-based model to predict postoperative recurrence of DA. We conducted a multicenter, retrospective cohort study in China that included 1830 patients with DA who underwent radical surgery between 2012 and 2023. Wrapper methods based on ten machine learning learners were used to select the optimal predictors, and the same ten learners were then used for model development. Model performance was validated in three separate cohorts and assessed by the concordance index (C-index), time-dependent calibration curves, time-dependent receiver operating characteristic curves, and decision curve analysis.
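As an illustration of the headline metric, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not necessarily the estimator the authors used; the function name and arguments are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index (hypothetical helper, not the paper's code).

    times:       follow-up time for each patient
    events:      1 if recurrence was observed, 0 if the patient was censored
    risk_scores: model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the shorter
        # follow-up ended in an observed recurrence (not censoring).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier recurrence was given the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented for illustration only).
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 0, 1]
perfect = harrell_c_index(toy_times, toy_events, [4, 3, 2, 1])
uninformative = harrell_c_index(toy_times, toy_events, [1, 1, 1, 1])
```

A value of 1.0 means every comparable pair is ranked correctly, and 0.5 corresponds to uninformative predictions, which gives a scale for reading the reported C-index values.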
Results: Predictor selection yielded ten feature subsets. Pairing each feature subset with each of the ten machine learning learners produced 100 predictive models, of which the Penalized Regression + Accelerated Oblique Random Survival Forest model (PAM) showed the best predictive performance. The C-index for PAM was 0.882 (95% CI 0.860-0.886) in the training cohort, 0.747 (95% CI 0.683-0.798) in validation cohort 1, 0.736 (95% CI 0.649-0.792) in validation cohort 2, and 0.734 (95% CI 0.674-0.791) in validation cohort 3. A publicly accessible web tool was developed for the PAM.
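The count of 100 models follows from pairing every feature subset with every learner (10 × 10). A minimal sketch of that enumeration is given below. It assumes, as the model's name suggests, that the part before "+" denotes the learner whose wrapper selected the features and the part after it the learner that fits the model. Only the two learner names that appear in the abstract are real; the remaining eight are placeholders.

```python
from itertools import product


def build_model_grid(selectors, learners):
    """Pair every selector-derived feature subset with every learner."""
    return [f"{selector} + {learner}" for selector, learner in product(selectors, learners)]


# Only the first two names come from the abstract; the rest are placeholders.
learner_names = [
    "Penalized Regression",
    "Accelerated Oblique Random Survival Forest",
] + [f"learner_{k}" for k in range(3, 11)]

# Each learner serves once as feature selector and once as model fitter.
model_grid = build_model_grid(learner_names, learner_names)
print(len(model_grid))
```

In a full pipeline each grid entry would be fitted on the training cohort and ranked by a metric such as the C-index; that step is omitted here because the abstract does not describe it in detail.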
Conclusions: The PAM has the potential to predict postoperative recurrence in patients with DA. It may help clinicians assess disease severity, plan patient follow-up, and formulate adjuvant treatment strategies.
About the journal:
BMC Medicine is an open access, transparent peer-reviewed general medical journal. It is the flagship journal of the BMC series and publishes outstanding and influential research in various areas including clinical practice, translational medicine, medical and health advances, public health, global health, policy, and general topics of interest to the biomedical and sociomedical professional communities. In addition to research articles, the journal also publishes stimulating debates, reviews, unique forum articles, and concise tutorials. All articles published in BMC Medicine are included in various databases such as Biological Abstracts, BIOSIS, CAS, Citebase, Current contents, DOAJ, Embase, MEDLINE, PubMed, Science Citation Index Expanded, OAIster, SCImago, Scopus, SOCOLAR, and Zetoc.