Development and multicenter validation of machine learning models for predicting postoperative pulmonary complications after neurosurgery.

IF 7.5; Zone 3 (Medicine); JCR Q1 (MEDICINE, GENERAL & INTERNAL); Chinese Medical Journal; Pub Date: 2025-02-13; DOI: 10.1097/CM9.0000000000003433

Ming Xu, Wenhao Zhu, Siyu Hou, Hongzhi Xu, Jingwen Xia, Liyu Lin, Hao Fu, Mingyu You, Jiafeng Wang, Zhi Xie, Xiaohong Wen, Yingwei Wang

Citation count: 0

Abstract

Background: Postoperative pulmonary complications (PPCs) are major adverse events in neurosurgical patients. This study aimed to develop and validate machine learning models predicting PPCs after neurosurgery.

Methods: PPCs were defined according to the European Perioperative Clinical Outcome standards as occurring within 7 postoperative days. Data from cases meeting the inclusion/exclusion criteria were extracted from the anesthesia information management system to create three datasets: the development dataset (Huashan Hospital, Fudan University, 2018 to 2020), the temporal validation dataset (Huashan Hospital, Fudan University, 2021) and the external validation dataset (three other hospitals, 2023). Machine learning models of six algorithms were trained using either 35 retrievable and plausible features or the 11 features selected by Lasso regression. Temporal validation was conducted for all models, and the 11-feature models were also externally validated. Independent risk factors were identified and feature importance in the top models was analyzed.
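
The three-way split described in the Methods can be sketched as a simple rule on centre and year of surgery. This is a minimal illustration only, not the authors' extraction code: the function name `assign_dataset`, the hospital labels, and the example records are hypothetical, and the real pipeline also applied inclusion/exclusion criteria that the abstract does not spell out.

```python
from collections import Counter
from typing import Optional

# Development/temporal-validation centre named in the abstract.
# The label string itself is a hypothetical encoding.
HOME_CENTER = "Huashan"


def assign_dataset(hospital: str, year: int) -> Optional[str]:
    """Return the dataset a case belongs to, or None if it falls outside all three."""
    if hospital == HOME_CENTER:
        if 2018 <= year <= 2020:
            return "development"
        if year == 2021:
            return "temporal_validation"
        return None
    # Any other centre contributes only its 2023 cases.
    if year == 2023:
        return "external_validation"
    return None


# Hypothetical records: (hospital label, year of surgery).
cases = [
    ("Huashan", 2018), ("Huashan", 2020), ("Huashan", 2021),
    ("Huashan", 2022), ("Hospital B", 2023), ("Hospital C", 2021),
]
counts = Counter(assign_dataset(h, y) for h, y in cases)
```

Splitting by calendar period within one centre tests whether performance holds over time, while the separate-centre split tests transportability to other hospitals.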

Results: PPCs occurred in 712 of 7533 (9.5%), 258 of 2824 (9.1%), and 207 of 2300 (9.0%) patients in the development, temporal validation and external validation datasets, respectively. During cross-validation training, all models except Bayes demonstrated good discrimination with an area under the receiver operating characteristic curve (AUC) of 0.84. In temporal validation of full-feature models, the deep neural network (DNN) performed best with an AUC of 0.835 (95% confidence interval [CI]: 0.805-0.858) and a Brier score of 0.069, followed by logistic regression (LR), random forest and XGBoost. The 11-feature models performed comparably to the full-feature models, with very close but statistically lower AUCs, and DNN and LR were again the top models in both temporal and external validation. An 11-feature nomogram was drawn based on the LR algorithm, and it outperformed the minimally modified Assess respiratory RIsk in Surgical patients in CATalonia (ARISCAT) and Local ASsessment of VEntilatory management during General Anesthesia for Surgery (LAS VEGAS) scores with a higher AUC (LR: 0.824, ARISCAT: 0.672, LAS VEGAS: 0.663). Independent risk factors based on multivariate LR mostly overlapped with the Lasso-selected features, but were not consistent with the features ranked as important by the Shapley additive explanation (SHAP) method for the LR model.
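
The two headline metrics reported above, AUC with a 95% CI and the Brier score, can be computed as in the following sketch. It is a minimal pure-Python illustration on synthetic data, assuming a percentile bootstrap for the interval; the abstract does not state how the authors derived their CIs, and the function names and simulated predictions are hypothetical.

```python
import random


def auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def bootstrap_auc_ci(y_true, y_score, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for AUC to exist
            stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic cohort with roughly 10% event rate (not study data).
rng = random.Random(42)
y = [1 if rng.random() < 0.1 else 0 for _ in range(300)]
p = [min(1.0, max(0.0, rng.gauss(0.35 if yi else 0.08, 0.1))) for yi in y]

point_auc = auc(y, p)
ci_low, ci_high = bootstrap_auc_ci(y, p)
brier_score = brier(y, p)

# Incidence figures quoted in the Results, recomputed from the raw counts.
incidence = {name: round(100 * events / total, 1)
             for name, (events, total) in {
                 "development": (712, 7533),
                 "temporal_validation": (258, 2824),
                 "external_validation": (207, 2300),
             }.items()}
```

AUC measures discrimination only, which is why a calibration-sensitive measure such as the Brier score is reported alongside it; lower Brier scores are better, and at an event rate near 9% even an uninformative constant prediction scores about 0.08.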

Conclusions: The developed models, especially the DNN model and the nomogram, had good discrimination and calibration, and could be used for predicting PPCs in neurosurgical patients. The establishment of machine learning models and the ascertainment of risk factors might assist clinical decision support for improving surgical outcomes.

Trial registration: ChiCTR 2100047474; https://www.chictr.org.cn/showproj.html?proj=128279.

Source journal

Chinese Medical Journal (category: Medicine, General & Internal)

CiteScore: 9.80

Self-citation rate: 4.90%

Articles published: 19245

Review time: 6 months

About the journal: The Chinese Medical Journal (CMJ) is published semimonthly in English by the Chinese Medical Association and is a peer-reviewed general medical journal for all doctors, researchers, and health workers regardless of their medical specialty or type of employment. Established in 1887, it is the oldest medical periodical in China and is distributed worldwide. The journal functions as a window into China’s medical sciences, reflects advances in China’s medical sciences and technology, and serves the objective of international academic exchange. The journal includes Original Articles, Editorial, Review Articles, Medical Progress, Brief Reports, Case Reports, Viewpoint, Clinical Exchange, Letter, and News. CMJ is abstracted or indexed in many databases including Biological Abstracts, Chemical Abstracts, Index Medicus/Medline, Science Citation Index (SCI), Current Contents, Cancerlit, Health Plan & Administration, Embase, Social Scisearch, Aidsline, Toxline, Biocommercial Abstracts, Arts and Humanities Search, Nuclear Science Abstracts, Water Resources Abstracts, Cab Abstracts, Occupation Safety & Health, etc. In 2007, the journal's SCI impact factor was 0.636 and its total citation count was 2315.