Mingkuan Su, Haiying Wu, Hongbin Chen, Jianfeng Guo, Zongyun Chen, Jie Qiu, Jiancheng Huang
Early prediction of sepsis-induced respiratory tract infection using a biomarker-based machine-learning algorithm.
Scandinavian Journal of Clinical & Laboratory Investigation, pages 202-210. Published 2024-05-01 (Epub 2024-04-29). Journal Article.
DOI: 10.1080/00365513.2024.2346914 (https://doi.org/10.1080/00365513.2024.2346914)
Impact factor: 1.3. JCR: Q4 (Medicine, Research & Experimental). Open access: no.
Citations: 0
Abstract
Early and differential diagnosis of sepsis is essential to avoid unnecessary antibiotic use and to further reduce patient morbidity and mortality. Here, we aimed to identify predictors of sepsis and to develop a machine-learning strategy to predict sepsis-induced respiratory tract infection (RTI). Patients with sepsis and RTI were selected via retrospective analysis, and essential population characteristics and laboratory parameters were recorded. To improve the performance of the primary model and avoid over-fitting, a recursive feature elimination with cross-validation (RFECV) strategy was used to screen the optimal subset of biomarkers, and nine machine-learning models were constructed based on this subset; the models were evaluated by average accuracy, precision, recall, and F1-score. We identified 430 patients with sepsis and 686 patients with RTI. A total of 39 features were collected, of which 23 were used for initial model construction. Using the RFECV algorithm, we found that the XGBoost classifier, which needed only seven biomarkers, demonstrated the best performance among all prediction models, with an average accuracy of 89.24 ± 2.28%, while the Ridge classifier, which included 11 biomarkers, had an average accuracy of only 83.87 ± 4.69%. The remaining models had prediction accuracies greater than 88%. We developed nine models for predicting sepsis using a strategy that combined RFECV with machine learning. Among these models, the XGBoost classifier, which included seven biomarkers, showed the best performance and highest accuracy for predicting sepsis and may be a promising tool for the timely identification of sepsis.
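The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It computes accuracy, precision, recall, and F1-score per cross-validation fold and then reports each as mean ± spread in percent. The function names, the toy labels, the choice of sepsis as the positive class (1), and the use of the sample standard deviation for the "±" term are all assumptions made for illustration; the abstract does not state how the spread was computed, and none of the numbers below come from the study.

```python
from statistics import mean, stdev


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one fold (as fractions)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def summarize(folds):
    """Mean and sample SD of each metric over folds, in percent."""
    per_fold = [binary_metrics(t, p) for t, p in folds]
    summary = {}
    for name in per_fold[0]:
        values = [100 * m[name] for m in per_fold]
        summary[name] = (mean(values), stdev(values))
    return summary


# Hypothetical toy folds: (true labels, predicted labels); 1 = sepsis, 0 = RTI.
toy_folds = [
    ([1, 1, 0, 0, 0], [1, 0, 0, 0, 0]),
    ([1, 1, 0, 0, 0], [1, 1, 1, 0, 0]),
    ([1, 1, 0, 0, 0], [1, 1, 0, 0, 0]),
]

for name, (avg, sd) in summarize(toy_folds).items():
    print(f"{name}: {avg:.2f} ± {sd:.2f}")
```

Reporting the spread alongside the mean matters when comparing models: in the abstract, the Ridge classifier has both a lower mean accuracy and a wider spread than XGBoost, which a single averaged number would hide.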
About the journal:
The Scandinavian Journal of Clinical and Laboratory Investigation is an international scientific journal covering clinically oriented biochemical and physiological research. Since the launch of the journal in 1949, it has been a forum for international laboratory medicine, closely related to, and edited by, The Scandinavian Society for Clinical Chemistry.
The journal contains peer-reviewed articles, editorials, invited reviews, and short technical notes, as well as several supplements each year. Supplements consist of monographs, and symposium and congress reports covering subjects within clinical chemistry and clinical physiology.