{"title":"Exploring prognostic factors on vascular outcomes among maintenance dialysis patients and establishing a prognosis prediction model using machine learning methods.","authors":"Chung-Kuan Wu, Zih-Kai Kao, Vy-Khanh Nguyen, Noi Yar, Ming-Tsang Chuang, Tzu-Hao Chang","doi":"10.1186/s12911-025-03302-2","DOIUrl":"10.1186/s12911-025-03302-2","url":null,"abstract":"","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":" ","pages":"6"},"PeriodicalIF":3.8,"publicationDate":"2025-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12797654/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145687056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simultaneous prediction of early and delayed mortality in burn patients: a comparative machine learning analysis of feature importance in a single-center retrospective study.","authors":"Mehran Motamedi, Najibeh Mohseni Moallemkolaei, Mohammadhossein Hesamirostami, Mojtaba Ghorbani, Leila Shokrizadeh Arani","doi":"10.1186/s12911-025-03311-1","DOIUrl":"10.1186/s12911-025-03311-1","url":null,"abstract":"","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":" ","pages":"7"},"PeriodicalIF":3.8,"publicationDate":"2025-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12797782/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145687053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-04; DOI: 10.1186/s12911-025-03291-2
Nikitha Karkera, Samik Ghosh, Germaine Escames, Sucheendra K Palaniappan
{"title":"MelAnalyze: fact-checking melatonin claims using large language models and natural language inference.","authors":"Nikitha Karkera, Samik Ghosh, Germaine Escames, Sucheendra K Palaniappan","doi":"10.1186/s12911-025-03291-2","DOIUrl":"10.1186/s12911-025-03291-2","url":null,"abstract":"","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":" ","pages":"39"},"PeriodicalIF":3.8,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12875012/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145676570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-04; DOI: 10.1186/s12911-025-03270-7
Claudia Robbiati, Maria Elena Tosti, Joaquim Tomas, Giulia Natali, Luca De Simeis, Nsuka Da Silva, Florentino Ferraz Joaquim, Daniel Tulomba, Neusa Lazary, Janet Adão, Fabio Manenti, Maria Grazia Dente
{"title":"Digital health for Tuberculosis control: findings from the piloting of an electronic medical record in Luanda (Angola).","authors":"Claudia Robbiati, Maria Elena Tosti, Joaquim Tomas, Giulia Natali, Luca De Simeis, Nsuka Da Silva, Florentino Ferraz Joaquim, Daniel Tulomba, Neusa Lazary, Janet Adão, Fabio Manenti, Maria Grazia Dente","doi":"10.1186/s12911-025-03270-7","DOIUrl":"10.1186/s12911-025-03270-7","url":null,"abstract":"","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":" ","pages":"4"},"PeriodicalIF":3.8,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12781550/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145676567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-03; DOI: 10.1186/s12911-025-03278-z
Ping-Huang Tsai, Shang-Yang Lee, Chia-Ling Helen Wei, Yu-Juei Hsu, Chin Lin
Background: Chronic kidney disease (CKD) is a global health burden with low awareness among both patients and healthcare providers. Deep learning models (DLMs) have shown promise in interpreting electrocardiograms (ECGs) for various diseases and may offer new opportunities for early CKD detection.
Methods: We enrolled 66,587 outpatients with estimated glomerular filtration rate (eGFR) data from January 2010 to October 2020. A total of 72,618 ECGs from 49,632 patients were used to develop DLMs. Internal validation was performed on 16,955 nonoverlapping patients, and external validation involved 10,476 patients from a community hospital. The primary outcome was the detection of CKD, defined as eGFR < 60 mL/min/1.73 m². Secondary outcomes included all-cause mortality and major cardiovascular events.
Results: The DLM achieved an AUC of 0.885 and 0.861 in the internal and external validation sets, respectively. Patients flagged by the DLM as having CKD showed more clinical risk factors for CKD progression and cardiovascular disease. Among patients without baseline CKD, those with a positive DLM screen had a significantly higher risk of incident CKD (hazard ratios 2.14 and 1.38; 95% CIs: 1.76-2.60 and 1.09-1.74). DLM stratification also predicted adverse outcomes such as stroke, heart failure, and atrial fibrillation more effectively than eGFR classification alone.
Conclusion: An ECG-based deep learning model can help identify individuals at risk for CKD and its complications, even before laboratory abnormalities emerge. This approach may support early detection and risk stratification in clinical practice.
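The abstract above labels CKD as eGFR < 60 mL/min/1.73 m² and reports discrimination as an AUC. As a minimal sketch of what that metric measures (not the authors' pipeline), the block below derives labels from the eGFR threshold and computes the AUC as the probability that a randomly chosen CKD patient receives a higher model score than a randomly chosen non-CKD patient. The eGFR values, the scores, and the function name `roc_auc` are made up for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical patients: eGFR in mL/min/1.73 m^2 and an illustrative model score.
egfr = [95, 72, 58, 44, 88, 61, 30, 59]
scores = [0.10, 0.35, 0.80, 0.90, 0.20, 0.65, 0.95, 0.40]

# CKD label as defined in the abstract: eGFR below 60.
labels = [1 if e < 60 else 0 for e in egfr]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

The pairwise form is quadratic in the number of patients, so it suits small illustrations only; a rank-based implementation would be the usual choice for cohorts of the size reported here.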
Clinical trial number: Not applicable.
{"title":"ECG-based deep learning for chronic kidney disease detection and cardiovascular risk prediction.","authors":"Ping-Huang Tsai, Shang-Yang Lee, Chia-Ling Helen Wei, Yu-Juei Hsu, Chin Lin","doi":"10.1186/s12911-025-03278-z","DOIUrl":"10.1186/s12911-025-03278-z","url":null,"abstract":"","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"25 1","pages":"439"},"PeriodicalIF":3.8,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12676854/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145667286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-03; DOI: 10.1186/s12911-025-03275-2
Xue Bai, Jian Guo, Meng Zhang, Yi Wang, Naishi Li
Introduction: Epidemiological data on rare diseases (RDs) affect the accurate scientific assessment of these diseases and lead to many issues in policy-making, healthcare systems, and legislation. The coding system is crucial for accurately identifying and calculating the incidence rates of each RD. This study focuses on the effectiveness of collecting RD data via the ICD-11 and examines whether the ICD-11 can fully support RD statistics. The findings of this study should provide a foundation for replacing the ICD-10 with the ICD-11.
Methods: This study included 121 RDs from the first "Rare Disease Catalogue" in China. Two experts independently recoded the diseases in the ICD-11 MMS. A comparative analysis was conducted on the distributions of chapters, code types, and index terms in the ICD-10 and ICD-11 MMS.
Results: This study analysed 121 rare diseases (RDs) from China's first Rare Disease Catalogue. These RDs mapped to 204 ICD-10 codes (1.4% of all codes), including 76 (37.3%) non-index terms, and to 171 ICD-11 MMS codes (0.96% of all codes). The proportion of RD codes was significantly lower in ICD-11 than in ICD-10 (0.96% vs. 1.4%, P < 0.001), indicating greater dilution of RDs in ICD-11. All ICD-11 MMS codes were indexed (100% vs. 62.7% in ICD-10, P < 0.001), and 51 ICD-11 MMS codes (29.8%, P < 0.001) provided more detailed classifications. When using the ICD-11 to code RDs for subsequent statistical analyses, it is recommended that a network system of RD index terms be established in advance.
Conclusion: The ICD-11 can replace the ICD-10 for coding RDs. However, many RD terms do not have accurate codes and must be uniquely identified with URIs in the ICD-11. To ensure the reliability of RD-related data, establishing a local RD database for reporting data via the ICD-11 in China is essential.
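The percentages in the Results can be re-derived from the counts the abstract reports. The short check below is only an arithmetic sketch; the variable names and the helper `pct` are introduced here for illustration, and the counts are copied from the abstract.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts reported in the abstract.
icd10_codes = 204          # ICD-10 codes mapped from the 121 RDs
icd10_non_index = 76       # of which non-index terms
icd11_codes = 171          # ICD-11 MMS codes mapped from the same RDs
icd11_more_detailed = 51   # ICD-11 MMS codes with more detailed classification

non_index_share = pct(icd10_non_index, icd10_codes)
indexed_share = pct(icd10_codes - icd10_non_index, icd10_codes)
detailed_share = pct(icd11_more_detailed, icd11_codes)

print(non_index_share, indexed_share, detailed_share)
```

The three values agree with the 37.3%, 62.7%, and 29.8% quoted in the Results.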
{"title":"Quantifying coding integrity and reliability of ICD-11 MMS for rare disease registration: a case study of the Chinese rare disease catalogue.","authors":"Xue Bai, Jian Guo, Meng Zhang, Yi Wang, Naishi Li","doi":"10.1186/s12911-025-03275-2","DOIUrl":"10.1186/s12911-025-03275-2","url":null,"abstract":"","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"25 1","pages":"440"},"PeriodicalIF":3.8,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12676748/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145667256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-02; DOI: 10.1186/s12911-025-03271-6
Thomas Hadler, Leonhard Grassow, Johanna Kuhnt, Richard Hickstein, Hadil Saad, Maximilian Fenski, Jan Gröschel, Ralf-Felix Trauzeddel, Edyta Blaszczyk, Clemens Ammann, Darian Viezzer, Anja Hennemuth, Steffen Lange, Jeanette Schulz-Menger
Background: Cardiovascular magnetic resonance (CMR) offers state-of-the-art volume, function, fibrosis and oedema imaging. Quality assurance (QA) tasks, such as quantitative parameter reproducibility assessments, the evaluation of artificial intelligence (AI) methods, and the assessment of trainees, have become essential to CMR. However, how qualitative differences explain quantitative differences remains underexplored. Our aim is to demonstrate the applicability of a semi-automated QA tool, Lazy Luna (LL), to typical CMR QA application cases.
Methods: A software feature, error-tracing, was designed to quickly pinpoint qualitative reasons for quantitative differences and outliers. Three QA application cases were designed. First, LL was applied to perform outlier detection for inter- and intraobserver analyses to detect failure cases and provide qualitative explanations. Outlier detection was performed on several typical image types. Second, LL supported an AI evaluation, in which an AI method was compared to a CMR expert on 144 patients. LL assessed the acceptability of AI biases for left and right ventricular (LV, RV) end-systolic, end-diastolic, and stroke volumes (ESV, EDV, SV), ejection fractions (EF) and the myocardial mass (LVM). Annotations were examined to explain the qualitative differences that resulted in good and poor parameters. The AI investigation was recorded as a video. Third, LL was used to provide trainee feedback to a CMR beginner. The trainee was compared to an expert on several imaging techniques to investigate outliers.
Results: For the outlier detection, LL detected segmentation differences that caused parameter differences on multiple sequences. For the AI evaluation, LL calculated the clinical parameter biases to be: LVESV: -3.1 ml, LVEDV: 2.1 ml, LVSV: 6.5 ml, LVEF: 3.0 ml, RVESV: 0.3 ml, RVEDV: -3.8 ml, RVSV: -4.2 ml, RVEF: -1.4 ml, LVM: -2 g. Inspecting the causes for outlier differences revealed that juxtaposed basal slice failures caused unacceptable LVSV deviations between AI and expert. For the trainee assessment, LL showed that trainee parameters exceeded tolerance ranges. The segmentations could be improved to better mirror expert segmentations and close the parameter gaps.
Conclusion: Lazy Luna, as a semi-automated quality assurance tool, is applicable to several quality assurance application cases in CMR.
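The biases and tolerance checks described above can be illustrated with a short sketch. It is not Lazy Luna's code: it assumes that a bias is the mean of paired differences (here AI minus expert, a sign convention the abstract does not state), adds Bland-Altman style limits of agreement as a common companion statistic, and flags cases whose absolute difference exceeds a tolerance. The function names, the tolerance of 10 ml, and all measurement values are hypothetical.

```python
from statistics import mean, stdev


def bias_and_limits(method_a, method_b):
    """Mean paired difference (a - b) and 95% limits of agreement (bias +/- 1.96 SD)."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)


def flag_outliers(method_a, method_b, tolerance):
    """Indices of cases whose absolute paired difference exceeds the tolerance."""
    return [
        i for i, (a, b) in enumerate(zip(method_a, method_b))
        if abs(a - b) > tolerance
    ]


# Hypothetical LV stroke volumes in ml for five cases.
ai_lvsv = [82.0, 95.0, 70.0, 110.0, 64.0]
expert_lvsv = [80.0, 90.0, 71.0, 92.0, 63.0]

bias, (lower, upper) = bias_and_limits(ai_lvsv, expert_lvsv)
flagged = flag_outliers(ai_lvsv, expert_lvsv, tolerance=10.0)

print(f"bias = {bias:.1f} ml, limits of agreement = ({lower:.1f}, {upper:.1f}) ml")
print("cases to inspect:", flagged)
```

In a workflow like the one the abstract describes, the flagged indices would be the cases whose segmentations are opened side by side to find the qualitative cause of the difference.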
{"title":"A semi-automated quality assurance tool for cardiovascular magnetic resonance imaging: application to outlier detection, artificial intelligence evaluation and trainee feedback.","authors":"Thomas Hadler, Leonhard Grassow, Johanna Kuhnt, Richard Hickstein, Hadil Saad, Maximilian Fenski, Jan Gröschel, Ralf-Felix Trauzeddel, Edyta Blaszczyk, Clemens Ammann, Darian Viezzer, Anja Hennemuth, Steffen Lange, Jeanette Schulz-Menger","doi":"10.1186/s12911-025-03271-6","DOIUrl":"10.1186/s12911-025-03271-6","url":null,"abstract":"","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"25 1","pages":"437"},"PeriodicalIF":3.8,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12670863/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145653642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: Hemodialysis patients are at high risk for ICU admission due to elevated mortality, cardiovascular disease, and infection rates. Traditional ICU scoring systems (e.g., APACHE-II, SOFA) demonstrate limited accuracy in this population. This study aimed to identify key risk factors and develop interpretable machine learning (ML) models for predicting ICU outcomes to enable early intervention.
Methods: This multicenter study analyzed data from three cohorts: The First Affiliated Hospital of Sun Yat-sen University (SYSU; n = 248), MIMIC-IV (n = 769), and eICU-CRD (n = 1,878). The primary outcome was all-cause ICU mortality; secondary outcomes were cardiovascular and infection-related mortality. Thirteen ML algorithms and ensemble models were applied to 113 clinical variables collected within 24 h of ICU admission. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and benchmarked against existing ICU scoring systems. We employed SHapley Additive exPlanation (SHAP) analysis to enhance interpretability.
Results: Key predictors numbered 6 (cardiovascular mortality), 11 (infection-related mortality), and 25 (all-cause mortality). Ensemble machine learning models, trained on the SYSU cohort, were initially screened by performance (8-fold cross-validation AUC ≥ 0.80) and evaluated in the eICU selection cohort, with the top-performing models subsequently validated in the external MIMIC-IV cohort. In the external validation, NeuralNetC achieved the highest AUC of 0.847 (95% confidence interval [CI] 0.806-0.885) for all-cause mortality among the ensemble models, outperforming ICU scoring systems. ExtraTreesA performed best for infection-related mortality (AUC: 0.880; 95% CI 0.852-0.906), and NeuralNetD for cardiovascular mortality (AUC: 0.790; 95% CI 0.733-0.844). An online predictive platform was developed to facilitate clinical application.
Conclusion: ML models provided high predictive accuracy for ICU mortality in hemodialysis patients, facilitating early identification of high-risk individuals and supporting targeted interventions. The online platform promotes clinical translation for intensive care decision-making.
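The Results above report each AUC with a 95% confidence interval. The abstract does not say how those intervals were obtained; one common option is a percentile bootstrap, sketched below under that assumption. The function names, the synthetic labels and scores, and the settings (`n_boot=500`, fixed seeds) are all illustrative and are not taken from the study.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample patients with replacement, recompute the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lo = aucs[int((alpha / 2) * n_boot)]
    hi = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic cohort: roughly 30% deaths, scores shifted upward for non-survivors.
data_rng = random.Random(42)
labels = [1 if data_rng.random() < 0.3 else 0 for _ in range(120)]
scores = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

point = roc_auc(labels, scores)
lo, hi = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

Resampling whole patients keeps each label paired with its score, which is what makes the spread of the resampled AUCs a usable estimate of sampling uncertainty.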
{"title":"Development and validation of interpretable machine learning models to predict intensive care unit outcomes in patients on hemodialysis: a multicenter study.","authors":"Minjie Chen, Pengan Li, Yuanwen Xu, Zhenghui Li, Yan Xiong, Jianhua Wu, Chintan Pandya, Yunuo Wang, Guixin Huang","doi":"10.1186/s12911-025-03301-3","DOIUrl":"10.1186/s12911-025-03301-3","url":null,"abstract":"","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":" ","pages":"5"},"PeriodicalIF":3.8,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12784603/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145660419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}