Background: Recurrent ischemic stroke (RIS) is a significant challenge in Malaysia, affecting approximately 33% of patients. However, studies using artificial intelligence (AI) to predict this event using real-world data remain very limited. This study aimed to develop and evaluate Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and RUSBoost models for predicting recurrent ischemic stroke using real-world data from the Malaysian National Neurology Registry.
Methods: We conducted a retrospective study of 7,697 patients registered in the National Neurology Registry in Malaysia (2009-2016). We developed and evaluated three machine learning models (SVM, KNN, and RUSBoost) to predict RIS. The Synthetic Minority Over-sampling Technique (SMOTE) was applied to the training data to address class imbalance. Stratified ten-fold cross-validation was used to assess the robustness and accuracy of the models, and performance was evaluated using accuracy, sensitivity, specificity, positive predictive value (PPV), and area under the receiver operating characteristic curve (AUROC).
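The evaluation protocol described in the Methods can be illustrated with a minimal sketch. It assumes a binary outcome and numeric features, uses only the Python standard library, and is not the authors' implementation: the function names (`stratified_folds`, `smote`, `metrics`) and the toy data are hypothetical. It shows the key point that oversampling is applied to the training folds only, so the held-out fold keeps the original class imbalance.

```python
import math
import random

def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds while preserving class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def smote(minority, n_new, k=5, seed=0):
    """SMOTE-style oversampling: each synthetic point lies on the segment
    between a minority sample and one of its k nearest minority neighbours.
    Assumes at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(x, nb)])
    return synthetic

def metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and PPV from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }

# Toy data (hypothetical): 100 samples, 2 features, 20% positive class.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(100)]
y = [1 if i < 20 else 0 for i in range(100)]

folds = stratified_folds(y, k=10)
test_idx = set(folds[0])
train_idx = [i for i in range(len(y)) if i not in test_idx]

train_min = [X[i] for i in train_idx if y[i] == 1]
train_maj = [X[i] for i in train_idx if y[i] == 0]
# Oversample the training portion only; the test fold is left untouched.
train_min_balanced = train_min + smote(train_min, len(train_maj) - len(train_min))
```

In a full run, the same split-then-oversample step would be repeated for each of the ten folds, a classifier would be fitted on the balanced training data, and `metrics` would be computed from the confusion counts on the untouched test fold.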
Results: Among the evaluated models, RUSBoost demonstrated the strongest and most clinically relevant performance on the validation (test) folds under stratified ten-fold cross-validation, achieving an AUROC of 0.943, a sensitivity of 86.5%, and a PPV of 40.2% on the original imbalanced dataset, a favourable balance between sensitivity and PPV. Applying SMOTE during training improved model discrimination for RUSBoost (training-fold AUROC = 0.986). SHAP analysis showed that age, race, glucose level, hypertension, hyperlipidemia, and duration of diabetes were the factors most strongly associated with an increased risk of RIS.
Conclusion: This study demonstrates that machine learning models applied to real-world clinical data are a promising tool for predicting the risk of ischemic stroke recurrence. RUSBoost emerged as the most reliable and generalisable model for clinical risk prediction and proved effective in improving prediction accuracy and identifying patients at highest risk, while SMOTE enhanced model learning during training. The findings highlight the importance of integrating AI technologies into clinical practice to support early treatment decisions and enhance preventive interventions, opening new pathways for better patient care and reducing the health burden of recurrent stroke.
