Aim: This study aimed to evaluate the performance of machine learning (ML) models in predicting patient no-shows for telemedicine appointments within the Peruvian health system and to identify key predictors of nonattendance.
Methods: We performed a retrospective observational study using anonymized data (June 2019 to November 2023) from "Teleatiendo." The dataset included over 1.5 million completed appointments and about 64,000 no-shows (4.1%), focusing on teleorientation and telemonitoring. Predictor variables included patient demographics, socioeconomic factors, health care facility characteristics, appointment timing, and telemedicine service types. A 70% training, 10% validation, and 20% testing split was used over 10 iterations, with hyperparameter tuning performed on the validation set to identify optimal model parameters. Multiple ML approaches (random forest, XGBoost, LightGBM, and anomaly detection) were implemented in combination with undersampling and cost-sensitive learning to address class imbalance. Performance was evaluated using precision, recall, specificity, area under the curve (AUC), F1-score, and accuracy.
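The two imbalance strategies named in the Methods can be illustrated with a minimal sketch. It is not the study's pipeline: the helper names `undersample` and `positive_class_weight`, the toy label vector, and the 1:1 undersampling ratio are all assumptions chosen for illustration. The weight shown is the common negatives-to-positives ratio, which is one conventional way to set a positive-class weight in cost-sensitive boosting.

```python
import random

def undersample(labels, seed=0):
    """Return sorted row indices keeping every no-show (1) and an
    equally sized random sample of attended appointments (0).
    Hypothetical helper; the 1:1 ratio is an assumption."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    kept_neg = rng.sample(neg, k=min(len(pos), len(neg)))
    return sorted(pos + kept_neg)

def positive_class_weight(labels):
    """Cost-sensitive alternative: keep all rows but weight each
    no-show by the ratio of negatives to positives."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return n_neg / n_pos

# Toy labels mirroring the reported 4.1% no-show rate (illustrative only).
labels = [1] * 41 + [0] * 959

balanced_idx = undersample(labels)
weight = positive_class_weight(labels)

print(len(balanced_idx), round(weight, 1))
```

Undersampling discards most of the majority class before training, whereas the cost-sensitive variant trains on all rows and penalizes missed no-shows more heavily.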
Results: Of the models tested, undersampling with XGBoost achieved a precision of 0.115 (±0.001), recall of 0.654 (±0.005), specificity of 0.786 (±0.002), AUC of 0.720 (±0.002), and accuracy of 0.780 (±0.002). In contrast, cost-sensitive XGBoost exhibited balanced performance, with a precision of 0.123 (±0.001), recall of 0.639 (±0.006), specificity of 0.805 (±0.004), AUC of 0.722 (±0.001), and accuracy of 0.799 (±0.003). Additionally, cost-sensitive random forest achieved the highest specificity (0.843 ± 0.002) and accuracy (0.832 ± 0.001) but recorded a lower recall (0.585 ± 0.004), while cost-sensitive LightGBM and balanced random forest yielded performance metrics similar to those of cost-sensitive XGBoost. Isolation forest, used for anomaly detection, demonstrated the lowest performance.
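The low precision alongside moderate recall and specificity follows from the low no-show prevalence. The sketch below works through that arithmetic with the standard identities for precision and accuracy in terms of prevalence, recall, and specificity. It assumes the test-set prevalence equals the overall 4.1% rate, which the abstract does not state explicitly; the function name `precision_and_accuracy` is hypothetical.

```python
def precision_and_accuracy(prevalence, recall, specificity):
    """Derive precision and accuracy from prevalence, recall, and
    specificity, expressed as fractions of all appointments."""
    tp = prevalence * recall                    # no-shows correctly flagged
    fp = (1 - prevalence) * (1 - specificity)   # attended but flagged
    tn = (1 - prevalence) * specificity         # attended and not flagged
    precision = tp / (tp + fp)
    accuracy = tp + tn
    return precision, accuracy

# Assumption: test-set prevalence matches the overall 4.1% no-show rate.
PREVALENCE = 0.041

# Recall and specificity reported for undersampling with XGBoost.
precision, accuracy = precision_and_accuracy(PREVALENCE, 0.654, 0.786)

print(round(precision, 3), round(accuracy, 3))
```

Under that assumption, the derived values agree with the reported precision (0.115) and accuracy (0.780) to within rounding: because attended appointments outnumber no-shows by roughly 23 to 1, even a false-positive rate near 21% produces far more false alarms than true detections.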
Conclusions: ML models can moderately predict telemedicine no-shows in Peru, with cost-sensitive boosting techniques enhancing the identification of high-risk patients. Key predictors reflect both individual behavior and system-level contexts, suggesting the need for tailored, context-specific interventions. These findings can inform targeted strategies to optimize telemedicine, improve appointment adherence, and promote equitable health care access.
