Background: Tuberculosis (TB) remains a major global health challenge, with an estimated 10 million new cases and 1.4 million deaths annually. Identifying patients who would benefit from comprehensive pharmacist intervention services is critical for maximizing the benefit of those services and for allocating resources efficiently. We developed a machine learning model that uses clinical parameters available at hospital admission to predict pharmacist intervention group assignment.
Methods: We conducted a retrospective analysis of 467 TB patients from a tertiary care hospital. The prediction model was trained exclusively on clinical variables to predict pharmacist intervention group assignment (binary classification: intervention group = 1, control group = 0). To address the limited sample size, we applied data augmentation using multi-neighbor interpolation, expanding the dataset from 467 to 1,999 samples (a 328.1% increase). We built a feature engineering pipeline that generated 122 optimized features, and we used Optuna-based hyperparameter optimization (250 trials) with a multi-level ensemble architecture comprising 43 base models. Separately, we analyzed publicly available GEO datasets to provide biological interpretation and mechanistic insights; these transcriptomic data were not used as model features.
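The abstract does not specify how the multi-neighbor interpolation was implemented. The following is a minimal sketch of one plausible reading, assuming a SMOTE-like scheme in which each synthetic sample is placed between a randomly chosen real sample and a random convex blend of its nearest same-class neighbors. The function name `augment_by_interpolation` and the parameters `n_target`, `k`, and `seed` are hypothetical and do not come from the study.

```python
import numpy as np


def augment_by_interpolation(X, y, n_target, k=5, seed=0):
    """Grow (X, y) to n_target rows with interpolated same-class samples.

    Illustrative assumption only: the study's actual augmentation
    procedure is not described in the abstract.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_new = n_target - len(X)
    if n_new <= 0:
        return X.copy(), y.copy()

    new_X, new_y = [], []
    for _ in range(n_new):
        i = rng.integers(len(X))              # random real sample to start from
        same = np.flatnonzero(y == y[i])      # rows sharing its class label
        same = same[same != i]                # exclude the sample itself
        if same.size == 0:
            target = X[i]                     # lone class member: duplicate it
        else:
            dists = np.linalg.norm(X[same] - X[i], axis=1)
            nbrs = same[np.argsort(dists)[:k]]        # up to k nearest neighbors
            w = rng.dirichlet(np.ones(len(nbrs)))     # random convex weights
            target = w @ X[nbrs]                      # blend of several neighbors
        lam = rng.uniform()                   # position along the segment
        new_X.append(X[i] + lam * (target - X[i]))
        new_y.append(y[i])                    # synthetic row keeps the class label

    return np.vstack([X, np.array(new_X)]), np.concatenate([y, np.array(new_y)])
```

Calling `augment_by_interpolation(X, y, n_target=1999)` on a 467-row matrix returns 1,999 rows, which matches the reported 328.1% increase ((1,999 - 467) / 467). In this sketch the original rows are kept unchanged at the top of the output, and it is assumed that augmentation is applied only to the training portion of the data.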
Results: The final ensemble model achieved an accuracy of 92.25% (95% CI: 89.1-95.4%) and an AUC-ROC of 96.96% (95% CI: 94.8-99.1%) in predicting pharmacist intervention group assignment, demonstrating the substantial impact of the optimization strategies employed. Analysis of the GEO datasets identified 150 significantly differentially expressed genes (FDR < 0.05) and revealed enrichment in immune response and inflammation pathways, providing supportive biological context for the clinical prediction model.
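The abstract reports 95% confidence intervals for accuracy and AUC-ROC without stating how they were computed. One common approach is a percentile bootstrap over the held-out predictions, sketched below. This is an assumption, not the study's documented method, and the helper names `accuracy`, `auc_score`, and `bootstrap_ci` are hypothetical.

```python
import numpy as np


def accuracy(y_true, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    return float(np.mean((scores >= threshold) == (y_true == 1)))


def auc_score(y_true, scores):
    """AUC-ROC as the probability that a positive outranks a negative."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]        # all positive-negative pairs
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))   # ties count half


def bootstrap_ci(y_true, scores, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(y_true, scores)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)      # resample with replacement
        if np.unique(y_true[idx]).size < 2:
            continue                          # AUC needs both classes present
        stats.append(metric(y_true[idx], scores[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)
```

With test-set labels `y_test` and predicted probabilities `p_test` (placeholder names), `bootstrap_ci(y_test, p_test, auc_score)` returns a lower and upper bound for the AUC, and passing `accuracy` instead gives the interval for accuracy at a 0.5 threshold.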
Conclusions: Our study demonstrates that comprehensive machine learning optimization can achieve strong predictive performance for identifying patients who would benefit from pharmacist intervention. The clinical prediction model, trained exclusively on clinical variables, provides a robust framework for personalized TB treatment resource allocation. Supportive transcriptomic analyses provide biological context but are not used in model prediction. The model's accuracy (92.25%) and discriminative ability (AUC 96.96%) suggest potential for clinical implementation.
