Purpose: Immune checkpoint inhibitors (ICIs) have revolutionized cancer treatment, yet their use is associated with immune-related adverse events (irAEs). Estimating the prevalence and patient impact of these irAEs in the real-world data setting is critical for characterizing the benefit/risk profile of ICI therapies beyond the clinical trial population. Diagnosis codes, such as International Classification of Diseases codes, do not comprehensively illustrate a patient's care journey and offer no insight into drug-irAE causality. This study aims to capture the relationship between ICIs and irAEs more accurately by using augmented curation (AC), a natural language processing-based innovation, on unstructured data in electronic health records.
Methods: In a cohort of 9,290 patients treated with ICIs at Mayo Clinic from 2005 to 2021, we compared the prevalence of irAEs using diagnosis codes and AC models, which classify drug-irAE pairs in clinical notes with implied textual causality. Four illustrative irAEs with high patient impact (myocarditis, encephalitis, pneumonitis, and severe cutaneous adverse reactions, abbreviated as MEPS) were analyzed using corticosteroid administration and ICI discontinuation as proxies of severity.
Results: For MEPS, only 70% (n = 118) of patients found by AC were also identified by diagnosis codes. Using AC models, patients with MEPS received corticosteroids for their respective irAE 82% of the time and permanently discontinued the ICI because of the irAE 35.9% (n = 115) of the time.
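The overlap figure reported above is a share of AC-identified patients who were also found by diagnosis codes. A minimal sketch of that calculation follows, assuming each method yields a collection of patient identifiers. The function name `code_capture_rate` and the identifiers are hypothetical and are not study data.

```python
def code_capture_rate(ac_patients, code_patients):
    """Return (count, share) of AC-identified patients also found by diagnosis codes."""
    ac = set(ac_patients)
    if not ac:
        raise ValueError("no AC-identified patients")
    both = ac & set(code_patients)
    return len(both), len(both) / len(ac)


# Made-up identifiers purely for illustration; not study data.
n_both, share = code_capture_rate({"p1", "p2", "p3", "p4"}, {"p2", "p3", "p9"})
```

The denominator here is the AC-identified group, which matches the phrasing "patients found by AC were also identified by diagnosis codes".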
Conclusion: Overall, AC models enabled more accurate identification and assessment of patient impact of ICI-induced irAEs not found using diagnosis codes, demonstrating a novel and more efficient strategy to assess real-world clinical outcomes in patients treated with ICIs.
Purpose: Adverse effects of chemotherapy often require hospital admissions or treatment management. Identifying factors contributing to unplanned hospital utilization may improve health care quality and patients' well-being. This study aimed to assess whether patient-reported outcome measures (PROMs) improve the performance of machine learning (ML) models predicting hospital admissions, triage events (contacting a helpline or attending hospital), and changes to chemotherapy.
Materials and methods: Clinical trial data were used and contained responses to three PROMs (European Organisation for Research and Treatment of Cancer Core Quality of Life Questionnaire [QLQ-C30], EuroQol Five-Dimensional Visual Analogue Scale [EQ-5D], and Functional Assessment of Cancer Therapy-General [FACT-G]) and clinical information on 508 participants undergoing chemotherapy. Six feature sets (with the following variables: [1] all available; [2] clinical; [3] PROMs; [4] clinical and QLQ-C30; [5] clinical and EQ-5D; [6] clinical and FACT-G) were used as inputs to six ML models (logistic regression [LR], decision tree, adaptive boosting, random forest [RF], support vector machines [SVMs], and neural network) to predict admissions, triage events, and chemotherapy changes.
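The design described above crosses six feature sets with six models for each of three outcomes. The sketch below only enumerates that grid and does not train anything. The dictionary keys, the outcome labels, and the reading of "all available" as clinical variables plus the three PROMs are illustrative assumptions, not details confirmed by the study.

```python
from itertools import product

# Variable groups per feature set. Treating "all available" as clinical plus
# the three PROMs is an assumption made for this sketch.
FEATURE_SETS = {
    "all": ["clinical", "QLQ-C30", "EQ-5D", "FACT-G"],
    "clinical": ["clinical"],
    "PROMs": ["QLQ-C30", "EQ-5D", "FACT-G"],
    "clinical+QLQ-C30": ["clinical", "QLQ-C30"],
    "clinical+EQ-5D": ["clinical", "EQ-5D"],
    "clinical+FACT-G": ["clinical", "FACT-G"],
}
MODELS = ["LR", "decision tree", "adaptive boosting", "RF", "SVM", "neural network"]
OUTCOMES = ["admission", "triage event", "chemotherapy change"]


def experiment_grid():
    """List every (outcome, feature set, model) combination as a dict."""
    return [
        {"outcome": o, "feature_set": f, "groups": FEATURE_SETS[f], "model": m}
        for o, f, m in product(OUTCOMES, FEATURE_SETS, MODELS)
    ]


grid = experiment_grid()
```

Under these assumptions the grid has 3 x 6 x 6 = 108 combinations, before any variation in how class imbalance is handled.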
Results: A comprehensive analysis of the predictive performance of the six ML models for each feature set under three different methods for handling class imbalance indicated that PROMs improved predictions of all outcomes. RF and SVMs had the highest performance for predicting admissions and changes to chemotherapy in balanced data sets, and LR had the highest performance in the imbalanced data set. Balancing the data led to better performance than either the imbalanced data set or the data set in which only the training set was balanced.
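The three imbalance-handling scenarios compared above can be sketched as follows. The abstract does not state which balancing technique was used, so random oversampling of the minority class stands in here as an assumption, as does applying it after the train/test split. The names `oversample` and `prepare` and the strategy labels are hypothetical.

```python
import random


def oversample(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def prepare(train, test, strategy):
    """train and test are (rows, labels) pairs; strategy selects the scenario."""
    if strategy == "imbalanced":
        return train, test  # leave both partitions untouched
    if strategy == "balanced_train_only":
        return oversample(*train), test  # evaluate on the original class mix
    if strategy == "balanced":
        return oversample(*train), oversample(*test)
    raise ValueError(f"unknown strategy: {strategy}")
```

For example, `prepare(train, test, "balanced_train_only")` returns a class-balanced training pair while leaving the test pair exactly as supplied.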
Conclusion: These results support the view that ML can be applied to PROM data to predict hospital utilization and chemotherapy management. If further explored, this approach may contribute to health care planning and treatment personalization. The rigorous comparison of model performance under different methods for handling imbalanced data illustrates best practice in ML research.