Background and purpose: Robustness against input data perturbations is essential for deploying deep-learning models in clinical practice. Adversarial attacks involve subtle, voxel-level manipulations of scans to increase deep-learning models' prediction errors. Testing deep-learning model performance on examples of adversarial images provides a measure of robustness, and including adversarial images in the training set can improve the model's robustness. In this study, we examined adversarial training and input modifications to improve the robustness of deep-learning models in predicting hematoma expansion (HE) from admission head CTs of patients with acute intracerebral hemorrhage (ICH).
Materials and methods: We used a multicenter cohort of n=890 patients for cross-validation/training, and a cohort of n=684 consecutive ICH patients from two stroke centers for independent validation. Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) adversarial attacks were applied for training and testing. We developed four models to predict HE of ≥3 mL, ≥6 mL, ≥9 mL, and ≥12 mL, and evaluated them in the independent validation cohort using the Receiver Operating Characteristics (ROC) Area Under the Curve (AUC). We examined varying mixtures of adversarial and non-perturbed (clean) scans for training, as well as adding the output of the hyperparameter-free Otsu multi-threshold segmentation as an additional model input.
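For readers unfamiliar with the two attacks named above, the following is a minimal sketch of FGSM and PGD, not the study's implementation. It assumes a toy logistic classifier (weights `w`, bias `b`) in place of a deep-learning model so that the input gradient can be written in closed form; the function names, the perturbation budget `eps`, the step size `step`, and the iteration count `n_steps` are all illustrative assumptions. This PGD variant starts from the clean input (no random initialization) and omits clipping to a valid intensity range.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient(x, y, w, b):
    """Gradient of the binary cross-entropy loss w.r.t. the input x
    for a logistic model p = sigmoid(w . x + b)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, y, w, b, eps):
    """Single step: shift every input value by eps in the direction
    that increases the loss."""
    g = loss_gradient(x, y, w, b)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

def pgd(x, y, w, b, eps, step, n_steps):
    """Repeated small signed-gradient steps, each followed by projection
    back into the L-infinity ball of radius eps around the clean input."""
    x_adv = list(x)
    for _ in range(n_steps):
        g = loss_gradient(x_adv, y, w, b)
        x_adv = [xa + step * sign(gi) for xa, gi in zip(x_adv, g)]
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv
```

In both cases the perturbation of each input value is bounded by `eps`, which is what keeps the manipulation subtle; PGD differs from FGSM only in taking several smaller steps and re-projecting after each one.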
Results: When deep-learning models trained solely on clean scans were tested with PGD and FGSM adversarial images, the average HE prediction AUC dropped from 0.8 to 0.67 and 0.71, respectively. Overall, the best-performing strategy for improving model robustness was training with a 5-to-3 mix of clean and PGD adversarial scans combined with adding Otsu multi-threshold segmentation to the model input, which increased the average AUC to 0.77 against both PGD and FGSM adversarial attacks. Adversarial training with FGSM improved robustness against attacks of the same type but offered limited cross-attack robustness against PGD-type images.
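As a reminder of what the AUC values above measure, the sketch below computes the ROC AUC in its rank-based form: the probability that a randomly chosen positive case (here, a patient with HE) receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name and the example labels and scores are hypothetical and are not data from this study.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical example: 1 = hematoma expansion, 0 = no expansion.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)  # 3 of 4 pairs are ranked correctly
```

Evaluating the same model on clean and on adversarially perturbed versions of the same scans, and comparing the two AUC values, gives the kind of robustness comparison reported above.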
Conclusions: Adversarial training and inclusion of threshold-based segmentation as an additional input can improve the robustness of deep-learning models in predicting HE from admission head CTs in acute ICH.
Abbreviations: ATACH-2 = Antihypertensive Treatment of Acute Cerebral Hemorrhage; AUC = Area Under the Curve; CNN = Convolutional Neural Network; Dice = Dice coefficient; FGSM = Fast Gradient Sign Method; HD = Hausdorff distance; HE = Hematoma expansion; ICH = Intracerebral hemorrhage; PGD = Projected Gradient Descent; ROC = Receiver Operating Characteristics; VS = Volume similarity.