Background: Formulating a clinically acceptable plan within the time constraints of the brachytherapy clinic is challenging for clinicians. Deep-learning-based dose prediction methods have shown promise for improving planning efficiency, but they have been developed primarily for external beam radiation therapy. These methods therefore need to be translated to brachytherapy.
Purpose: This study proposes a dose prediction model utilizing an attention-gating mechanism and a 3D UNET for cervical cancer high-dose-rate intracavitary brachytherapy treatment planning with tandem-and-ovoid/ring applicators.
Methods: A multi-institutional data set of 77 retrospective clinical brachytherapy plans was used in this study. The data were preprocessed and augmented, increasing the number of plans to 252. A 3D UNET architecture with attention gates was constructed and trained to map contour information to the dose distribution. The trained model was evaluated on a testing data set using several metrics, including dose statistics and dose-volume indices. A baseline UNET model was also trained for a fair comparison.
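As an illustration of the attention-gating idea, the following is a minimal sketch for a single voxel. The abstract does not specify the form of the gate, so this assumes the commonly used additive formulation, in which skip-connection features `x` are rescaled by a coefficient `alpha` computed from `x` and a gating signal `g` taken from a coarser decoder level. The function name, argument names, and weight shapes are hypothetical and are not taken from the study. In a real 3D network the weights would be 1×1×1 convolutions applied at every voxel, with `g` resampled to the grid of `x`.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def attention_gate(x, g, w_x, w_g, b, psi, b_psi):
    """Additive attention gate for one voxel (illustrative assumption).

    x     : skip-connection feature vector, length C_x
    g     : gating feature vector from the coarser level, length C_g
    w_x   : F rows of C_x weights applied to x
    w_g   : F rows of C_g weights applied to g
    b     : bias vector, length F
    psi   : weights collapsing the F hidden features to one score
    b_psi : scalar bias for the score
    Returns (alpha, gated_x) where gated_x = alpha * x.
    """
    hidden = []
    for f in range(len(b)):
        s = b[f]
        s += sum(w * xi for w, xi in zip(w_x[f], x))
        s += sum(w * gi for w, gi in zip(w_g[f], g))
        hidden.append(max(0.0, s))  # ReLU on the summed projections
    # A sigmoid keeps the attention coefficient between 0 and 1.
    alpha = sigmoid(sum(p * h for p, h in zip(psi, hidden)) + b_psi)
    return alpha, [alpha * xi for xi in x]
```

A coefficient near 1 passes the skip features through almost unchanged, while a coefficient near 0 suppresses them. This is how a gate of this kind can emphasize regions relevant to the dose, such as voxels near the applicator and target.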
Results: The attention-gated 3D UNET model predicted dose distributions close to the ground truth, with accuracy competitive with the baseline UNET. The average mean absolute errors were 0.46 ± 11.71 Gy (vs. 0.47 ± 9.16 Gy for the baseline UNET) in CTVHR, 0.55 ± 0.67 Gy (vs. 0.70 ± 1.54 Gy for the baseline UNET) in bladder, 0.42 ± 0.46 Gy (vs. 0.49 ± 1.34 Gy for the baseline UNET) in rectum, and 0.31 ± 0.65 Gy (vs. 0.20 ± 3.76 Gy for the baseline UNET) in sigmoid. The mean individual differences in ΔD2cc for bladder, rectum, and sigmoid were 0.38 ± 1.19 (p = 0.50), 0.43 ± 0.71 (p = 0.41), and -0.47 ± 0.79 (p = 0.30) Gy, respectively. Similarly, the mean individual differences in ΔD1cc for bladder, rectum, and sigmoid were 0.09 ± 1.21 (p = 0.36), 0.20 ± 0.95 (p = 0.24), and -0.21 ± 0.59 (p = 0.30) Gy, respectively. The mean individual differences in ΔD90, ΔV100%, ΔV150%, and ΔV200% of the CTVHR were -0.45 ± 2.42 (p = 0.26) Gy, 0.55 ± 9.42% (p = 0.78), 0.82 ± 4.21% (p = 0.81), and -0.80 ± 10.48% (p = 0.36), respectively. The model requires less than 5 s to predict a full 3D dose distribution for a new patient plan.
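The metrics reported above can be illustrated with a short sketch that works on a flat list of voxel doses inside one structure. It assumes uniform voxel volumes and the usual definitions: D2cc is the minimum dose to the hottest 2 cc, D90 is the minimum dose to the hottest 90% of the volume, and Vx% is the percentage of the volume receiving at least x% of the prescription dose. The function names and the simple voxel-counting approach (no interpolation of the dose-volume histogram) are assumptions for illustration and do not reflect the study's implementation.

```python
def mean_absolute_error(pred, truth):
    """Mean absolute voxel-wise dose difference (same units as the input)."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def dose_at_volume_cc(doses, voxel_volume_cc, volume_cc):
    """D_{x cc}: minimum dose within the hottest `volume_cc` of the structure."""
    hottest_first = sorted(doses, reverse=True)
    # Number of voxels making up the requested absolute volume (at least one).
    n = max(1, int(round(volume_cc / voxel_volume_cc)))
    n = min(n, len(hottest_first))
    return hottest_first[n - 1]


def dose_at_volume_percent(doses, percent):
    """D_{x%}: minimum dose within the hottest `percent` of the structure."""
    hottest_first = sorted(doses, reverse=True)
    n = max(1, int(round(percent * len(hottest_first) / 100.0)))
    n = min(n, len(hottest_first))
    return hottest_first[n - 1]


def volume_at_dose_percent(doses, prescription, percent):
    """V_{x%}: percentage of voxels receiving at least x% of the prescription."""
    threshold = prescription * percent / 100.0
    covered = sum(1 for d in doses if d >= threshold)
    return 100.0 * covered / len(doses)
```

A difference such as ΔD2cc would then be obtained by evaluating `dose_at_volume_cc` on the predicted and the ground-truth dose within the same organ contour and subtracting the two values for each plan.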
Conclusion: The attention-gated 3D UNET showed promising capability for predicting voxel-wise dose distributions relative to the baseline 3D UNET. This model could be deployed clinically to predict 3D dose distributions in near real time, supporting decision-making before planning, quality assurance, and future automated planning, and thereby making the current workflow more efficient.