Objective: This study aimed to develop a novel multi-stage self-supervised learning model tailored for the accurate classification of optical coherence tomography (OCT) images in ophthalmology, reducing reliance on costly labeled datasets while maintaining high diagnostic accuracy.
Materials and methods: A private dataset of 2719 OCT images from 493 patients was employed, along with 3 public datasets comprising 84 484 images from 4686 patients, 3231 images from 45 patients, and 572 images. Extensive internal, external, and clinical validations were performed to assess model performance. Grad-CAM was employed for qualitative analysis, interpreting the model's decisions by highlighting the image regions most relevant to each prediction. Subsampling analyses evaluated the model's robustness under varying amounts of labeled data.
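A subsampling analysis of this kind can be sketched as follows. This is a minimal illustration, not the study's protocol: it assumes class-stratified sampling at the image level, and the function name `stratified_subsample`, the class labels, and the fractions are hypothetical.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, fraction, seed=0):
    """Return sorted indices of a class-stratified subset of the labeled data.

    Assumption: sampling is stratified per class and done per image.
    The study may have used a different scheme (e.g., per patient).
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for label in sorted(by_class):
        idxs = by_class[label]
        # Keep at least one image per class so no class disappears.
        k = max(1, round(len(idxs) * fraction))
        chosen.extend(rng.sample(idxs, k))
    return sorted(chosen)


# Hypothetical labeled set: 10 images of class "A", 20 of class "B".
labels = ["A"] * 10 + ["B"] * 20
for fraction in (0.1, 0.5, 1.0):
    subset = stratified_subsample(labels, fraction)
    print(fraction, len(subset))
```

Training the same model on each subset and comparing test metrics across fractions shows how performance degrades as labeled data become scarce.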
Results: The proposed model outperformed conventional supervised and self-supervised learning-based models, achieving state-of-the-art results across 3 public datasets. In clinical validation, the model exhibited up to 17.50% higher accuracy and 17.53% higher macro F1 score than a supervised learning-based model under limited training data.
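For reference, the two reported metrics can be computed as in the sketch below. It uses the standard definitions (accuracy as the fraction of correct predictions, macro F1 as the unweighted mean of per-class F1 scores); the class names and predictions are hypothetical and are not data from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed classes."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for six OCT images (illustrative only).
y_true = ["CNV", "CNV", "DME", "DRUSEN", "NORMAL", "NORMAL"]
y_pred = ["CNV", "DME", "DME", "DRUSEN", "NORMAL", "CNV"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

Because macro F1 weights every class equally, it penalizes poor performance on rare classes more than accuracy does, which makes it a useful complement on imbalanced datasets.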
Discussion: The model's robustness in OCT image classification underscores the potential of multi-stage self-supervised learning to address challenges associated with limited labeled data. The availability of the source code and pre-trained models promotes the use of this model in a variety of clinical settings, facilitating broader adoption.
Conclusion: This model offers a promising solution for advancing OCT image classification, achieving high accuracy while reducing the cost of extensive expert annotation and potentially streamlining clinical workflows, thereby supporting more efficient patient management.