Optical coherence tomography (OCT) is an essential tool for diagnosing retinal diseases because it provides high-resolution, three-dimensional structural and functional imaging of the retina. Automatic segmentation and quantification of retinal biomarkers provide clinicians with reliable diagnostic references and improve the accuracy and efficiency of diagnosis. However, the diverse lesions, artifacts, and missing normal retinal structures in the OCT images of patients with macular edema severely degrade the accuracy of segmentation models. Moreover, most deep learning segmentation models require a considerable amount of annotated data, which increases the development cost of medical image segmentation models. To address these issues, we propose a structural prior-guided and feature-enhanced transformer with masked image modeling pretraining (SPFET-MIMP) to segment the retinal layers and fluid in macular edema OCT B-scans. The segmentation network employs a transformer architecture combining shifted-window multi-head self-attention and axial attention to enhance the extraction of contextual information and multiscale features. To account for the physiological order of the retinal layers and their positional relationships with fluid, a customized multi-class synergistic segmentation (MCSS) loss is incorporated into the loss function. This loss encodes prior knowledge of the relative positions and topological structure of the retina, which helps maintain the correct order and completeness of the retinal layers. We also utilize a self-supervised pretraining framework, SimMIM, to pretrain the segmentation model on a large-scale unlabeled OCT dataset and enhance its robustness to images with low contrast or shadow artifacts. Our method achieved average Dice coefficients of 94.35% and 90.19% on the AROI dataset and a private diabetic macular edema dataset, respectively, both outperforming other state-of-the-art methods.
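For readers unfamiliar with masked image modeling, the sketch below illustrates a SimMIM-style pretraining objective on grayscale OCT B-scans: random patches are masked out, the backbone encodes the partially masked image, and a lightweight head reconstructs the raw pixels of the masked patches under an L1 loss. This is a minimal, self-contained sketch under stated assumptions, not the authors' implementation: the encoder interface, patch size, masking ratio, and class names (`SimMIMPretrainer`, `OCT` encoder output shape) are illustrative assumptions.

```python
# Minimal SimMIM-style masked image modeling sketch (illustrative, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimMIMPretrainer(nn.Module):
    def __init__(self, encoder: nn.Module, encoder_dim: int, patch_size: int = 32):
        super().__init__()
        self.encoder = encoder                      # any ViT/Swin-style backbone (assumed)
        self.patch_size = patch_size
        # lightweight prediction head: map each patch token back to raw pixel values
        self.head = nn.Linear(encoder_dim, patch_size * patch_size)

    def forward(self, img: torch.Tensor, mask_ratio: float = 0.6) -> torch.Tensor:
        B, _, H, W = img.shape                      # grayscale B-scans: (B, 1, H, W)
        ph, pw = H // self.patch_size, W // self.patch_size
        n_patches = ph * pw

        # randomly select patches to mask
        mask = (torch.rand(B, n_patches, device=img.device) < mask_ratio).float()

        # zero out the masked patches in the input image
        mask_map = F.interpolate(mask.view(B, 1, ph, pw), size=(H, W), mode="nearest")
        masked_img = img * (1.0 - mask_map)

        # encode the masked image; assumed output shape: (B, n_patches, encoder_dim)
        tokens = self.encoder(masked_img)
        pred = self.head(tokens)                    # (B, n_patches, patch_size**2)

        # targets: original pixel values of each patch
        target = F.unfold(img, kernel_size=self.patch_size, stride=self.patch_size)
        target = target.transpose(1, 2)             # (B, n_patches, patch_size**2)

        # L1 reconstruction loss, averaged over masked patches only
        per_patch = (pred - target).abs().mean(dim=-1)
        return (per_patch * mask).sum() / mask.sum().clamp(min=1.0)
```

After pretraining with such an objective, the encoder weights would be transferred to the segmentation network and fine-tuned on the labeled macular edema data.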