This study presents and evaluates a globally applicable cloud and shadow masking model for Sentinel-2 top-of-atmosphere (TOA) reflectance using a state-of-the-art transformer-based U-Net model (Swin-Unet) trained with nearly 20,000 globally distributed 512 × 512 20 m pixel patches to classify each pixel as cloud, cloud shadow, or clear. The training data were compiled from publicly available annotation data, which were refined to correct obvious annotation errors and supplemented with additional annotations to improve the representation of underrepresented cloud and surface conditions. The trained Swin-Unet model was validated using the KappaSet and CloudSEN12+ testing datasets and compared with Fmask, the Sen2Cor scene classification layer (SCL), and the deep learning model CloudS2Mask. The Swin-Unet achieved the highest overall accuracy (91.32%) and the highest F1-scores for the cloud (0.909) and clear (0.935) classes, despite a lower cloud shadow F1-score (0.691) than CloudS2Mask (0.743). The four models were also applied to 11,458 Sentinel-2 images acquired over a calendar year for 78 globally distributed 109 × 109 km tiles, and the temporal smoothness index (TSI) of the time series of 'clear' surface reflectance was derived. The Swin-Unet model yielded the smallest TSI value (most temporally consistent reflectance) for each selected Sentinel-2 band. Visual assessment confirmed the superior performance of the Swin-Unet model. The results highlight the potential of the Swin-Unet model for Sentinel-2 cloud and cloud shadow detection for images acquired anywhere and anytime. The training data and model are publicly available so that users can apply them efficiently.
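The intuition behind the TSI comparison can be sketched as follows. The abstract does not give the TSI formula, so this is a hypothetical illustration only, assuming a TSI defined as the mean absolute difference between temporally adjacent reflectance values retained as 'clear': a mask that misses clouds leaves reflectance spikes in the series and inflates the index, while a better mask yields a smoother series and a smaller TSI.

```python
import numpy as np

def temporal_smoothness_index(reflectance, clear_mask):
    """Illustrative (assumed) TSI: mean absolute difference between
    temporally adjacent observations flagged as clear.
    Lower values indicate a more temporally consistent series."""
    clear = reflectance[clear_mask]  # keep clear observations, in time order
    if clear.size < 2:
        return np.nan
    return np.mean(np.abs(np.diff(clear)))

# Toy time series for one pixel: 0.45 is an undetected bright cloud.
refl = np.array([0.10, 0.11, 0.45, 0.10, 0.12])
all_clear = np.ones(5, dtype=bool)                        # cloud missed
good_mask = np.array([True, True, False, True, True])     # cloud masked

# Masking the contaminated observation lowers the TSI.
print(temporal_smoothness_index(refl, all_clear))   # 0.18
print(temporal_smoothness_index(refl, good_mask))   # ~0.0133
```

Under this assumed definition, comparing per-band TSI values across masking models rewards the model whose 'clear' retrievals are least contaminated by residual cloud and shadow.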