{"title":"MMCANet A Multimodal and Cross-Attention Network for Cloud Removal and Exploration of Progressive Remote Sensing Images Restoration Algorithm","authors":"Yejian Zhou;Jiahui Suo;Yachen Wang;Jie Su;Wen Xiao;Zhen Hong;Rajiv Ranjan;Lizhe Wang;Zhenyu Wen","doi":"10.1109/TGRS.2025.3556560","DOIUrl":null,"url":null,"abstract":"In Earth observation, cloud severely affects the interpretation of optical satellites generated high-resolution images. Cloud-free optical images are vital for downstream tasks such as semantic segmentation and object detection. Thus, the elimination of clouds from optical imagery has emerged as a significant topic in remote sensing. Currently, most existing methods are proposed to leverage the texture information from auxiliary synthetic aperture radar (SAR) images to restore cloud-free images via direct channel merging. However, such a unified feature extraction approach often neglects the inherent distribution disparity between SAR and optical images—the result of differing imaging principles-potentially leading to significant feature loss. To this end, we introduce a network by jointing SAR and optical images multimodal and cross-attention network (MMCANet) to effectively extract multiscale contextual features from SAR imagery and integrate them with optical features. Specifically, instead of simple concatenation of the channels of SAR and optical images, we obtain high-dimensional features from them through independent feature extractors. The integration of these features is facilitated by a cross-attention mechanism that provides a more fine-grained amalgamation of information. Meanwhile, an atrous spatial pyramid pooling (ASPP) module is introduced into the integration of high-level features, which captures multiscale contextual information around clouded areas. 
In addition, we propose four advanced remote sensing image restoration algorithms that approach image restoration as a series of subtasks, gradually eliminating clouds to enhance performance. Comprehensive assessments show that MMCANet performs well on the SEN 12 MS-CR dataset with peak signal-to-noise ratio (PSNR) of 39.8871, structural similarity index (SSIM) of 0.9672, mean absolute error (MAE) of 0.0081, and spectral angle mapper (SAM) of 2.9884.","PeriodicalId":13213,"journal":{"name":"IEEE Transactions on Geoscience and Remote Sensing","volume":"63 ","pages":"1-13"},"PeriodicalIF":8.6000,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Geoscience and Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10946262/","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0
Abstract
In Earth observation, clouds severely hinder the interpretation of the high-resolution images generated by optical satellites. Cloud-free optical images are vital for downstream tasks such as semantic segmentation and object detection, so removing clouds from optical imagery has emerged as a significant topic in remote sensing. Most existing methods leverage texture information from auxiliary synthetic aperture radar (SAR) images to restore cloud-free images via direct channel merging. However, such a unified feature extraction approach often neglects the inherent distribution disparity between SAR and optical images (a consequence of their differing imaging principles), potentially leading to significant feature loss. To this end, we introduce a multimodal and cross-attention network (MMCANet) that jointly exploits SAR and optical images to effectively extract multiscale contextual features from SAR imagery and integrate them with optical features. Specifically, instead of simply concatenating the channels of SAR and optical images, we obtain high-dimensional features from each modality through independent feature extractors. These features are then integrated by a cross-attention mechanism that provides a more fine-grained amalgamation of information. Meanwhile, an atrous spatial pyramid pooling (ASPP) module is introduced into the integration of high-level features to capture multiscale contextual information around clouded areas. In addition, we propose four advanced remote sensing image restoration algorithms that treat image restoration as a series of subtasks, gradually eliminating clouds to enhance performance. Comprehensive assessments show that MMCANet performs well on the SEN12MS-CR dataset, with a peak signal-to-noise ratio (PSNR) of 39.8871, structural similarity index (SSIM) of 0.9672, mean absolute error (MAE) of 0.0081, and spectral angle mapper (SAM) of 2.9884.
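The abstract does not give MMCANet's internals, but the core fusion idea it describes (optical features attending over SAR features rather than channel concatenation) can be sketched with scaled dot-product cross-attention. This is a minimal, illustrative NumPy version: function names, token counts, and feature dimensions are assumptions, not the paper's implementation, and the real network would operate on learned convolutional feature maps with projection weights.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(opt_feat, sar_feat):
    """Illustrative cross-modal fusion: optical tokens act as queries,
    SAR tokens supply keys and values (shapes: (n_tokens, dim))."""
    d_k = opt_feat.shape[-1]
    scores = opt_feat @ sar_feat.T / np.sqrt(d_k)  # (n_opt, n_sar) affinities
    weights = softmax(scores, axis=-1)             # attend over SAR tokens
    return weights @ sar_feat                      # SAR info aggregated per optical token

# toy example: 16 flattened optical tokens query 16 SAR tokens, dim 8
rng = np.random.default_rng(0)
opt = rng.standard_normal((16, 8))
sar = rng.standard_normal((16, 8))
fused = cross_attention(opt, sar)  # same shape as the optical features
```

In a full model the fused output would typically be added back to (or concatenated with) the optical branch before decoding; the point of the sketch is only that each optical location selects relevant SAR context via attention weights instead of a fixed channel merge.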
Journal overview:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.