A Novel Remote Sensing Image Change Detection Approach Based on Multilevel State Space Model

IF 8.6 · Zone 1 (Earth Sciences) · JCR Q1 (Engineering, Electrical & Electronic) · IEEE Transactions on Geoscience and Remote Sensing · Pub Date: 2024-11-18 · DOI: 10.1109/TGRS.2024.3501303

Zhongyu Zhang;Xuanmei Fan;Xin Wang;Yingxiang Qin;Junshi Xia

IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-14. https://ieeexplore.ieee.org/document/10756674/

Citations: 0

Abstract

Remote sensing image change detection (CD) is crucial for disaster assessment, land-use change monitoring, and urban management. Most CD methods are built on convolutional neural networks (CNNs) and Transformers. However, these methods struggle to model global dependencies while keeping computational complexity low. The recent emergence of Mamba architectures, which are based on state space models (SSMs), can remedy this problem. In this article, we propose a visual Mamba-based multiscale feature extraction network, named MF-VMamba (MF: multiscale feature), that efficiently and interactively fuses global and local information. First, a VMamba-based encoder extracts multiscale semantic features from bitemporal images. Then, a feature enhancement module (FEM) captures the difference information between the images. In addition, a multilevel attention decoder (MAD) based on large kernel convolution (LKC) gathers information in the spatial and spectral dimensions to realize the interaction between global and local features. After sequential processing by these three modules, the ability to discriminate changed objects is significantly improved. Notably, the computational complexity of our VMamba-based model grows linearly, which significantly reduces the computational cost. In the experiments, our method performs well on the CDD, DSIFN-CD, LEVIR-CD, and SYSU-CD datasets, with $F1$ scores of $95.69\%/88.05\%/90.64\%/86.95\%$ and overall accuracy (OA) of $98.97\%/96.01\%/99.07\%/90.75\%$, respectively. The code can be accessed at https://github.com/121zzy/MF-Mamba.git.
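The abstract reports results as F1 and OA without defining them. The sketch below shows how these two scores are conventionally computed for a binary change map. It is an illustration under assumptions, not the authors' evaluation code: the function names `confusion_counts` and `f1_and_oa`, the toy maps, and the choice of "changed" as the positive class are all hypothetical.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for two binary change maps of equal shape.

    Each map is a list of rows; 1 marks a changed pixel, 0 an unchanged one.
    Treating "changed" as the positive class is an assumption of this sketch.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif not p and t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def f1_and_oa(pred, truth):
    """Return (F1, overall accuracy) using the standard definitions."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    oa = (tp + tn) / total if total else 0.0
    return f1, oa


# Toy 3x4 maps (hypothetical data, not taken from any dataset in the paper).
pred = [[1, 1, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
truth = [[1, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]

f1, oa = f1_and_oa(pred, truth)
print(f"F1 = {f1:.4f}, OA = {oa:.4f}")
```

In this toy case OA comes out higher than F1 because correctly labeled unchanged pixels count toward OA but not toward F1. This is one plausible reading of why the reported OA values exceed the F1 values on every dataset.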




Source journal

IEEE Transactions on Geoscience and Remote Sensing (category: Engineering & Technology, Geochemistry & Geophysics)

CiteScore: 11.50

Self-citation rate: 28.00%

Articles published: 1912

Review time: 4.0 months

Journal description: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.