A Novel Remote Sensing Image Change Detection Approach Based on Multilevel State Space Model

IF 8.6 | Region 1, Earth Sciences | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC | IEEE Transactions on Geoscience and Remote Sensing | Pub Date: 2024-11-18 | DOI: 10.1109/TGRS.2024.3501303
Zhongyu Zhang;Xuanmei Fan;Xin Wang;Yingxiang Qin;Junshi Xia
Volume 62, pages 1-14. Publication type: Journal Article. Open access: no. Full text: https://ieeexplore.ieee.org/document/10756674/
Citations: 0

Abstract

Remote sensing image change detection (CD) is crucial for disaster assessment, land use change monitoring, and urban management. Most CD methods are built on CNNs and Transformers. However, these methods struggle to model global dependencies while keeping computational complexity low. The recent emergence of Mamba architectures based on state space models (SSMs) can remedy this problem. In this article, we propose a visual Mamba-based multiscale feature extraction network, named MF-VMamba (MF: multiscale feature), that efficiently and interactively fuses global and local information. First, a VMamba-based encoder extracts multiscale semantic features from bitemporal images. Then, a feature enhancement module (FEM) captures the difference information between the images. In addition, a multilevel attention decoder (MAD) based on large kernel convolution (LKC) gathers information in the spatial and spectral dimensions to realize information interaction between global and local features. After sequential processing by these three modules, the ability to discriminate changed objects is significantly improved. Notably, the computational complexity of our VMamba-based model grows linearly, which can significantly reduce the computational cost. In the experiments, our method performs well on the CDD, DSIFN-CD, LEVIR-CD, and SYSU-CD datasets, with F1 scores of 95.69%, 88.05%, 90.64%, and 86.95% and overall accuracy (OA) of 98.97%, 96.01%, 99.07%, and 90.75%, respectively. The code can be accessed at https://github.com/121zzy/MF-Mamba.git.
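The abstract reports F1 and OA but does not spell out how they are computed. The sketch below assumes the standard binary definitions (F1 as the harmonic mean of precision and recall on the "changed" class, OA as the fraction of correctly labeled pixels). The function name `change_detection_scores` and the flat 0/1 pixel lists are hypothetical choices for illustration, not taken from the authors' repository.

```python
def change_detection_scores(pred, target):
    """Return (f1, oa) for two equal-length binary sequences (1 = changed).

    Assumption: standard binary definitions; this is an illustrative sketch,
    not the evaluation code used in the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # Confusion-matrix counts over all pixels.
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)

    # Guard against empty denominators (e.g. no predicted changes).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    oa = (tp + tn) / len(target) if len(target) else 0.0
    return f1, oa


# Toy example: 8 pixels from a flattened change map.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
target = [1, 0, 0, 0, 1, 1, 0, 0]
f1, oa = change_detection_scores(pred, target)
print(f"F1 = {f1:.4f}, OA = {oa:.4f}")
```

Because unchanged pixels usually dominate a change map, OA tends to sit well above F1, which matches the pattern in the reported numbers.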
Source journal
IEEE Transactions on Geoscience and Remote Sensing (Engineering & Technology: Geochemistry & Geophysics)
CiteScore: 11.50
Self-citation rate: 28.00%
Articles published: 1912
Review time: 4.0 months
About the journal: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.
Recent articles from this journal
Phase Error Suppression for Swaying Antennas in Base Station-Based Bridge Monitoring
Multiscale Spiking Graph Convolution Aggregation Network for Hyperspectral Image Classification
AD-GRT Deep Learning Waveform Inversion
CroBIM-V: Memory-Quality Controlled Remote Sensing Referring Video Object Segmentation
MapSAM2: Adapting SAM2 for Automatic Segmentation of Historical Map Images and Time Series