Enze Zhu;Zhan Chen;Dingkai Wang;Hanru Shi;Xiaoxuan Liu;Lei Wang
Title: UNetMamba: An Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images
Journal: IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5
Publication type: Journal Article
Publication date: 2024-11-25
DOI: 10.1109/LGRS.2024.3505193
URL: https://ieeexplore.ieee.org/document/10766630/
Citations: 0
Abstract
Semantic segmentation of high-resolution remote sensing images is vital for downstream applications such as land-cover mapping, urban planning, and disaster assessment. Existing Transformer-based methods struggle to balance accuracy against efficiency, whereas the recently proposed Mamba is known for its efficiency. To resolve this dilemma, we propose UNetMamba, a UNet-like semantic segmentation model based on Mamba. It incorporates a Mamba segmentation decoder (MSD) that efficiently decodes the complex information within high-resolution images, and a local supervision module (LSM) that is used only during training yet significantly enhances the perception of local content. Extensive experiments demonstrate that UNetMamba outperforms state-of-the-art (SOTA) methods, increasing mIoU by 0.87% on LoveDA and 0.39% on ISPRS Vaihingen, while achieving high efficiency through a lightweight design, a smaller memory footprint, and reduced computational cost. The source code is available at https://github.com/EnzeZhu2001/UNetMamba.
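The mIoU figures quoted in the abstract refer to mean intersection-over-union, the standard segmentation metric. The following is a minimal sketch of the generic definition, not the authors' evaluation code. The function name `mean_iou`, the flattened label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the LoveDA and ISPRS Vaihingen benchmark protocols may differ, for example in which classes are ignored or in how counts are accumulated across images.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union over classes that occur in pred or target.

    Hypothetical helper for illustration; labels are flattened per-pixel class ids.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where both maps agree on class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: skip classes absent from both maps
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example: two flattened 2x3 label maps with three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. In practice the intersection and union counts are usually accumulated over the whole test set before dividing, rather than averaged per image.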