Enze Zhu;Zhan Chen;Dingkai Wang;Hanru Shi;Xiaoxuan Liu;Lei Wang
IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5. Published 2024-11-25. DOI: 10.1109/LGRS.2024.3505193. https://ieeexplore.ieee.org/document/10766630/
UNetMamba: An Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images
Semantic segmentation of high-resolution remote sensing images is vital in downstream applications such as land-cover mapping, urban planning, and disaster assessment. Existing Transformer-based methods face a trade-off between accuracy and efficiency, while the recently proposed Mamba is known for its efficiency. To resolve this dilemma, we propose UNetMamba, a UNet-like semantic segmentation model based on Mamba. It incorporates a Mamba segmentation decoder (MSD) that efficiently decodes the complex information within high-resolution images, and a local supervision module (LSM), which is used only during training yet significantly enhances the perception of local content. Extensive experiments demonstrate that UNetMamba outperforms state-of-the-art (SOTA) methods, increasing mIoU by 0.87% on LoveDA and 0.39% on ISPRS Vaihingen, while achieving high efficiency through its lightweight design, smaller memory footprint, and reduced computational cost. The source code is available at https://github.com/EnzeZhu2001/UNetMamba.
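For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how mean intersection-over-union is commonly computed from a confusion matrix. It is not taken from the UNetMamba repository: the function names `confusion_matrix` and `mean_iou` and the toy labels are assumptions made for illustration, and the authors' evaluation may differ (for example, in which classes are ignored).

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the ground-truth class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp    # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical flattened label maps for six pixels and three classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(pred, target, num_classes=3)
miou = mean_iou(cm)
print(f"mIoU = {miou:.3f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. A reported gain such as 0.87% on LoveDA refers to a difference in this averaged quantity, expressed in percentage points.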