Swin-ResUNet: A Swin-Topology Module for Road Extraction from Remote Sensing Images

{"title":"Swin-ResUNet: A Swin-Topology Module for Road Extraction from Remote Sensing Images","authors":"","doi":"10.1109/DICTA56598.2022.10034582","DOIUrl":null,"url":null,"abstract":"Road extraction from remote sensing images plays a crucial role in navigation, traffic management, urban construction, and other fields. With the development of deep learning in the field of computer vision, road extraction from remote sensing images using deep learning models has become a hot research topic. The convolution-based U-shaped road extraction models have some issues such as high extraction error rate and poor continuity on road topology. The Transformer-based road extraction methods also have issues such as low extraction accuracy and large GPU memory usage. In order to solve the above issues, we propose a Swin-ResUNet structure and use the new paradigm Swin Transformer to extract roads in remote sensing images. Specifically, we construct a Swin-Topology module by adding a Sobel layer based on residual connections to the Swin Transformer block. Based on the Swin-Topology module, we propose a Swin-ResUNet network structure in order to better capture the topology of roads. Experimental results show that the values of mIOU and mDC obtained on the Massachusetts dataset were 64.1% and 76.6% respectively, and the corresponding values on the DeepGlobe2018 dataset were 66.69% and 75.86% respectively. When the batch size is 8, the GPU memory usage with Swin-ResUNet is about 9 GB, which is significantly smaller than other Transformer-based networks. Compared with convolution-based U-shaped structures, the Swin-ResUNet can better capture the topology of roads in remote sensing images and improve road extraction accuracy. Compared with other Transformer-based road extraction methods, the Swin-ResUNet can improve the accuracy of road extraction and reduce GPU memory usage.","PeriodicalId":159377,"journal":{"name":"2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DICTA56598.2022.10034582","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Road extraction from remote sensing images plays a crucial role in navigation, traffic management, urban construction, and other fields. With the development of deep learning in computer vision, road extraction from remote sensing images using deep learning models has become a hot research topic. Convolution-based U-shaped road extraction models suffer from high extraction error rates and poor continuity of the extracted road topology. Transformer-based road extraction methods, in turn, suffer from low extraction accuracy and high GPU memory usage. To address these issues, we propose a Swin-ResUNet structure that uses the Swin Transformer paradigm to extract roads from remote sensing images. Specifically, we construct a Swin-Topology module by adding a Sobel layer, attached through residual connections, to the Swin Transformer block. Building on the Swin-Topology module, we propose the Swin-ResUNet network structure to better capture the topology of roads. Experimental results show that Swin-ResUNet achieves an mIoU of 64.1% and an mDC of 76.6% on the Massachusetts dataset, and 66.69% and 75.86% respectively on the DeepGlobe2018 dataset. With a batch size of 8, the GPU memory usage of Swin-ResUNet is about 9 GB, significantly lower than that of other Transformer-based networks. Compared with convolution-based U-shaped structures, Swin-ResUNet better captures the topology of roads in remote sensing images and improves road extraction accuracy; compared with other Transformer-based road extraction methods, it improves extraction accuracy while reducing GPU memory usage.
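The abstract does not include implementation details, so the following is a minimal PyTorch sketch of how a Swin-Topology module could be wired: a fixed (non-learnable) Sobel layer computes per-channel edge responses that are re-injected into the Swin Transformer block's output through a residual connection. The fusion by simple addition, the normalization layer, and the exact placement of the Sobel branch are assumptions for illustration; `SwinTopologyBlock` and `SobelLayer` are hypothetical names, not the authors' published code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SobelLayer(nn.Module):
    """Fixed Sobel filtering applied per channel (depthwise conv)."""

    def __init__(self, channels: int):
        super().__init__()
        gx = torch.tensor([[-1., 0., 1.],
                           [-2., 0., 2.],
                           [-1., 0., 1.]])
        gy = gx.t()
        # One Gx and one Gy kernel per input channel: (2C, 1, 3, 3).
        kernel = torch.stack([gx, gy]).unsqueeze(1).repeat(channels, 1, 1, 1)
        self.register_buffer("kernel", kernel)
        self.channels = channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) -> per-channel gradient magnitude, same shape.
        g = F.conv2d(x, self.kernel, padding=1, groups=self.channels)
        gx, gy = g[:, 0::2], g[:, 1::2]
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)


class SwinTopologyBlock(nn.Module):
    """Swin Transformer block wrapped with a residual Sobel branch.

    `swin_block` is any module mapping (B, C, H, W) -> (B, C, H, W);
    a real implementation would use windowed self-attention.
    """

    def __init__(self, swin_block: nn.Module, channels: int):
        super().__init__()
        self.swin_block = swin_block
        self.sobel = SobelLayer(channels)
        self.norm = nn.BatchNorm2d(channels)  # assumed normalization

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.swin_block(x)
        # Residual connection carrying edge/topology information.
        return self.norm(y + self.sobel(x) + x)


if __name__ == "__main__":
    # Stand-in for a real Swin block, just to exercise the wiring.
    blk = SwinTopologyBlock(nn.Conv2d(32, 32, 3, padding=1), channels=32)
    out = blk(torch.randn(2, 32, 64, 64))
    print(out.shape)  # torch.Size([2, 32, 64, 64])
```

In a full Swin-ResUNet, blocks like this would take the place of plain Swin Transformer blocks in the U-shaped encoder, with the edge branch biasing features toward the thin, connected structures that road topology requires.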
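The paper evaluates with mIoU and mDC (mean Dice coefficient). As a reference, here is a minimal sketch of how these two segmentation metrics are commonly computed for a binary road mask; the averaging protocol behind the reported mean values (per image or per class) is not stated in the abstract, so per-mask computation is assumed. For a single mask pair, Dice and IoU are monotonically related: DC = 2·IoU / (1 + IoU).

```python
import numpy as np


def iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-6) -> float:
    """Intersection over union for binary masks."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float((inter + eps) / (union + eps))


def dice(pred: np.ndarray, target: np.ndarray, eps: float = 1e-6) -> float:
    """Dice coefficient for binary masks."""
    inter = np.logical_and(pred, target).sum()
    return float((2 * inter + eps) / (pred.sum() + target.sum() + eps))


if __name__ == "__main__":
    pred = np.random.rand(256, 256) > 0.5
    target = np.random.rand(256, 256) > 0.5
    print(f"IoU:  {iou(pred, target):.4f}")
    print(f"Dice: {dice(pred, target):.4f}")
```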