Large-scale rice mapping under spatiotemporal heterogeneity using multi-temporal SAR images and explainable deep learning

ISPRS Journal of Photogrammetry and Remote Sensing · Pub Date: 2025-02-01 · DOI: 10.1016/j.isprsjprs.2024.12.021 · IF 10.6 · Region 1 (Earth Sciences) · JCR Q1 (Geography, Physical)
Ji Ge, Hong Zhang, Lijun Zuo, Lu Xu, Jingling Jiang, Mingyang Song, Yinhaibin Ding, Yazhe Xie, Fan Wu, Chao Wang, Wenjiang Huang
Volume 220, Pages 395-412 · Publication type: Journal Article · Open access: no · Article page: https://www.sciencedirect.com/science/article/pii/S092427162400491X
Citations: 0

Abstract

Large-scale rice mapping under spatiotemporal heterogeneity using multi-temporal SAR images and explainable deep learning
Timely and accurate mapping of rice cultivation is crucial for ensuring global food security and achieving SDG2. From a global perspective, rice areas display high heterogeneity in spatial pattern and SAR time-series characteristics, posing substantial challenges to the performance, efficiency, and transferability of deep learning (DL) models. Moreover, due to their “black box” nature, DL models often lack interpretability and credibility. To address these challenges, this paper constructs the first SAR rice dataset with spatiotemporal heterogeneity and proposes an explainable, lightweight model for rice area extraction, the eXplainable Mamba UNet (XM-UNet). The dataset is based on 2023 multi-temporal Sentinel-1 data, covering diverse rice samples from the United States, Kenya, and Vietnam. A Temporal Feature Importance Explainer (TFI-Explainer) based on the Selective State Space Model is designed to enhance both adaptability to the temporal heterogeneity of rice and the model’s interpretability. This explainer, coupled with the DL model, provides interpretations of the importance of SAR temporal features and facilitates the screening of key time phases. To overcome the spatial heterogeneity of rice, an Attention Sandglass Layer (ASL) combining CNN and self-attention mechanisms is designed to enhance local spatial feature extraction. Additionally, the Parallel Visual State Space Layer (PVSSL) uses 2D-Selective-Scan (SS2D) cross-scanning to capture the global spatial features of rice in multiple directions, significantly reducing computational complexity through parallelization. Experimental results demonstrate that XM-UNet adapts well to the spatiotemporal heterogeneity of rice globally, with an overall accuracy (OA) of 94.26% and an F1-score of 90.73%. The model is extremely lightweight, with only 0.190 M parameters and 0.279 GFLOPs. Mamba’s selective scanning facilitates feature screening, and its integration with CNN effectively balances the local and global spatial characteristics of rice. The interpretability experiments show that the explanations of temporal feature importance provided by the model are crucial for guiding rice distribution mapping and fill a gap in the related field. The code is available at https://github.com/SAR-RICE/XM-UNet.
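
The cross-scanning idea that the abstract attributes to SS2D can be illustrated with a small NumPy sketch. It follows the four-direction scan generally associated with SS2D (row-major, column-major, and their reverses) and is not taken from the XM-UNet repository. The function names `cross_scan` and `cross_merge` are hypothetical, and the selective state space step that would process each sequence between the two calls is left out here.

```python
import numpy as np


def cross_scan(x):
    """Unfold a (C, H, W) feature map into four 1-D scan orders.

    Returns an array of shape (4, C, H*W): row-major, column-major,
    and the reverse of each. Illustrative only (assumed layout).
    """
    c, h, w = x.shape
    row_major = x.reshape(c, h * w)                     # left-to-right, then down
    col_major = x.transpose(0, 2, 1).reshape(c, h * w)  # top-to-bottom, then right
    return np.stack(
        [row_major, col_major, row_major[:, ::-1], col_major[:, ::-1]]
    )


def cross_merge(ys, h, w):
    """Undo the four scan orders and sum them back into a (C, H, W) map."""
    c = ys.shape[1]
    row = ys[0] + ys[2][:, ::-1]                        # un-reverse, then add
    col = ys[1] + ys[3][:, ::-1]
    # Column-major order back to row-major order.
    col = col.reshape(c, w, h).transpose(0, 2, 1).reshape(c, h * w)
    return (row + col).reshape(c, h, w)


# Toy feature map: 2 channels, 3 rows, 4 columns.
x = np.arange(2 * 3 * 4).reshape(2, 3, 4)
seqs = cross_scan(x)
# In a real layer each of the four sequences would pass through a
# selective state space model here; this sketch uses the identity.
merged = cross_merge(seqs, 3, 4)
```

Because the four sequences are independent of one another, they can be processed in parallel, which is the property the abstract credits for the reduced computational cost. With the identity in place of the state space step, `merged` is simply four times the input, which makes the round trip easy to check.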
Source journal
ISPRS Journal of Photogrammetry and Remote Sensing (Engineering and Technology: Imaging Science and Photographic Technology)
CiteScore: 21.00
Self-citation rate: 6.30%
Articles published: 273
Review time: 40 days
Journal description: The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) is the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It serves as a platform for scientists and professionals worldwide who work in disciplines that use photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advances in these disciplines, while also acting as a comprehensive source of reference and archive. P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. The journal also welcomes papers based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields. In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. Theoretical papers should preferably include practical applications, while papers focusing on systems and applications should include a theoretical background.