
Latest Publications in the ISPRS Journal of Photogrammetry and Remote Sensing

Mamba-CNN hybrid Multi-scale ship detection Network driven by a Dual-perception feature of Doppler and Scattering
IF 12.2 | CAS Tier 1, Earth Science | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-09 | DOI: 10.1016/j.isprsjprs.2026.01.004
Gui Gao , Caiyi Li , Xi Zhang , Bingxiu Yao , Zhen Chen
Ship detection is crucial for both military and civilian applications and is a key use of polarimetric SAR (PolSAR). While convolutional neural networks (CNNs) enhance PolSAR ship detection with powerful feature extraction, existing approaches still face challenges in discriminating targets from clutter, detecting multi-scale objects in complex scenes, and achieving real-time detection. To address these issues, we propose a Mamba-CNN hybrid Multi-scale ship detection Network driven by a Dual-perception feature of Doppler and Scattering. First, at the input feature level, a Dual-perception feature of Doppler and Scattering (DDS) is introduced, effectively differentiating ship and clutter pixels to enhance the network’s ship discrimination. Specifically, Doppler characteristics distinguish between moving and stationary targets, while scattering characteristics reveal fundamental differences between targets and clutter. Second, at the network architecture level, a Mamba-CNN hybrid Multi-scale ship detection Network (MCMN) is designed to improve multi-scale ship detection in complex scenarios. It uses a Multi-scale Information Perception Module (MIPM) to adaptively aggregate multi-scale features and a Local-Global Feature Enhancement Module (LGFEM) based on Mamba for long-range context modeling. MCMN remains efficient through feature grouping, pointwise and depthwise convolutions, meeting real-time requirements. Finally, extensive experiments on the GF-3 and SSDD datasets demonstrate the superiority of DDS and MCMN. DDS effectively distinguishes ships from clutter across scenarios. As an input feature, it boosts average F1-score and AP by 4.3% and 4.3%, respectively, over HV intensity, and outperforms other polarization features. MCMN achieves state-of-the-art results, improving AP by 1.2% and 0.8% on the two datasets while reducing parameters by 1.29M, FLOPs by 1.5G, and inference time by 59.2%.
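The abstract credits feature grouping with pointwise and depthwise convolutions for MCMN's real-time efficiency. Below is a minimal PyTorch sketch of a depthwise-separable convolution block to make that trade-off concrete; the module name and tensor shapes are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 conv filters each channel spatially on its own; a 1x1
    pointwise conv then mixes channels. This sharply cuts parameters and
    FLOPs relative to a dense 3x3 convolution."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 64, 128, 128)                  # hypothetical SAR feature map
print(DepthwiseSeparableConv(64, 128)(x).shape)   # torch.Size([1, 128, 128, 128])
```

For a 3x3 kernel, the parameter count drops from in_ch*out_ch*9 to in_ch*9 + in_ch*out_ch, the standard saving behind parameter and FLOP reductions of the kind the abstract reports.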
Citations: 0
Mapping melliferous tree species in Kenya via one-class classification with hyperspectral unsupervised domain adaptation
IF 12.2 | CAS Tier 1, Earth Science | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-08 | DOI: 10.1016/j.isprsjprs.2025.12.028
Zhaozhi Luo , Janne Heiskanen , Ilja Vuorinne , Ian Ocholla , Shiqi Zhang , Saana Järvinen , Xinyu Wang , Yanfei Zhong , Petri Pellikka
The beekeeping sector holds significant potential for livelihood diversification among the agropastoral communities in Kenya. Melliferous tree species play a critical role by providing essential nectar sources for bees. However, limited knowledge of their precise spatial distributions constrains the full development of beekeeping. One-class classification (OCC) offers a practical solution for detecting single target species without requiring extensive labeled data from other classes. Although existing OCC methods perform well in trained domains, the generalization capability to unseen domains remains limited due to domain shift. To address these challenges, this study proposes a hyperspectral unsupervised domain adaptation OCC framework (HyUDA-One) for tree species mapping using airborne hyperspectral imagery and laser scanning data. The spatial–spectral regularized pseudo-positive learning was designed to mitigate domain shift and improve model generalizability. The effectiveness of HyUDA-One was demonstrated by mapping three key melliferous tree species in two savanna landscapes in southern Kenya. The results show that HyUDA-One significantly improves performance in unlabeled domains. The F1-scores of 0.788, 0.845, and 0.768 were achieved for Senegalia mellifera, Vachellia tortilis, and Commiphora africana in the trained domain, respectively. In the untrained domain, the F1-scores of Senegalia mellifera and Vachellia tortilis were 0.756 and 0.884, respectively. The distribution maps revealed the spatial patterns of these melliferous tree species and the nectar source availability, offering an important reference for sustainable beekeeping development in savanna landscapes. Furthermore, the proposed framework can potentially be extended to other mapping applications, such as invasive species detection.
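To make the one-class setting concrete, the sketch below fits a generic one-class SVM to labeled target-species spectra and scores unlabeled scene pixels. This is only an OCC baseline under synthetic data; HyUDA-One's deep architecture and its spatial-spectral regularized pseudo-positive learning are not reproduced here.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
target = rng.normal(0.40, 0.05, size=(200, 50))   # pixels of the target species (made up)
scene = rng.normal(0.50, 0.15, size=(1000, 50))   # unlabeled scene pixels (made up)

# nu bounds the fraction of training pixels treated as outliers; 0.1 is a guess.
occ = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(target)
pred = occ.predict(scene)       # +1 = target species, -1 = everything else
print(f"flagged as target: {(pred == 1).mean():.1%}")
```

The appeal mirrors the abstract: only positive (target-species) samples are needed at training time, with no labels required for the many background classes.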
Citations: 0
A spectral index using generic global endmembers from Landsat multispectral data for mapping urban areas
IF 12.2 | CAS Tier 1, Earth Science | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-08 | DOI: 10.1016/j.isprsjprs.2025.12.025
Ruiyi Zhao , Cai Cai , Xinfan Cai , Peijun Li
Quantifying urban land cover from space is crucial for studying and understanding its spatial distribution and changes, as well as for assessing the impact of these changes on environmental and socio-economic dynamics worldwide. Owing to the diversity and spectral heterogeneity of urban reflectance, spectrally mixed pixels predominate in moderate resolution multispectral images. The standardized linear spectral mixture model from Landsat multispectral images, which represents the radiance measurements of mixed pixels as linear mixtures of generic global Substrate (S), Vegetation (V), and Dark Surfaces (D) endmember radiances, offers an effective method for characterizing urban reflectance. Based on the analysis of SVD endmember fractions of urban land and other land cover types, this study proposes a spectral index using Landsat global SVD endmembers, termed the Urban Index using Global Endmembers (GEUI), to highlight and map urban land. GEUI is evaluated through comparisons with five established spectral indices: the Normalized Difference Built-up Index (NDBI), Index-based Built-up Index (IBI), Biophysical Composition Index (BCI), Built-up Land Features Extraction Index (BLFEI), and Urban Composition Index (UCI), all of which rely on pure spectral signatures of urban pixels. Additionally, GEUI is compared to two deep learning methods in urban area mapping, i.e., a two-dimensional convolutional neural network (2D CNN) and a one-dimensional CNN (1D CNN). The results demonstrate that the proposed GEUI outperforms these comparative indices in qualitative evaluation, separability analysis, and urban land mapping, and also shows superior performance in urban land mapping compared to the CNN methods. GEUI achieved overall accuracies ranging from 84.36% to 93.02% and F-scores between 84.80% and 92.64%, obtaining the highest accuracy in half of the study urban areas. Since the S, V, and D endmembers used in GEUI are globally available, the proposed GEUI has the advantage of being applicable across diverse locations and times. Furthermore, GEUI can be readily extended to other broadband multispectral data, such as Sentinel-2 and MODIS. Therefore, the proposed GEUI provides an effective variable for mapping urban land and holds potential for diverse urban applications.
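The index rests on the standardized linear mixture model, so a worked unmixing step helps: each pixel spectrum is approximated as a non-negative combination of the generic S, V, and D endmembers. The endmember and pixel values below are invented for illustration, and GEUI's exact formula over the resulting fractions is not reproduced.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical 6-band endmember matrix; columns are S, V, D spectra.
E = np.array([[0.30, 0.05, 0.02],
              [0.32, 0.08, 0.02],
              [0.35, 0.06, 0.03],
              [0.38, 0.45, 0.03],
              [0.42, 0.30, 0.04],
              [0.45, 0.15, 0.04]])

pixel = np.array([0.28, 0.29, 0.30, 0.33, 0.34, 0.35])  # mixed urban pixel (made up)
fractions, resid = nnls(E, pixel)   # solves min ||E f - pixel||_2 subject to f >= 0
s, v, d = fractions
print(f"S={s:.2f} V={v:.2f} D={d:.2f} residual={resid:.3f}")
```

A high substrate fraction paired with low vegetation is the kind of SVD signature the study analyzes to separate urban land from other cover types.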
Citations: 0
Progressive uncertainty-guided network for binary segmentation in high-resolution remote sensing imagery
IF 12.2 | CAS Tier 1, Earth Science | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-07 | DOI: 10.1016/j.isprsjprs.2026.01.010
Jiepan Li , Wei He , Ting Hu , Minghao Tang , Liangpei Zhang
Binary semantic segmentation in remote sensing (RS) imagery faces persistent challenges due to complex object appearances, ambiguous boundaries, and high similarity between foreground and background, all of which introduce significant uncertainty into the prediction process. Existing approaches often treat uncertainty as either a global attribute or a pixel-level estimate, overlooking the critical role of spatial and contextual interactions. To address these limitations, we propose the Progressive Uncertainty-Guided Segmentation Network (PUGNet), a unified framework that explicitly models uncertainty in a context-aware manner. PUGNet decomposes uncertainty into three distinct components: foreground uncertainty, background uncertainty, and contextual uncertainty. This tripartite modeling enables more precise handling of local ambiguities and global inconsistencies. We adopt a coarse-to-fine decoding strategy that progressively refines features through two specialized modules. The Dynamic Uncertainty-Aware Module enhances regions of high foreground and background uncertainty using Gaussian-based modeling and contrastive learning. The Entropy-Driven Refinement Module quantifies contextual uncertainty via entropy and facilitates adaptive refinement through multi-scale context aggregation. Extensive experiments on ten public benchmark datasets, covering both single-temporal (e.g., building and cropland extraction) and bi-temporal (e.g., building change detection) binary segmentation tasks, demonstrate that PUGNet consistently achieves superior segmentation accuracy and uncertainty reduction, establishing a new state of the art in RS binary segmentation. The full implementation of the proposed framework and all experimental results can be accessed at https://github.com/Henryjiepanli/PU_RS.
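A small sketch of the entropy quantity the Entropy-Driven Refinement Module is described as using: per-pixel binary entropy of the predicted foreground probability, which peaks where the network is most ambivalent. The tensor shapes are assumptions.

```python
import torch

def binary_entropy(p: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Entropy of a Bernoulli prediction: 0 for confident pixels, ln(2) at p = 0.5."""
    p = p.clamp(eps, 1 - eps)
    return -(p * p.log() + (1 - p) * (1 - p).log())

prob = torch.sigmoid(torch.randn(1, 1, 256, 256))  # stand-in segmentation probabilities
uncertainty = binary_entropy(prob)
print(uncertainty.min().item(), uncertainty.max().item())  # bounded by [0, ln 2]
```

Such a map gives a refinement stage a direct, differentiable handle on which regions (typically boundaries and look-alike background) deserve extra context aggregation.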
Citations: 0
An enhanced spatiotemporal prediction method on landslide displacement with LDP-ConvFormer and MT-InSAR observations
IF 12.2 | CAS Tier 1, Earth Science | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-07 | DOI: 10.1016/j.isprsjprs.2025.12.027
Jianao Cai , Dongping Ming , Feng Liu , Wenyi Zhao , Mingzhi Zhang , Xiao Ling , Mengyuan Zhu , Lu Xu , Tingting Lu , Ningjie Liu , Yanfei Wei , Ming Huang
Landslide Displacement Prediction (LDP) implementation for Landslide Early Warning Systems (LEWS) using the Multi-Temporal Interferometric Synthetic Aperture Radar (MT-InSAR) technique poses significant challenges in the Three Gorges Reservoir Area (TGRA). On the one hand, the limited revisit frequency of satellites fails to satisfy the high-frequency monitoring requirements of LEWS. On the other hand, traditional LDP methods concentrate on single-point modeling, which neglects the spatial correlation between displacement points and the landslide surface. To enhance the low-frequency MT-InSAR observations, this paper proposes a new hybrid algorithm that integrates the Kalman Filter (KF) and LDP-ConvFormer to achieve enhanced spatiotemporal LDP. First, multi-orbit MT-InSAR measurements are transformed to downslope displacement. Subsequently, the multi-orbit downslope displacements are integrated by the KF to generate time series data with enhanced temporal resolution (5/7-day intervals). The KF estimations indicate that the integrated higher-resolution time series achieves high accuracy, with an RMSE of 0.431 cm and an R² of 0.974 compared to GNSS. Finally, to overcome the limitation of single-point modeling, a novel LDP-ConvFormer is constructed for enhanced spatiotemporal LDP. The Spatiotemporal Displacement Prediction Transformer (STDP-Former) employs Local Spatial Multi-Head Self-Attention (LSMHSA) and Temporal Multi-Head Self-Attention (TMHSA) to capture the displacement dependencies between different spatial locations at the same time steps and temporal relationships across different time steps, respectively. Additionally, the spatiotemporal feature map is decomposed into trend and periodic components, which are modeled separately and then summed for final predictions. Experimental results demonstrate that the constructed model can accurately establish the nonlinear relationship between the landslide displacement and its triggering factors. The LDP-ConvFormer outperforms benchmark methods, achieving RMSE: 46.29 mm, MAE: 26.7 mm, SSIM: 0.8187, PSNR: 35.62, R²: 0.9574, and EVar: 0.9603. Moreover, LDP-ConvFormer shows notable superiority in LDP over medium to long periods (60-90 days) in the TGRA. The enhanced spatiotemporal LDP method provides an extremely valuable reference for LEWS of translational landslides in the TGRA.
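The KF integration step can be illustrated with a toy one-dimensional filter that fuses interleaved, differently noisy displacement epochs from two orbits into one smoothed series. The random-walk state model and all noise values are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

def kalman_fuse(times, obs, obs_var, q=0.05):
    """Scalar Kalman filter with random-walk dynamics x_t = x_{t-1} + w,
    w ~ N(0, q*dt); each epoch's observation updates the state."""
    x, p = obs[0], obs_var[0]
    fused = [x]
    for k in range(1, len(times)):
        p += q * (times[k] - times[k - 1])   # predict: uncertainty grows with the gap
        gain = p / (p + obs_var[k])          # update: weight by relative confidence
        x += gain * (obs[k] - x)
        p *= (1 - gain)
        fused.append(x)
    return np.array(fused)

times = np.array([0, 5, 7, 12, 14, 19, 21])               # days, two interleaved orbits
obs = np.array([0.0, 0.8, 1.1, 2.0, 2.3, 3.1, 3.4])       # downslope displacement, cm
obs_var = np.array([0.2, 0.2, 0.4, 0.2, 0.4, 0.2, 0.4])   # per-orbit noise variances
print(np.round(kalman_fuse(times, obs, obs_var), 2))
```

Interleaving the two orbits is what shortens the effective sampling interval toward the 5/7-day cadence the abstract reports.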
Citations: 0
Mapping land uses following tropical deforestation with location-aware deep learning
IF 12.2 | CAS Tier 1, Earth Science | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-07 | DOI: 10.1016/j.isprsjprs.2025.12.007
Jan Pišl , Gencer Sumbul , Gaston Lenczner , Camilo Zamora , Martin Herold , Jan Dirk Wegner , Devis Tuia
The rates of tropical deforestation remain alarmingly high. To enable effective, targeted policy responses, detailed data on its driving forces is needed—each deforestation event needs to be attributed to an agricultural commodity or another land use. Remote sensing allows us to monitor land use conversion following deforestation, providing a proxy of drivers. However, recognizing individual commodities is challenging due to spectral similarities, the limited spatial resolution of free satellite imagery, and limited labeled data. To tackle these challenges, we propose a deep learning, multi-modal approach for the recognition of post-deforestation land uses from a time series of Sentinel-2 images, geographic coordinates, and country-level statistics of deforestation drivers. To integrate the modalities, we design a Transformer-based model with modality-specific encoders. The approach reaches 87% accuracy, an improvement of 10% over the image-only baseline, with little increase in data volume, computations, and model size. It works well in low-data regimes, and can be easily extended to include other modalities. Overall, this work contributes towards detailed, repeatable, and scalable mapping of deforestation landscapes, providing necessary data for the design and implementation of targeted interventions to protect tropical forests.
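The fusion pattern described (modality-specific encoders feeding a Transformer) can be sketched compactly. Everything below is an assumption for illustration: the 128-d image vector stands in for an encoded Sentinel-2 time series, and the encoder widths, token layout, and classification head are invented, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MultiModalFusion(nn.Module):
    def __init__(self, d: int = 64, n_classes: int = 7):
        super().__init__()
        self.img_enc = nn.Linear(128, d)    # stand-in for an image time-series encoder
        self.loc_enc = nn.Linear(2, d)      # geographic coordinates (lat, lon)
        self.stats_enc = nn.Linear(10, d)   # country-level driver statistics
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, n_classes)

    def forward(self, img, loc, stats):
        # One token per modality; self-attention lets each token condition the others.
        tokens = torch.stack(
            [self.img_enc(img), self.loc_enc(loc), self.stats_enc(stats)], dim=1)
        return self.head(self.fusion(tokens).mean(dim=1))

logits = MultiModalFusion()(torch.randn(8, 128), torch.randn(8, 2), torch.randn(8, 10))
print(logits.shape)  # torch.Size([8, 7]): one land-use score vector per sample
```

Because the extra modalities enter as two tiny tokens, data volume and compute grow only marginally over an image-only baseline, consistent with the abstract's claim.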
Citations: 0
Beyond synthetic scenarios: Weakly-supervised super-resolution for spatiotemporally misaligned remote sensing images
IF 12.2 | CAS Tier 1, Earth Science | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-06 | DOI: 10.1016/j.isprsjprs.2025.12.019
Quanyi Guo , Rui Liu , Yangtian Fang , Yi Gao , Jun Chen , Xin Tian
Deep learning-based remote sensing image super-resolution is crucial for enhancing the spatial resolution of Earth observation data. Due to the absence of perfectly aligned pairs of high- and low-resolution remote sensing images, most existing supervised and self-supervised approaches rely on synthetic degradation models or internal structural consistency to generate training data. Consequently, these methods suffer from the domain gap between synthetic and real datasets, which limits their ability to model realistic degradation and degrades their performance in real scenes. To overcome this challenge, we propose STANet, a weakly-supervised super-resolution method for spatiotemporally misaligned remote sensing images. In particular, STANet directly utilizes images of the same region captured by multiple satellites at different resolutions as datasets, to boost real remote sensing image super-resolution performance. However, this approach also introduces new challenges related to spatiotemporal misalignment. To address this, we design a spatiotemporal align module that includes a Scale Align Module (SAM) and a Temporal Align Module (TAM). SAM uses affine transformations to align spatial features at both the pixel and global levels, while TAM applies window-based attention to adjust the weight of image content, mitigating the misleading effects of temporal misalignment on results. Besides, we also design a style encoder based on contrastive learning and a structure encoder based on variational inference, which guide SAM and TAM for feature alignment and enhance adaptability. Finally, the feature-aligned output, after upsampling, is fused with the high-frequency-enhancing output of the texture transfer module through the weighted fusion module to generate the super-resolution image. Extensive experiments on synthetic datasets based on AID and RSSR25, real datasets captured by GaoFen satellites, and cross-satellite experiments on Landsat-8 datasets demonstrate STANet’s superiority over other state-of-the-art methods.
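SAM's affine alignment can be illustrated with PyTorch's grid-sampling primitives: an affine matrix warps one feature map into the other's frame. Here theta is hand-set rather than predicted by a network, and the shapes are arbitrary; this is a sketch of the mechanism, not STANet's module.

```python
import torch
import torch.nn.functional as F

feat = torch.randn(1, 32, 64, 64)               # hypothetical reference-view features
theta = torch.tensor([[[1.00, 0.05, 0.02],      # (N, 2, 3) affine parameters:
                       [-0.05, 1.00, -0.01]]])  # slight rotation/shear plus translation
grid = F.affine_grid(theta, list(feat.shape), align_corners=False)
aligned = F.grid_sample(feat, grid, align_corners=False)
print(aligned.shape)  # torch.Size([1, 32, 64, 64])
```

In a trainable version, theta would come from a small regression head; since grid_sample is differentiable, the alignment can be learned end-to-end from weak supervision.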
Citations: 0
Advancing table beet root yield estimation via unmanned aerial systems (UAS) multi-modal sensing
IF 12.2 | CAS Tier 1, Earth Science | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-06 | DOI: 10.1016/j.isprsjprs.2025.12.026
Mohammad S. Saif , Robert Chancia , Sean P. Murphy , Sarah Pethybridge , Jan van Aardt
Unmanned aerial systems (UAS) offer significant potential to improve agricultural practices due to their multi-modal payload capacity, ease of deployment, and lower cost. However, there is a need to expand UAS capabilities by including root crops, offering robust, growth-stage-independent models, and providing a comprehensive assessment of various imaging systems, i.e., identifying application-specific sensing modalities. This study aims to tackle those challenges by presenting a unified Gaussian Process Regression (GPR) model for predicting end-of-season table beet (a subterranean root crop) yield using UAS-derived spectral and structural features, combined with meteorological data, while remaining robust to flight and harvest timing. Field trials were conducted at Cornell AgriTech in Geneva, NY during the 2021 and 2022 growing seasons. UAS flights captured five-band (475, 560, 668, 717, and 840 nm) multispectral imagery, hyperspectral imagery (400–1000 nm), and light detection and ranging (LiDAR) data at multiple times throughout the season. Our model achieved a test R² of 0.81 and MAPE of 15.7 % using only multispectral imagery, while the hyperspectral + LiDAR model attained a test R² of 0.79 and MAPE of 17.4 %, which is comparable to recent root yield modeling studies using UAS data. Shapley analysis was performed to gain further insight into model behavior. This analysis revealed that canopy volume information carries high relative importance, compared to other features, for table beet root yield estimation. Our study demonstrated that UAS-based imaging, combined with a unified machine learning model, can effectively predict root crop yield, providing a scalable and transferable approach for precision agriculture.
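A compact Gaussian Process Regression sketch in the spirit of the unified yield model: a kernel regressor mapping UAS-derived features to yield, with predictive uncertainty included. The four features, kernel choice, and synthetic data are stand-ins, not the study's pipeline.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)
X = rng.uniform(size=(60, 4))   # e.g., NDVI, red-edge index, canopy volume, thermal time
y = 20.0 + 15.0 * X[:, 2] + 5.0 * X[:, 0] + rng.normal(0, 1.0, 60)  # toy yield, t/ha

gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(),
                               normalize_y=True)
gpr.fit(X[:45], y[:45])
mean, std = gpr.predict(X[45:], return_std=True)   # predictive mean and std per plot
print(f"MAE = {np.abs(mean - y[45:]).mean():.2f} t/ha")
```

The per-prediction standard deviation is one practical reason GPR suits yield forecasting: each estimate comes with an error bar, and Shapley values can then attribute predictions to features such as canopy volume.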
Citations: 0
MARSNet: A Mamba-driven adaptive framework for robust multisource remote sensing image matching in noisy environments
IF 12.2 | CAS Tier 1, Earth Science | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-05 | DOI: 10.1016/j.isprsjprs.2025.12.021
Weipeng Jing , Peilun Kang , Donglin Di , Jian Wang , Yang Song , Chao Li , Lei Fan
Semi-dense matching of multi-source remote sensing images under noise interference remains a challenging task. Existing detector-free methods often exhibit low efficiency and reduced performance when faced with large viewpoint variations and significant noise disturbances. Due to the inherent noise and modality differences in multi-source remote sensing images, the accuracy and robustness of feature matching are substantially compromised. To address this issue, we propose a hybrid network for multi-source remote sensing image matching based on an efficient and robust Mamba framework, named MARSNet. The network achieves efficient and robust matching through the following innovative designs: First, it leverages the efficient Mamba network to capture long-range dependencies within image sequences, enhancing the modeling capability for complex scenes. Second, a frozen pre-trained DINOv2 foundation model is introduced as a robust feature extractor, effectively improving the model’s noise resistance. Finally, an adaptive fusion strategy is employed to integrate features, and the Mamba-like linear attention mechanism is adopted to refine the Transformer-based linear attention, further enhancing the efficiency and expressive power for long-sequence processing. To validate the effectiveness of the proposed method, extensive experiments were conducted on multi-source remote sensing image datasets, covering various scenarios such as noise-free, additive random noise, and periodic stripe noise. The experimental results demonstrate that the proposed method achieves significant improvements in matching accuracy and robustness compared to state-of-the-art methods. Additionally, by performing pose error evaluation on a large-scale general dataset, the superior performance of the proposed method in 3D reconstruction is validated, complementing the test results from the multi-source remote sensing dataset, thereby providing a more comprehensive assessment of the method’s generalization ability and robustness.
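The Transformer-based linear attention being refined can be grounded with the generic kernelized form (an elu(x)+1 feature map), which reduces attention from O(N²) to O(N) in sequence length. This shows the base mechanism only; MARSNet's Mamba-like refinements on top of it are not reproduced.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps: float = 1e-6):
    """q, k, v: (B, N, D). Computes phi(q) @ (phi(k)^T v) with a row-wise
    normalizer, never materializing the N x N attention matrix."""
    q, k = F.elu(q) + 1, F.elu(k) + 1
    kv = torch.einsum("bnd,bne->bde", k, v)                        # (B, D, D) summary
    z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + eps)  # normalizer
    return torch.einsum("bnd,bde,bn->bne", q, kv, z)

q = k = v = torch.randn(2, 4096, 64)    # long token sequences from image features
print(linear_attention(q, k, v).shape)  # torch.Size([2, 4096, 64])
```

The O(N) cost is what makes attention affordable on the long token sequences produced by large remote sensing tiles, which is the efficiency argument the abstract makes.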
Citations: 0
A Spatially Masked Adaptive Gated Network for multimodal post-flood water extent mapping using SAR and incomplete multispectral data
IF 12.2 | CAS Tier 1, Earth Science | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-05 | DOI: 10.1016/j.isprsjprs.2025.12.023
Hyunho Lee, Wenwen Li
Mapping water extent during a flood event is essential for effective disaster management throughout all phases: mitigation, preparedness, response, and recovery. In particular, during the response stage, when timely and accurate information is important, Synthetic Aperture Radar (SAR) data are primarily employed to produce water extent maps. This is because SAR sensors can observe through cloud cover and operate both day and night, whereas Multispectral Imaging (MSI) data, despite providing higher mapping accuracy, are only available under cloud-free and daytime conditions. Recently, leveraging the complementary characteristics of SAR and MSI data through a multimodal approach has emerged as a promising strategy for advancing water extent mapping using deep learning models. This approach is particularly beneficial when timely post-flood observations, acquired during or shortly after the flood peak, are limited, as it enables the use of all available imagery for more accurate post-flood water extent mapping. However, the adaptive integration of partially available MSI data into the SAR-based post-flood water extent mapping process remains underexplored. To bridge this research gap, we propose the Spatially Masked Adaptive Gated Network (SMAGNet), a multimodal deep learning model that utilizes SAR data as the primary input for post-flood water extent mapping and integrates complementary MSI data through feature fusion. In experiments on the C2S-MS Floods dataset, SMAGNet consistently outperformed other multimodal deep learning models in prediction performance across varying levels of MSI data availability. Specifically, SMAGNet achieved the highest IoU score of 86.47% using SAR and MSI data and maintained the highest performance with an IoU score of 79.53% even when MSI data were entirely missing. Furthermore, we found that even when MSI data were completely missing, the performance of SMAGNet remained statistically comparable to that of a U-Net model trained solely on SAR data. These findings indicate that SMAGNet enhances the model robustness to missing data as well as the applicability of multimodal deep learning in real-world flood management scenarios. The source code is available at https://github.com/ASUcicilab/SMAGNet.
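The adaptive, availability-aware fusion can be sketched as a masked gate: SAR features always pass through, MSI features are zeroed wherever unavailable, and a learned per-pixel gate decides how much of the remaining MSI to admit. Channel widths and the gate design are assumptions, not SMAGNet's published architecture (the repository linked in the abstract has the authors' implementation).

```python
import torch
import torch.nn as nn

class MaskedGatedFusion(nn.Module):
    def __init__(self, ch: int = 32):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * ch, ch, kernel_size=1), nn.Sigmoid())

    def forward(self, sar, msi, avail):
        msi = msi * avail                            # avail: (B,1,H,W) mask, 1 = MSI valid
        g = self.gate(torch.cat([sar, msi], dim=1))  # per-pixel, per-channel gate in [0,1]
        return sar + g * msi                         # SAR-primary; MSI is a gated add-on

sar = torch.randn(1, 32, 64, 64)
msi = torch.randn(1, 32, 64, 64)
avail = torch.zeros(1, 1, 64, 64)
avail[..., :32] = 1                                  # MSI present only in the left half
print(MaskedGatedFusion()(sar, msi, avail).shape)    # torch.Size([1, 32, 64, 64])
```

When avail is all zeros the output reduces exactly to the SAR features, which matches the abstract's finding that performance degrades gracefully toward a SAR-only baseline.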
Citations: 0