Pub Date: 2026-03-01 | Epub Date: 2026-02-11 | DOI: 10.1016/j.isprsjprs.2026.02.013
Xiaoqin Yan, Zhou Huang, Shuliang Ren, Qia Zhu, Ganmin Yin, Junnan Qi, Yi Bao
Urban renewal prediction is critical for sustainable development, yet existing methods often overlook complex spatial dependencies and lack explainability. This study proposes an explainable hierarchical graph network for urban renewal (URHGN) prediction. It employs a hierarchical graph to model building- and community-level spatial interactions, with GNNExplainer used to predict renewal potential and quantify driving factors. Applied to Beijing, URHGN achieves an F1-score of 0.855 ± 0.012, outperforming traditional machine learning (0.700–0.775) and single-layer graph methods (0.789–0.810). Explainability analysis reveals that building-level features such as floor (importance: 0.485 ± 0.030) are the primary drivers, while community-level context such as house price (0.343 ± 0.003) provides essential supplementary information. Spatial relationships prove more influential than node features, with contribution scores of 0.271 ± 0.019 and 0.124 ± 0.017, respectively. The model identifies 15,990 buildings with very high renewal potential (scores > 0.8), advancing explainable GeoAI (XGeoAI) methodologies for evidence-based urban planning and laying a foundation for future dynamic models that incorporate temporal change. The code is available at https://github.com/kkxiaoqin/URHGN.
Title: Explainable urban renewal prediction at building-scale using hierarchical graph neural networks
ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 233, pp. 609–622
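The paper's hierarchical graph is a learned GNN; as a minimal illustration of the building-to-community aggregation idea it describes, the sketch below uses plain NumPy mean-pooling in place of the authors' learned message-passing layers (the function name and behavior are illustrative assumptions, not URHGN's implementation):

```python
import numpy as np

def hierarchical_message_pass(building_feats, community_of, n_communities):
    """One illustrative hierarchical pass: pool building features into their
    community, then concatenate the community context back onto each building
    (mean-pooling stands in for a learned GNN aggregation)."""
    dim = building_feats.shape[1]
    community_feats = np.zeros((n_communities, dim))
    counts = np.zeros(n_communities)
    for i, c in enumerate(community_of):
        community_feats[c] += building_feats[i]
        counts[c] += 1
    community_feats /= np.maximum(counts, 1)[:, None]
    # broadcast each building's community context back onto the building
    return np.concatenate([building_feats, community_feats[community_of]], axis=1)

feats = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]])
out = hierarchical_message_pass(feats, [0, 0, 1], 2)
print(out.shape)  # (3, 4)
```

A renewal-potential classifier would then operate on these context-augmented building vectors rather than on isolated building features.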
Pub Date: 2026-03-01 | Epub Date: 2026-01-30 | DOI: 10.1016/j.isprsjprs.2026.01.035
Lizhi Liu, Lijie Huang, Yiding Wang, Pingping Lu, Bo Li, Liang Li, Robert Wang, Yirong Wu
During solar maximum, low-frequency spaceborne Polarimetric Synthetic Aperture Radar (PolSAR) systems suffer ionosphere-induced distortions that couple with system-induced polarimetric distortions. High-precision decoupled polarimetric calibration is therefore essential for obtaining high-fidelity PolSAR data. Existing point-target calibration methods lack a general approach for unbiased estimation of polarimetric distortion across multiple polarimetric modes and calibrator combinations, particularly under spatiotemporally varying ionospheric conditions. To address this, we derive the necessary conditions for unbiased estimation and propose a General Polarimetric Calibration Method (GPCM) applicable to various configurations. In addition, Enhanced Multi-Look Autofocus (EMLA), a modified STEC inversion method, is introduced for precise inversion of Slant Total Electron Content (STEC), enabling estimation of the spatiotemporally varying Faraday rotation angle for system distortion decoupling and PolSAR data compensation. GPCM applied to LuTan-1 HP and QP data yields HH/VV amplitude and phase imbalances of 0.0433 dB (STD: 0.017) and −0.60° (STD: 1.02°), respectively, measured on trihedral corner reflectors. Calibration results also indicate that QP-mode isolation exceeds 39 dB, while estimated axial ratios for HP mode are below 0.115 dB. Under comparable conditions, the results of GPCM are consistent with the Freeman analytical method. Furthermore, EMLA outperforms existing STEC inversion methods (COA, MLA, and GIM-based mapping), achieving a mean absolute difference of 1.95 TECU against in-situ measurements while demonstrating applicability to general scenes. Overall, the effectiveness of GPCM and EMLA in the LuTan-1 calibration mission is confirmed, indicating their potential for future PolSAR calibration tasks. The primary calibrated experimental dataset is publicly available at https://radars.ac.cn/web/data/getData?dataType=HPSAREADEN&pageType=en.
Title: An advanced decoupled polarimetric calibration method for the LuTan-1 hybrid- and quadrature-polarimetric modes
ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 233, pp. 310–327
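Under the standard one-way Faraday rotation model used in PolSAR calibration, the observed scattering matrix is the true matrix pre- and post-multiplied by a rotation through the Faraday angle Ω; given an Ω estimate (e.g., from STEC inversion), compensation applies the inverse rotation on both sides. This is a minimal sketch of that compensation step alone, not the paper's GPCM/EMLA pipeline, which additionally handles system distortions:

```python
import numpy as np

def faraday_rotation(omega_rad):
    """One-way Faraday rotation matrix for angle omega (radians)."""
    c, s = np.cos(omega_rad), np.sin(omega_rad)
    return np.array([[c, s], [-s, c]])

def compensate(M_obs, omega_rad):
    """Undo rotations on transmit and receive:
    M_obs = R(omega) @ S @ R(omega)  =>  S = R(-omega) @ M_obs @ R(-omega)."""
    R_inv = faraday_rotation(-omega_rad)
    return R_inv @ M_obs @ R_inv

# simulate a distorted observation and recover the true scattering matrix
S_true = np.array([[1.0, 0.1], [0.1, 0.8]])
omega = np.deg2rad(10)
R = faraday_rotation(omega)
M_obs = R @ S_true @ R
S_rec = compensate(M_obs, omega)
print(np.allclose(S_rec, S_true))  # True
```

In practice Ω varies spatiotemporally, which is why the paper inverts STEC per scene rather than assuming a single angle.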
Cropland non-agriculturalization (CNA) refers to the conversion of cropland into non-agricultural land such as construction land or ponds, posing threats to food security and ecological balance. Remote sensing technology enables precise monitoring of this process, but bi-temporal methods are susceptible to errors caused by seasonal spectral fluctuations, weather interference, and imaging discrepancies, often leading to false detections. Existing methods, which lack support from temporal datasets, struggle to disentangle the spectral confusion of gradual non-agriculturalization and short-term disturbances, thereby limiting the accuracy of dynamic cropland resource monitoring. To address this issue, a novel phenology-aware temporal change detection network (PANet) is proposed to solve the misclassification challenges in CNA detection caused by “same object with different spectra” and “different objects with similar spectra” issues. A phenology-aware module (PATM) is designed, leveraging a dual-driven decoupling model to dynamically weight phenology-sensitive periods and adaptively represent non-uniform temporal intervals. Through a time-aligned feature enhancement strategy and dual-driven (intra-annual/inter-annual) temporal decay functions, PANet simultaneously focuses on short-term anomalies and robustly models long-term trends. Additionally, a sample balance adjustment module (DFBL) is developed to mitigate the impact of sample imbalance by incorporating prior knowledge of changes and dynamic adjustment factors, enhancing the model’s sensitivity to non-agriculturalization changes. Furthermore, the first high-resolution CNA dataset based on actual production data is constructed, containing 1295 pairs of 512 × 512 masked images. Compared to existing datasets, this dataset offers extensive temporal coverage, capturing comprehensive seasonal periodic characteristics of cropland. 
Comparative experiments with several classical time-series and bi-temporal methods validate the effectiveness of PANet. On the LHCD dataset, PANet achieves the highest F1 scores of 61.01% and 61.70%. PANet accurately captures CNA information, supporting the scientific management and sustainable utilization of limited cropland resources. The LHCD dataset can be downloaded from https://github.com/mss-s/LHCD.
Title: PANet: A multi-scale temporal decoupling network and its high-resolution benchmark dataset for detecting pseudo changes in cropland non-agriculturalization
Authors: Songman Sui, Jixian Zhang, Haiyan Gu, Yue Chang
Pub Date: 2026-03-01 | DOI: 10.1016/j.isprsjprs.2026.01.029
ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 233, pp. 126–143
Pub Date: 2026-03-01 | Epub Date: 2026-01-23 | DOI: 10.1016/j.isprsjprs.2026.01.024
Liuqian Wang, Jing Zhang, Guangming Mi, Li Zhuo
Fine-grained object recognition (FGOR) is gaining increasing attention in automated remote sensing analysis and interpretation (RSAI). However, the full potential of FGOR in remote sensing images (RSIs) is still constrained by several key issues: the reliance on high-quality labeled data, the difficulty of reconstructing fine details in low-resolution images, and the limited robustness of FGOR models in distinguishing similar object categories. In response, we propose an automatic fine-grained object recognition network (AutoFGOR) that follows a hierarchical dual-pipeline architecture for object analysis at global and regional levels. Pipeline I, a region detection network, leverages a geometric invariance module for weakly supervised learning to improve detection accuracy on sparsely labeled RSIs and extract category-free regions. Building on this, Pipeline II, regional diffusion with a vision-language model (RD-VLM), pioneers the combination of Stable Diffusion XL (SDXL) and the Large Language and Vision Assistant (LLaVA) through a specially designed adaptive resolution adaptor (ARA) for object-region super-resolution reconstruction, fundamentally addressing the difficulty of extracting and mining fine-grained features from low-quality regions. In addition, we introduce a winner-takes-all (WTA) strategy that uses a voting mechanism to enhance the reliability of fine-grained classification in complex scenes. Experimental results on the FAIR1M-v2.0, VEDAI, and HRSC2016 datasets show that AutoFGOR achieves 31.72%, 80.25%, and 88.05% mAP, respectively, delivering highly competitive performance. In addition, the ×4 reconstruction results achieve scores of 0.5275 and 0.8173 on the MANIQA and CLIP-IQA indicators, respectively. The code will be available on GitHub: https://github.com/BJUT-AIVBD/AutoFGOR.
Title: Weak supervision makes strong details: fine-grained object recognition in remote sensing images via regional diffusion with VLM
ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 233, pp. 231–246
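A winner-takes-all vote over per-region predictions can be as simple as a majority vote; the confidence-based tie-breaking rule below is an assumption, since the abstract does not specify how ties are resolved:

```python
from collections import Counter, defaultdict

def winner_takes_all(preds):
    """Aggregate (label, confidence) predictions from multiple region views
    into one object label: majority vote, ties broken by summed confidence."""
    votes = Counter(label for label, _ in preds)
    conf = defaultdict(float)
    for label, c in preds:
        conf[label] += c
    return max(votes, key=lambda label: (votes[label], conf[label]))

result = winner_takes_all([("cargo_ship", 0.7), ("tanker", 0.9), ("cargo_ship", 0.6)])
print(result)  # cargo_ship
```

Even though the single "tanker" vote carries the highest individual confidence, the two consistent "cargo_ship" votes win, which is the robustness property a WTA strategy targets in cluttered scenes.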
Pub Date: 2026-03-01 | Epub Date: 2026-01-22 | DOI: 10.1016/j.isprsjprs.2026.01.026
Hua Su, Weiqi Xie, Luping You, Sihui Li, Dian Lin, An Wang
High-resolution ocean subsurface density is crucial for studying dynamic processes and stratification within the ocean under recent global warming. This study proposes a novel deep learning model, DDFNet (Dual-task Densely-Former Network), to address the challenge of reconstructing high-resolution, high-reliability global ocean subsurface density. DDFNet employs multi-scale feature extraction, attention mechanisms, and a dual-label design, combining an encoder–decoder backbone with a global spatial attention module to effectively capture the complex spatiotemporal relationships in ocean data. The model takes multisource surface remote sensing data as input and uses Argo profile data and ORAS5 reanalysis data as labels. An adaptive weighted loss function dynamically balances the contributions of the two label types, improving reconstruction accuracy and achieving a spatial resolution of 0.25° × 0.25°. By constructing dual tasks with in situ observations and reanalysis data for joint learning, the model better captures the true state of the ocean and the consistency of physical processes. Experimental results demonstrate that DDFNet outperforms widely used LightGBM and CNN models, with the reconstructed DDFNet-SD dataset achieving an R² of 0.9863 and an RMSE of 0.2804 kg/m³. The dataset further reveals a declining trend in global ocean subsurface density of −4.47 × 10⁻⁴ kg/m³ per decade, most pronounced in the upper 0–700 m, likely associated with global ocean warming and salinity changes. This high-resolution dataset facilitates studies of mesoscale ocean dynamics, stratification variability, and climate-change impacts.
Title: Detecting global ocean subsurface density change with high-resolution via dual-task densely-former
ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 233, pp. 158–179
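One common way to realize an adaptive weighted dual-label loss is homoscedastic-uncertainty weighting with learnable log-variances (Kendall et al. style); the paper's exact balancing scheme may differ, so treat this as an illustrative sketch:

```python
import numpy as np

def adaptive_dual_loss(pred, argo_label, oras5_label, log_var_a=0.0, log_var_o=0.0):
    """Combine the two supervision sources (in situ Argo profiles and ORAS5
    reanalysis) with learnable per-task log-variances: each MSE is scaled by
    a precision exp(-log_var), and the log_var terms regularize against
    inflating the uncertainty to zero out a task."""
    mse_a = np.mean((pred - argo_label) ** 2)
    mse_o = np.mean((pred - oras5_label) ** 2)
    return (np.exp(-log_var_a) * mse_a + log_var_a
            + np.exp(-log_var_o) * mse_o + log_var_o)

pred = np.array([1.0, 2.0])
loss = adaptive_dual_loss(pred, np.array([1.0, 2.0]), np.array([1.5, 2.5]))
print(loss)  # 0.25
```

During training the log-variances would be optimized jointly with the network, letting the balance between observation fidelity and physical consistency adapt rather than being hand-tuned.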
Pub Date: 2026-03-01 | Epub Date: 2026-01-19 | DOI: 10.1016/j.isprsjprs.2026.01.020
Wenxia Gan, Yu Feng, Jianhao Miao, Xinghua Li, Huanfeng Shen
The diversity of satellite remote sensing images has significantly enhanced the capability to observe Earth-surface information. However, multi-temporal optical images acquired from different sensor platforms often exhibit substantial radiometric discrepancies, and overlapping reference images are difficult to obtain, posing critical challenges for seamless large-scale mosaicking: global radiometric inconsistency, unsmooth local transitions, and visible seamlines. Existing traditional and deep learning methods achieve reasonable performance on paired datasets but often struggle to balance spatial structural integrity with radiometric consistency and to generalize to unseen images. To address these issues, a wavelet-enhanced radiometric normalization network, WEGLA-NormGAN, is proposed to generate radiometrically normalized imagery with strong radiometric consistency and spatial fidelity. The framework integrates frequency-domain and spatial-domain information to achieve consistent multi-scale radiometric feature modeling while preserving spatial structure. First, a wavelet transform is introduced to decouple radiometric information from structural features, explicitly enhancing radiometric feature representation and edge-texture preservation. Second, a U-Net architecture with multi-scale modeling advantages is fused with an adaptive attention mechanism incorporating residual structures.
The proposed framework generates radiometrically normalized imagery that harmonizes radiometric consistency with spatial fidelity, while achieving outstanding radiometric normalization even in unseen scenarios. Extensive experiments were conducted on two public datasets and a self-constructed dataset. The results demonstrate that WEGLA-NormGAN outperforms seven state-of-the-art methods in cross-temporal scenarios and five in cross-spatiotemporal scenarios in terms of radiometric consistency, structural fidelity, and robustness. The code is available at https://github.com/WITRS/WeGLA-Norm.git.
Title: WEGLA-NormGAN: wavelet-enhanced Cycle-GAN with global-local attention for radiometric normalization of remote sensing images
ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 233, pp. 39–54
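The wavelet decoupling the abstract relies on is easy to see with a one-level Haar transform: a global radiometric shift moves only the low-frequency LL band, leaving the edge/texture detail bands untouched. A self-contained sketch (Haar implemented directly, for inputs with even dimensions; WEGLA-NormGAN's actual wavelet choice is not specified in the abstract):

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar transform: LL carries low-frequency (radiometric)
    content; LH/HL/HH carry the edge and texture detail to be preserved."""
    a = (img[0::2] + img[1::2]) / 2.0  # row averages
    d = (img[0::2] - img[1::2]) / 2.0  # row details
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

img = np.arange(16, dtype=float).reshape(4, 4)
LL1, LH1, HL1, HH1 = haar_dwt2(img)
LL2, LH2, HL2, HH2 = haar_dwt2(img + 50.0)  # global radiometric shift
print(np.allclose(LH1, LH2), np.allclose(LL2 - LL1, 50.0))  # True True
```

Because the radiometric offset lives entirely in LL, a normalization network can adjust that band aggressively while constraining the detail bands, which is the structural-fidelity argument behind the wavelet branch.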
Pub Date: 2026-03-01 | Epub Date: 2026-01-16 | DOI: 10.1016/j.isprsjprs.2026.01.008
Yifan Sun, Chenguang Dai, Wenke Li, Xinpu Liu, Yongqi Sun, Ye Zhang, Weijun Guan, Yongsheng Zhang, Yulan Guo, Hanyun Wang
LiDAR point cloud semantic segmentation is crucial for scene understanding in autonomous driving, yet the sparse and textureless nature of point clouds poses major challenges for this task. To address this, numerous studies have explored leveraging the dense color and fine-grained texture of RGB images for multi-modality 3D semantic segmentation. Nevertheless, these methods face limitations in complex scenarios because RGB images degrade under poor lighting conditions. In contrast, thermal infrared (TIR) images provide thermal radiation information about road objects and are robust to illumination change, offering advantages complementary to RGB images. Therefore, in this work we introduce RTPSeg, the first and only multi-modality dataset to simultaneously provide RGB and TIR images for point cloud semantic segmentation. RTPSeg includes over 3000 synchronized frames collected by an RGB camera, an infrared camera, and a LiDAR, providing over 248M point-wise annotations for 18 semantic categories in autonomous driving, covering urban and village scenes during both daytime and nighttime. Based on RTPSeg, we also propose RTPSegNet, a baseline model for point cloud semantic segmentation jointly assisted by RGB and TIR images. Extensive experiments demonstrate that RTPSeg presents considerable challenges and opportunities for existing point cloud semantic segmentation approaches, and that RTPSegNet effectively exploits the complementary information among point clouds, RGB images, and TIR images. More importantly, the results confirm that 3D semantic segmentation can be enhanced by introducing the additional TIR modality, revealing the potential of this line of research. We anticipate that RTPSeg will catalyze in-depth research in this field. Both RTPSeg and RTPSegNet will be released at https://github.com/sssssyf/RTPSeg.
{"title":"RTPSeg: A multi-modality dataset for LiDAR point cloud semantic segmentation assisted with RGB-thermal images in autonomous driving","authors":"Yifan Sun , Chenguang Dai , Wenke Li , Xinpu Liu , Yongqi Sun , Ye Zhang , Weijun Guan , Yongsheng Zhang , Yulan Guo , Hanyun Wang","doi":"10.1016/j.isprsjprs.2026.01.008","DOIUrl":"10.1016/j.isprsjprs.2026.01.008","url":null,"abstract":"<div><div>LiDAR point cloud semantic segmentation is crucial for scene understanding in autonomous driving, yet the sparse and textureless nature of point clouds poses significant challenges for this task. To address this, numerous studies have explored leveraging the dense color and fine-grained texture of RGB images for multi-modality 3D semantic segmentation. Nevertheless, these methods still face limitations in complex scenarios, as RGB images degrade under poor lighting conditions. In contrast, thermal infrared (TIR) images capture the thermal radiation of road objects and are robust to illumination changes, offering advantages complementary to RGB images. Therefore, in this work we introduce RTPSeg, the first multi-modality dataset to simultaneously provide RGB and TIR images for point cloud semantic segmentation. RTPSeg includes over 3000 synchronized frames collected by an RGB camera, an infrared camera, and a LiDAR, providing over 248M pointwise annotations for 18 semantic categories in autonomous driving and covering urban and village scenes during both daytime and nighttime. Based on RTPSeg, we also propose RTPSegNet, a baseline model for point cloud semantic segmentation jointly assisted by RGB and TIR images. Extensive experiments demonstrate that the RTPSeg dataset presents considerable challenges and opportunities for existing point cloud semantic segmentation approaches, and that RTPSegNet effectively exploits the complementary information among point clouds, RGB images, and TIR images.
More importantly, the experimental results confirm that 3D semantic segmentation can be effectively enhanced by introducing the additional TIR image modality, revealing the promising potential of this line of research. We anticipate that RTPSeg will catalyze in-depth research in this field. Both RTPSeg and RTPSegNet will be released at <span><span>https://github.com/sssssyf/RTPSeg</span></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 25-38"},"PeriodicalIF":12.2,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
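A common first step in the kind of camera-LiDAR fusion this entry describes is to project each 3D point into the image plane and gather the corresponding pixel values, so that every point carries geometry plus RGB and TIR appearance. The sketch below is a minimal illustration under assumed placeholder calibration (the intrinsics `K`, extrinsics `T`, and toy images are made up for the example); it is not the RTPSegNet architecture.

```python
import numpy as np

def project_points(points, K, T):
    """Project Nx3 LiDAR points into an image using a 4x4 LiDAR-to-camera
    extrinsic T and a 3x3 intrinsic K. Returns pixel coords and a depth mask."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])  # homogeneous coords
    cam = (T @ pts_h.T).T[:, :3]                            # camera frame
    in_front = cam[:, 2] > 1e-6                             # keep points ahead of the camera
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                             # perspective divide
    return uv, in_front

def gather_pixel_features(points, image, K, T):
    """Nearest-neighbour lookup of per-pixel features (e.g. RGB or TIR values)
    for each point; points falling outside the frame get zeros."""
    h, w = image.shape[:2]
    uv, valid = project_points(points, K, T)
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    valid &= (u >= 0) & (u < w) & (v >= 0) & (v < h)
    feats = np.zeros((len(points), image.shape[2]), dtype=image.dtype)
    feats[valid] = image[v[valid], u[valid]]
    return feats

# Fuse by simple concatenation: [xyz | RGB feats | TIR feats]
points = np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 5.0]])
K = np.array([[100.0, 0.0, 64.0], [0.0, 100.0, 64.0], [0.0, 0.0, 1.0]])
T = np.eye(4)                     # assume LiDAR and camera frames coincide
rgb = np.ones((128, 128, 3))      # toy 3-channel RGB image
tir = np.ones((128, 128, 1))      # toy 1-channel thermal image
fused = np.hstack([points,
                   gather_pixel_features(points, rgb, K, T),
                   gather_pixel_features(points, tir, K, T)])
print(fused.shape)  # (2, 7)
```

Real pipelines would use per-sensor calibration, occlusion handling, and learned (rather than raw) image features, but the projection-and-gather pattern is the same.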
A hyperspectral image (HSI) contains information across many spectral bands, making it valuable in real-world applications such as environmental monitoring, agriculture, and remote sensing. However, the acquisition process often introduces noise, necessitating effective HSI denoising methods to maintain its applicability. Deep learning (DL) is considered the de facto approach for HSI denoising, but it requires a large number of training samples to optimize network parameters for effective denoising. Obtaining such extensive datasets is challenging in HSI, leading to epistemic uncertainty and thereby deteriorating denoising performance. This paper introduces a novel supervised contrastive learning (SCL) method, RECREATE, to enhance feature learning and mitigate epistemic uncertainty for HSI denoising. Furthermore, we explore image inpainting as an auxiliary task to improve HSI denoising performance. By adding HSI inpainting to SCL, our method enhances HSI denoising by enlarging the training dataset and enforcing improved feature learning. Experimental results on various HSI datasets validate the efficacy of RECREATE, showcasing its potential for integration with existing HSI denoising techniques to enhance their performance, both qualitatively and quantitatively. This method holds promise for addressing the limitations posed by limited training data and thereby advancing the field toward better HSI denoising methods.
{"title":"RECREATE: Supervised contrastive learning and inpainting based hyperspectral image denoising","authors":"Aditya Dixit , Anup Kumar Gupta , Puneet Gupta , Ankur Garg","doi":"10.1016/j.isprsjprs.2026.01.022","DOIUrl":"10.1016/j.isprsjprs.2026.01.022","url":null,"abstract":"<div><div>A hyperspectral image (HSI) contains information across many spectral bands, making it valuable in real-world applications such as environmental monitoring, agriculture, and remote sensing. However, the acquisition process often introduces noise, necessitating effective HSI denoising methods to maintain its applicability. Deep learning (DL) is considered the de facto approach for HSI denoising, but it requires a large number of training samples to optimize network parameters for effective denoising. Obtaining such extensive datasets is challenging in HSI, leading to epistemic uncertainty and thereby deteriorating denoising performance. This paper introduces a novel supervised contrastive learning (SCL) method, <em>RECREATE</em>, to enhance feature learning and mitigate epistemic uncertainty for HSI denoising. Furthermore, we explore image inpainting as an auxiliary task to improve HSI denoising performance. By adding HSI inpainting to SCL, our method enhances HSI denoising by enlarging the training dataset and enforcing improved feature learning. Experimental results on various HSI datasets validate the efficacy of <em>RECREATE</em>, showcasing its potential for integration with existing HSI denoising techniques to enhance their performance, both qualitatively and quantitatively.
This method holds promise for addressing the limitations posed by limited training data and thereby advancing the field toward better HSI denoising methods.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 14-24"},"PeriodicalIF":12.2,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
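For background on the supervised contrastive objective this entry refers to, the following is a minimal NumPy sketch of the standard SupCon loss (Khosla et al.): same-label embeddings are treated as positives and pulled together under a temperature-scaled softmax over all other samples. The feature vectors, labels, and temperature are toy assumptions, and this is not the authors' exact RECREATE objective.

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Supervised contrastive (SupCon) loss on L2-normalised features:
    each anchor's same-label samples act as positives, all others as negatives."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = f @ f.T / temperature
    n = len(labels)
    eye = np.eye(n, dtype=bool)
    # log-softmax over all samples except the anchor itself
    logits_masked = np.where(eye, -np.inf, logits)      # exclude self-similarity
    log_prob = logits - np.log(np.exp(logits_masked).sum(axis=1, keepdims=True))
    pos = (labels[:, None] == labels[None, :]) & ~eye   # positive-pair mask
    # mean log-likelihood of positives per anchor (each anchor needs >= 1 positive)
    loss = -(log_prob * pos).sum(axis=1) / pos.sum(axis=1)
    return loss.mean()

# Two tight clusters: pairs (0, 1) and (2, 3) share labels
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])
print(supervised_contrastive_loss(feats, labels))
```

The loss drops when same-label features are close on the unit sphere, which is the property the abstract exploits to improve feature learning with few samples.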
Pub Date: 2026-02-01 · Epub Date: 2026-01-07 · DOI: 10.1016/j.isprsjprs.2026.01.010
Jiepan Li, Wei He, Ting Hu, Minghao Tang, Liangpei Zhang
Binary semantic segmentation in remote sensing (RS) imagery faces persistent challenges due to complex object appearances, ambiguous boundaries, and high similarity between foreground and background, all of which introduce significant uncertainty into the prediction process. Existing approaches often treat uncertainty as either a global attribute or a pixel-level estimate, overlooking the critical role of spatial and contextual interactions. To address these limitations, we propose the Progressive Uncertainty-Guided Segmentation Network (PUGNet), a unified framework that explicitly models uncertainty in a context-aware manner. PUGNet decomposes uncertainty into three distinct components: foreground uncertainty, background uncertainty, and contextual uncertainty. This tripartite modeling enables more precise handling of local ambiguities and global inconsistencies. We adopt a coarse-to-fine decoding strategy that progressively refines features through two specialized modules. The Dynamic Uncertainty-Aware Module enhances regions of high foreground and background uncertainty using Gaussian-based modeling and contrastive learning. The Entropy-Driven Refinement Module quantifies contextual uncertainty via entropy and facilitates adaptive refinement through multi-scale context aggregation. Extensive experiments on ten public benchmark datasets, covering both single-temporal (e.g., building and cropland extraction) and bi-temporal (e.g., building change detection) binary segmentation tasks, demonstrate that PUGNet consistently achieves superior segmentation accuracy and uncertainty reduction, establishing a new state of the art in RS binary segmentation. The full implementation of the proposed framework and all experimental results can be accessed at https://github.com/Henryjiepanli/PU_RS.
{"title":"Progressive uncertainty-guided network for binary segmentation in high-resolution remote sensing imagery","authors":"Jiepan Li , Wei He , Ting Hu , Minghao Tang , Liangpei Zhang","doi":"10.1016/j.isprsjprs.2026.01.010","DOIUrl":"10.1016/j.isprsjprs.2026.01.010","url":null,"abstract":"<div><div>Binary semantic segmentation in remote sensing (RS) imagery faces persistent challenges due to complex object appearances, ambiguous boundaries, and high similarity between foreground and background, all of which introduce significant uncertainty into the prediction process. Existing approaches often treat uncertainty as either a global attribute or a pixel-level estimate, overlooking the critical role of spatial and contextual interactions. To address these limitations, we propose the <strong>Progressive Uncertainty-Guided Segmentation Network (PUGNet)</strong>, a unified framework that explicitly models uncertainty in a context-aware manner. PUGNet decomposes uncertainty into three distinct components: <strong>foreground uncertainty</strong>, <strong>background uncertainty</strong>, and <strong>contextual uncertainty</strong>. This tripartite modeling enables more precise handling of local ambiguities and global inconsistencies. We adopt a coarse-to-fine decoding strategy that progressively refines features through two specialized modules. The <strong>Dynamic Uncertainty-Aware Module</strong> enhances regions of high foreground and background uncertainty using Gaussian-based modeling and contrastive learning. The <strong>Entropy-Driven Refinement Module</strong> quantifies contextual uncertainty via entropy and facilitates adaptive refinement through multi-scale context aggregation. 
Extensive experiments on ten public benchmark datasets, covering both single-temporal (<em>e.g.</em>, building and cropland extraction) and bi-temporal (<em>e.g.</em>, building change detection) binary segmentation tasks, demonstrate that PUGNet consistently achieves superior segmentation accuracy and uncertainty reduction, establishing a new state of the art in RS binary segmentation. The full implementation of the proposed framework and all experimental results can be accessed at <span><span>https://github.com/Henryjiepanli/PU_RS</span></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 561-577"},"PeriodicalIF":12.2,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145925511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
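To make the entropy-driven notion of uncertainty in this entry concrete: for binary segmentation, the per-pixel Shannon entropy of the predicted foreground probability is maximal (1 bit) at p = 0.5 and near zero for confident predictions. A minimal sketch, assuming a toy probability map and an arbitrary 0.9-bit threshold (this is not the paper's Entropy-Driven Refinement Module):

```python
import numpy as np

def binary_entropy_map(prob, eps=1e-7):
    """Per-pixel Shannon entropy (in bits) of a foreground probability map;
    values near 0.5 give high entropy (uncertain), near 0 or 1 give low."""
    p = np.clip(prob, eps, 1 - eps)  # avoid log(0)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

# Toy 2x2 prediction: one confident foreground, one maximally ambiguous,
# one confident background, one moderately uncertain pixel
prob = np.array([[0.99, 0.5],
                 [0.02, 0.7]])
ent = binary_entropy_map(prob)
uncertain = ent > 0.9  # flag ambiguous pixels for refinement (threshold assumed)
print(np.round(ent, 3))
```

A refinement stage would then spend extra computation (e.g., multi-scale context aggregation, as the abstract describes) only on the flagged pixels.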
Pub Date: 2026-02-01 · Epub Date: 2026-01-12 · DOI: 10.1016/j.isprsjprs.2026.01.009
Shoujun Jia, Lotte de Vugt, Andreas Mayr, Katharina Anders, Chun Liu, Martin Rutzinger
Estimating complex 3D topographic surface changes, including rigid spatial movement and non-rigid morphological deformation, is essential for investigating Earth surface dynamics. However, current 3D point comparison approaches struggle to separate rigid from non-rigid topographic surface changes in multi-temporal 3D point clouds. These methods are further challenged by topographic surface roughness and point cloud heterogeneity (i.e., discrete and irregular point distributions). To address these challenges, we consider the dynamic evolution of topographic surfaces as the geometric changes of Riemann manifold surfaces. By building Euclidean (straight) and non-Euclidean (curved) coordinate systems on Riemann manifold surfaces represented from point clouds, the rigid transformation and non-rigid deformation of these surfaces are solved to conceptualize rigid and non-rigid change tensors, respectively. On this basis, we design rigid (i.e., translation and rotation) and non-rigid (i.e., stretch and distortion) change features to describe various topographic surface changes and quantify the associated uncertainties to capture significant changes. The proposed method is tested on pairwise point clouds with simulated and real topographic surface changes in mountain regions. Simulation experiments demonstrate that the proposed method performs better than the baseline (M3C2) and a state-of-the-art method (LOG), with higher translation accuracy (more than 50% improvement), lower translation uncertainty (more than 61% reduction), and strong robustness to varying point densities. The results also show that the proposed method accurately quantifies three additional types of change features, with mean accuracies of 1.5° for rotation, 0.5% for stretch, and 3.5° for distortion.
Moreover, real-scene experiments demonstrate the effectiveness of the proposed method in estimating various topographic changes in real environments, its applicability to analyzing geomorphological processes, and its potential for improving our understanding of spatiotemporal patterns of Earth surface dynamics.
{"title":"Change tensor: Estimating complex topographic changes from point clouds using Riemann manifold surfaces","authors":"Shoujun Jia , Lotte de Vugt , Andreas Mayr , Katharina Anders , Chun Liu , Martin Rutzinger","doi":"10.1016/j.isprsjprs.2026.01.009","DOIUrl":"10.1016/j.isprsjprs.2026.01.009","url":null,"abstract":"<div><div>Estimating complex 3D topographic surface changes, including rigid spatial movement and non-rigid morphological deformation, is essential for investigating Earth surface dynamics. However, current 3D point comparison approaches struggle to separate rigid from non-rigid topographic surface changes in multi-temporal 3D point clouds. These methods are further challenged by topographic surface roughness and point cloud heterogeneity (i.e., discrete and irregular point distributions). To address these challenges, we consider the dynamic evolution of topographic surfaces as the geometric changes of Riemann manifold surfaces. By building Euclidean (straight) and non-Euclidean (curved) coordinate systems on Riemann manifold surfaces represented from point clouds, the rigid transformation and non-rigid deformation of these surfaces are solved to conceptualize rigid and non-rigid change tensors, respectively. On this basis, we design rigid (i.e., translation and rotation) and non-rigid (i.e., stretch and distortion) change features to describe various topographic surface changes and quantify the associated uncertainties to capture significant changes. The proposed method is tested on pairwise point clouds with simulated and real topographic surface changes in mountain regions.
Simulation experiments demonstrate that the proposed method performs better than the baseline (M3C2) and a state-of-the-art method (LOG), with higher translation accuracy (more than 50% improvement), lower translation uncertainty (more than 61% reduction), and strong robustness to varying point densities. The results also show that the proposed method accurately quantifies three additional types of change features, with mean accuracies of 1.5° for rotation, 0.5% for stretch, and 3.5° for distortion. Moreover, real-scene experiments demonstrate the effectiveness of the proposed method in estimating various topographic changes in real environments, its applicability to analyzing geomorphological processes, and its potential for improving our understanding of spatiotemporal patterns of Earth surface dynamics.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 766-786"},"PeriodicalIF":12.2,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145957294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
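As a much simpler point of reference for separating rigid from non-rigid change, the classical Kabsch/Procrustes SVD solution recovers the best-fit rigid transform between two matched point sets; whatever displacement the rigid fit cannot explain is then a crude proxy for non-rigid deformation. A minimal sketch on synthetic, noise-free data (not the paper's Riemann-manifold method):

```python
import numpy as np

def kabsch_rigid(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst via SVD (Kabsch)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)            # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t

# Synthetic epoch pair: rotate a random surface patch by 10 deg and translate it
rng = np.random.default_rng(0)
src = rng.normal(size=(100, 3))
theta = np.deg2rad(10.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
dst = src @ R_true.T + np.array([0.3, -0.1, 0.05])

R, t = kabsch_rigid(src, dst)
# Residual after removing the rigid motion: the non-rigid remainder
residual = dst - (src @ R.T + t)
print(np.abs(residual).max())
```

Here the change is purely rigid, so the residual is numerically zero; with real deformation the residual field would carry the stretch/distortion signal that the paper's non-rigid change tensor formalizes. Note this assumes known point correspondences, which multi-temporal surveys generally lack.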