Pub Date: 2025-12-26 | DOI: 10.1016/j.isprsjprs.2025.12.011
Deren Li, Mi Wang, Jing Xiao, Bo Yang
The physical world, governed by the perpetual dynamics of matter, is inherently structured by spatio-temporal information. In the current Intelligence Era, rapid advances in data collection and Artificial Intelligence (AI) technologies have enabled large-scale acquisition and analysis of dynamic spatio-temporal information. These developments have led to the emergence of Spatio-Temporal Intelligence (STI), an interdisciplinary field that integrates spatio-temporal data with AI-driven computational methodologies to model, interpret, and manage complex physical, environmental, and social processes.
In this paper, we offer a perspective on the mission and evolving scope of STI and identify its critical challenges. A general STI framework comprising five interconnected components is proposed to support adaptive observation, multi-modal modeling, causal reasoning, and knowledge-driven service delivery. Through a case study in national park ecological monitoring, we demonstrate how STI enables large-scale, precise, and real-time environmental understanding. Distinct from approaches that simulate symbolic or linguistic cognition, STI is grounded in the physical world and leverages high-dimensional sensor data to enable machine perception, foster new cognitive paradigms, and enhance decision-making across domains.
{"title":"Perspectives on spatio-temporal intelligence","authors":"Deren Li , Mi Wang , Jing Xiao , Bo Yang","doi":"10.1016/j.isprsjprs.2025.12.011","DOIUrl":"10.1016/j.isprsjprs.2025.12.011","url":null,"abstract":"<div><div>The physical world, governed by the perpetual dynamics of matter, is inherently structured by spatio-temporal information. In the current Intelligence Era, rapid advances in data collection and Artificial Intelligence (AI) technologies have enabled large-scale acquisition and analysis of dynamic spatio-temporal information. These developments have led to the emergence of Spatio-Temporal Intelligence (STI), an interdisciplinary field that integrates spatio-temporal data with AI-driven computational methodologies to model, interpret, and manage complex physical, environmental, and social processes.</div><div>In this paper, we offer a perspective on the mission and evolving scope of STI and identify its critical challenges. A general STI framework comprising five interconnected components is proposed to support adaptive observation, multi-modal modeling, causal reasoning, and knowledge-driven service delivery. Through a case study in national park ecological monitoring, we demonstrate how STI enables large-scale, precise, and real-time environmental understanding. 
Distinct from approaches that simulate symbolic or linguistic cognition, STI is grounded in the physical world and leverages high-dimensional sensor data to enable machine perception, foster new cognitive paradigms, and enhance decision-making across domains.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 305-318"},"PeriodicalIF":12.2,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145840694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-20 | DOI: 10.1016/j.isprsjprs.2025.11.021
Xiaochen Wang, Xinlian Liang
Terrestrial laser scanning (TLS) has proven to be an effective tool for forest inventories thanks to its accurate, non-destructive documentation of 3D spatial structure. The multi-scan mode of TLS enables comprehensive data acquisition, but the point cloud of each scan must be aligned to a common coordinate frame. In practice, the most common solution is to manually place artificial markers in the field, which is time-consuming and labor-intensive; automated multi-scan registration is therefore highly desirable for subsequent applications. This study presents HashReg, an automated TLS multi-scan registration algorithm for forest point clouds that exploits the high-efficiency operations of a hash table. HashReg comprises four key procedures: stem mapping, coarse transformation parameter estimation, factor graph optimization, and fine-tuned registration. Using the optimized transformation parameters, the global poses of individual TLS scans are then determined within a unified coordinate system through a depth-first strategy. Extensive experiments were performed on four datasets with diverse forest characteristics, including dense and sparse stems, flat and undulating terrain, and natural and plantation forests. The results demonstrate that HashReg achieves milliradian-level rotation accuracy and centimeter-level translation accuracy, i.e., 0–3 mrad and 0–3 cm for most plots, respectively. A second evaluation metric, the point-wise upper-bound error, is reported to show how point discrepancy varies with distance. For most plots these errors remained within the centimeter range, i.e., 1–4 cm, 1–5 cm, and 2–7 cm at distances of 5 m, 10 m, and 20 m, respectively. The efficiency of HashReg's four key procedures was also assessed.
Coarse registration and global optimization run at the millisecond level (4 ms and 6 ms), while stem mapping and fine registration run at the second level (3 s and 15 s). A quantitative comparison with four state-of-the-art (SOTA) point cloud registration approaches, including FMP + BnB, HL-MRF, GlobalMatch, and SGHR, was conducted on three public datasets. HashReg achieves superior accuracy, ranking first or second across all plots with a 100 % registration success rate, and substantially higher efficiency, with runtime improvements exceeding two-fold relative to the SOTA methods. These advantages demonstrate that HashReg can bridge the gap between raw data and practical applications. The implementation of HashReg is open-sourced at https://github.com/MSpace-WHU/Forest_TLS_Reg.
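The hash-table idea at the core of HashReg can be illustrated with a minimal sketch: quantizing 2D stem positions into grid cells keyed in a dictionary gives constant-time lookup of candidate stem correspondences between scans. The function names, cell size, and 3×3 neighborhood query below are illustrative assumptions, not the authors' implementation.

```python
import math

def build_stem_hash(stems, cell=0.5):
    """Quantize 2D stem positions (x, y) into a dict keyed by grid cell.

    Hypothetical sketch: HashReg's actual data structures may differ.
    """
    table = {}
    for i, (x, y) in enumerate(stems):
        key = (int(math.floor(x / cell)), int(math.floor(y / cell)))
        table.setdefault(key, []).append(i)
    return table

def query_neighbors(table, x, y, cell=0.5):
    """Return candidate stem indices from the 3x3 cells around (x, y).

    Each lookup is O(1), which is what makes hash-based matching fast.
    """
    cx, cy = int(math.floor(x / cell)), int(math.floor(y / cell))
    out = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            out.extend(table.get((cx + dx, cy + dy), []))
    return out
```

Candidate pairs found this way would still need a robust transformation estimate (e.g., within a RANSAC loop) before fine registration.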
{"title":"Automated TLS multi-scan registration in forest environments: A novel solution based on hash table","authors":"Xiaochen Wang, Xinlian Liang","doi":"10.1016/j.isprsjprs.2025.11.021","DOIUrl":"10.1016/j.isprsjprs.2025.11.021","url":null,"abstract":"<div><div>Terrestrial laser scanning (TLS) has proven to be an effective tool for forest inventories due to its accurate, non-destructive capability to document 3D space structures. The multi-scan mode of TLS enables comprehensive data acquisition, but the point cloud of each scan must be aligned to a common coordinate frame. In practice, the most common solution involves manually placing artificial markers in the field, which is time-consuming and labor-intensive. Consequently, the automated multi-scan registration method is highly appreciated for subsequent applications. This study presents an automated TLS multi-scan registration algorithm for forest point clouds, HashReg, utilizing the high-efficiency operations of Hash Table. HashReg comprises four key procedures, including stem mapping, estimating coarse transformation parameters, factor graph optimization, and fine-tuned registration. Using optimized transformation parameters, the global poses of individual TLS scans are subsequently determined within a unified coordinate system through a depth-first strategy. Extensive experiments were performed on four datasets with diverse forest characteristics, such as dense and sparse stems, flat and undulating terrain, and natural and plantation forests. The experimental results demonstrate that HashReg achieves milliradian-level rotation accuracy and centimeter-level translation accuracy, i.e., 0–3 mrad and 0–3 cm for most of the plots, respectively. Another evaluation metric, the point-wise upper bound errors, is reported to show the variation of point discrepancy with increasing distance. 
For most plots, these errors remained within the centimeter range, i.e., 1–4 cm, 1–5 cm, and 2–7 cm for the distance at 5 m, 10 m, and 20 m respectively. Moreover, the efficiency of HashReg’s four key procedures was also assessed. The running time of coarse registration and global optimization is at the millisecond level, i.e., 4 ms and 6 ms, while the stem mapping and fine registration were at the second level, i.e., 3 s and 15 s. Comparison with four state-of-the-art (SOTA) point cloud registration approaches, including FMP + BnB, HL-MRF, GlobalMatch, and SGHR, was quantitatively conducted on three public datasets. HashReg achieves superior accuracy, i.e., ranking first or second across all plots, with 100 % successful registrations. It also has substantially higher efficiency, with runtime improvements exceeding two-fold relative to the SOTA methods. All these advantages demonstrate that HashReg can bridge the gap between raw data and practical applications. The implementation of HashReg is open-sourced at <span><span>https://github.com/MSpace-WHU/Forest_TLS_Reg</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 280-304"},"PeriodicalIF":12.2,"publicationDate":"2025-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145785301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-19 | DOI: 10.1016/j.isprsjprs.2025.12.006
Yuhao Zhang, Jieru Chi, Guowei Yang, Chenglizhao Chen, Teng Yu
Notable progress has been made in small ship target detection for synthetic aperture radar (SAR) imagery, driven by three key methodological innovations within the deep learning framework: self-supervision combined with knowledge distillation, rotated bounding box detection, and multi-scale feature fusion. Nevertheless, the task still faces challenges such as high speckle noise in SAR images, difficulty in extracting small-target features, geometric distortion of ship shapes, and heading dependence. This article therefore proposes a new model, SAR-NanoShipNet. To better target ship objects, the method replaces standard convolution with a specialized convolution (DABConv) that is better suited to ship targets. Unlike traditional SAR target detection approaches, which typically cannot adaptively capture the irregular boundaries and low-contrast features of small ship targets, this method captures these features adaptively through deformable convolutions and boundary attention mechanisms, improving target localization accuracy. In addition, we introduce the VerticalCompSPPF module (VC-SPPF), which combines longitudinal multi-scale convolution with a channel attention mechanism. Finally, the D-CLEM design is coupled with DABConv to enhance directional feature extraction and fusion, improving small-object detection accuracy. We validate the superiority of our method on five datasets, particularly for high-precision detection of small targets (AP_s up by 2.66 %). Our code can be found at https://github.com/Z-Yuhao/1.git.
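The AP_s figure refers to average precision computed over small targets. As a rough illustration of the underlying metric (not the authors' evaluation code), AP can be computed from a ranked list of detections as the area under the precision–recall curve; matching of detections to ground truth is assumed to have been done already, and this sketch uses simple right-continuous interpolation rather than the COCO 101-point variant.

```python
def average_precision(scores, is_tp, num_gt):
    """AP from ranked detections: area under the precision-recall curve.

    scores: confidence per detection; is_tp: whether it matched a ground
    truth box; num_gt: total number of ground-truth objects.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    pr = []  # (precision, recall) after each detection, best-first
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        pr.append((tp / (tp + fp), tp / num_gt))
    ap, prev_recall = 0.0, 0.0
    for precision, recall in pr:
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap
```

Restricting the detections and ground truths to small-area objects before calling this function yields an AP_s-style score.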
{"title":"SAR-NanoShipNet: A scale-adaptive network for robust small ship detection in SAR imagery","authors":"Yuhao Zhang , Jieru Chi , Guowei Yang , Chenglizhao Chen , Teng Yu","doi":"10.1016/j.isprsjprs.2025.12.006","DOIUrl":"10.1016/j.isprsjprs.2025.12.006","url":null,"abstract":"<div><div>Currently, notable progress has been attained in small ship target detection for synthetic aperture radar (SAR) imagery, with such advancements being driven by three key methodological innovations within the deep learning framework: self-supervision combined with knowledge distillation, rotated bounding box detection, and multi-scale feature fusion. However, it still faces challenges such as high speckle noise in SAR images, difficulty in extracting small target features, geometric distortion of ship shapes and heading dependence. Therefore, this article proposes a new SAR-NanoShipNet model. To enhance the targeting of ship objects, the proposed method employs a specialized convolution (DABConv) that exhibits greater suitability for ship targets, replacing the conventional standard convolution. As opposed to traditional approaches for SAR target detection, which typically lack the capability to adaptively capture the irregular boundaries and low-contrast features of small ship targets in SAR images, this method pioneers the adaptive capture of these features through deformable convolutions and boundary attention mechanisms, leading to enhanced target localization accuracy. In addition, we introduce the VerticalCompSPPF module (VC-SPPF), which incorporates longitudinal multi-scale convolution alongside a channel attention mechanism. Finally, the design of D-CLEM is linked with DABConv to enhance directional feature extraction while also fusing, improving the accuracy of small object detection. 
We have validated the superiority of our method on five datasets, particularly for high precision detection of small targets (AP<span><math><msub><mrow></mrow><mrow><mi>s</mi></mrow></msub></math></span> <span><math><mi>↑</mi></math></span>2.66%). Our code can be found at <span><span>https://github.com/Z-Yuhao/1.git</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 262-279"},"PeriodicalIF":12.2,"publicationDate":"2025-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145785304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-18 | DOI: 10.1016/j.isprsjprs.2025.11.034
Zhenyu Zhang, Yuan Li, Feihu Zhu, Yuechao Ma, Junying Lv, Qian Sun, Lin Li, Wuming Zhang
While single-photon LiDAR promises revolutionary depth sensing capabilities, existing deep learning frameworks fundamentally fail to overcome the challenge of high-spatial-resolution (SR) data processing. To address the amplification of fine geometric details and the complex spatiotemporal dependencies in high-SR single-photon data, we adopt a U-Net++ backbone with dense skip connections to preserve high-frequency features. Our encoder cascades two novel modules that integrate attention-driven modulation and convolution to adaptively model intricate patterns without sacrificing detail. We propose a 3D triple local-attention fusion module (3D-TriLAF) to suppress incoherent responses across the temporal, spatial, and channel axes. In parallel, an opposite continuous dilation spatial–temporal convolution module (OCDSConv) is designed to extract structured context while preserving transient cues. To alleviate the misalignment and semantic drift between low- and high-level features, problems exacerbated by increased resolution, we design a multi-scale fusion mechanism that enables consistent geometric modeling across scales. Finally, we propose a hybrid loss combining ordinal regression (OR) loss, structural similarity index measure (SSIM) loss, and bilateral total variation (BTV) loss to jointly enhance peak localization, structural fidelity, and edge-aware smoothness. Extensive experiments on two 128 × 128 SR simulated datasets show that, compared with the best baseline, our framework reduces RMSE and Abs Rel by up to 60.00 % and 31.58 %. On two (200+)×(200+) SR real-world datasets, RMSE and Abs Rel drop by 42.31 % and 39.44 %. These quantitative gains, together with visual improvements in geometric continuity under complex lighting, confirm the framework's suitability for fine-grained high-SR single-photon depth reconstruction.
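For reference, the two reported error metrics, RMSE and Abs Rel, are standard in depth estimation. A simplified sketch over flattened depth values (illustrative only, not the authors' evaluation code):

```python
import math

def rmse(pred, gt):
    """Root-mean-square error between predicted and reference depths."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt))

def abs_rel(pred, gt):
    """Mean absolute relative error: |pred - gt| / gt, averaged over pixels."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)
```

A "reduction of RMSE by 60 %" then simply means the new method's rmse() is 40 % of the baseline's on the same test set.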
{"title":"Towards High spatial resolution and fine-grained fidelity depth reconstruction of single-photon LiDAR with context-aware spatiotemporal modeling","authors":"Zhenyu Zhang , Yuan Li , Feihu Zhu , Yuechao Ma , Junying Lv , Qian Sun , Lin Li , Wuming Zhang","doi":"10.1016/j.isprsjprs.2025.11.034","DOIUrl":"10.1016/j.isprsjprs.2025.11.034","url":null,"abstract":"<div><div>While single-photon LiDAR promises revolutionary depth sensing capabilities, existing deep learning frameworks fundamentally fail to overcome the challenge of high spatial resolution (SR) data processing. To address the amplification of fine geometric details and complex spatiotemporal dependencies in high-SR single-photon data, we adopt a U-Net++ backbone with dense skip connections to preserve high-frequency features. Our encoder cascades two novel modules, integrating attention-driven modulation and convolution to adaptively model intricate patterns without sacrificing detail. We propose a 3D triple local-attention fusion module (3D-TriLAF) to suppress incoherent responses across temporal, spatial, and channel axes. In parallel, an opposite continuous dilation spatial–temporal convolution module (OCDSConv) is designed to extract structured context while preserving transient cues. To alleviate the misalignment and semantic drift between low and high-level features—problems exacerbated by increased resolution—we design a multi-scale fusion mechanism that facilitates consistent geometric modeling across scales. Finally, we propose a hybrid loss combining ordinal regression (OR) loss, structural similarity index measure (SSIM) loss, and bilateral total variation (BTV) loss to jointly enhances peak localization, structural fidelity, and edge-aware smoothness. Extensive experiments on two 128 × 128 SR simulated datasets show that, compared with the best baseline, our framework reduces RMSE and Abs Rel by up to 60.00 % and 31.58 %. 
On two (200 + )×(200 + ) SR real-world datasets, RMSE and Abs Rel drop by 42.31 % and 39.44 %. These quantitative gains and visual improvements in geometric continuity under complex lighting confirm its suitability for fine-grained high-SR single-photon depth reconstruction.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 238-261"},"PeriodicalIF":12.2,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145785309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-15 | DOI: 10.1016/j.isprsjprs.2025.12.005
Xiaosong Ding, Xianqiang He, Hemant Khatri, Jiajia Li, Feng Ye, Hao Li, Min Zhao, Fang Gong
Submesoscale processes (length scales of 0.1–10 km) play a critical role in oceanic energy dissipation and mass transport, yet traditional satellite altimetry data, with coarse spatial (∼25 km) and temporal (daily) resolution, are insufficient to resolve their dynamics. Geostationary ocean-color imagers such as GOCI-II observe hourly at 250 m resolution, but the filament-rich evolution of surface tracers complicates current inference from image sequences. Here we develop BAPDE-RAFT, a boundary-aware, Poisson-based, near-divergence-free-constrained extension of the RAFT optical-flow network, to retrieve pixel-scale sea surface velocities from consecutive images, effectively estimating a physically regularized, nondivergent (rotational) component of the surface flow. Compared with the standard maximum cross-correlation (MCC) approach, BAPDE-RAFT lowers end-point and angular errors by 44 % and 38 %, respectively. Wavenumber analysis places the critical scale at which model error exceeds signal at λc ≈ 4.0 km, over an order of magnitude finer than the 60–80 km limits of traditional MCC algorithms, confirming that only BAPDE-RAFT retains spectral power throughout the 1–10 km submesoscale range. When applied to hourly GOCI-II chlorophyll-a images of the Japan Sea/East Sea, the model reproduces diurnal current variability and the expected dual cascade: an upscale kinetic-energy flux (∼k⁻³) and a downscale tracer cascade (∼k⁻¹). We note that the near-divergence-free constraint may damp strongly convergent/divergent ageostrophic motions; nevertheless, despite being affected by cloud coverage, these results demonstrate that high-cadence geostationary ocean color observations can yield physically consistent maps of fine-scale surface currents, opening new avenues for satellite studies of ocean dynamics and mass transport at fine scale.
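The two reported error measures can be sketched per velocity vector as follows. This is a simplified 2D formulation (some classical angular-error definitions augment the vectors with a third component), offered as an illustration rather than the authors' exact computation.

```python
import math

def endpoint_error(u_est, v_est, u_ref, v_ref):
    """End-point error: Euclidean distance between estimated and
    reference velocity vectors (same units as the velocities)."""
    return math.hypot(u_est - u_ref, v_est - v_ref)

def angular_error_deg(u_est, v_est, u_ref, v_ref):
    """Angle between the two 2D velocity vectors, in degrees."""
    dot = u_est * u_ref + v_est * v_ref
    n1 = math.hypot(u_est, v_est)
    n2 = math.hypot(u_ref, v_ref)
    # clamp to [-1, 1] to guard against floating-point overshoot
    c = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(c))
```

Averaging these over all valid (cloud-free) pixels gives the scene-level scores that the 44 % and 38 % reductions refer to.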
{"title":"Physically-constrained flow learning reveals diurnal submesoscale surface currents from geostationary satellite observations","authors":"Xiaosong Ding , Xianqiang He , Hemant Khatri , Jiajia Li , Feng Ye , Hao Li , Min Zhao , Fang Gong","doi":"10.1016/j.isprsjprs.2025.12.005","DOIUrl":"10.1016/j.isprsjprs.2025.12.005","url":null,"abstract":"<div><div>Submesoscale processes (length scales of 0.1–10 km) play a critical role in oceanic energy dissipation and mass transport, yet traditional satellite altimetry data—with coarse spatial (∼25 km) and temporal (daily) resolution—are insufficient to resolve its dynamics. Geostationary satellite ocean-color imagers such as GOCI-II can observe hourly at 250 m resolution, but the filament-rich evolution of surface tracers complicates current inference from image sequences. Here, we develop the BAPDE-RAFT, a boundary-aware, Poisson-based, near-divergence-free–constrained extension of the RAFT optical-flow network, to retrieve pixel-scale sea surface velocities from consecutive images, effectively estimating a physically regularized, nondivergent (rotational) component of the surface flow. Compared with the standard maximum cross-correlation (MCC) approach, BAPDE-RAFT lowers end-point and angular errors by 44 % and 38 %, respectively. Wavenumber analysis places the critical scale at which model error exceeds signal at λc ≈ 4.0 km—over an order of magnitude finer than the 60–80 km limits of traditional MCC algorithm, confirming that only BAPDE-RAFT retains spectral power throughout the 1–10 km sub-mesoscale processes. When applied to hourly GOCI-II chlorophyll-a images in the Japan Sea/East Sea, the model reproduces diurnal current variability and the expected dual cascade: an upscale kinetic-energy flux (<em>∼k</em><sup>−</sup><em><sup>3</sup></em>) and a downscale tracer cascade (<em>∼k</em><sup>−</sup><em><sup>1</sup></em>). 
We note that the near-divergence-free constraint may damp strongly convergent/divergent ageostrophic motions; nevertheless, despite being affected by cloud coverage, these results demonstrate that high-cadence geostationary satellite ocean color observations can yield physically consistent maps of fine-scale surface currents, opening new avenues for satellite studies of ocean dynamics and mass transport at fine scale.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 223-237"},"PeriodicalIF":12.2,"publicationDate":"2025-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145753418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-10 | DOI: 10.1016/j.isprsjprs.2025.11.035
Chunli Wang, Kaiyuan Li, Wenfeng Zhan, Long Li, Chenguang Wang, Shasha Wang, Sida Jiang, Shuang Ge, Zihan Liu
Understanding how the cooling efficiency (CE) trends of tree cover respond to background climate change and local urbanization is essential for optimizing urban greening strategies. However, previous studies have largely focused on city-wide CE trends, often for a limited number of cities or within a limited geographic scope. Consequently, the spatial heterogeneity in CE trends across urban–rural gradients, and its underlying drivers, remains poorly understood, especially across cities worldwide. Here we quantified summer daytime CE trends from 2003 to 2020 across six urban–rural gradients (i.e., urban cores, new towns, urban fringes, suburbs, rural fringes, and rural backgrounds) in more than 5,000 global cities, using MODIS-derived land surface temperature and tree cover data. We observed an inverse-V shape in CE trends along the urban–rural gradient, with peak values in urban fringes (0.10 ± 0.01 °C/%/century, mean ± one standard error), followed by new towns, cores, and the three rural gradients. CE trends were generally positive across most climate zones, yet arid regions exhibited a decline (−0.06 ± 0.06 °C/%/century). CE trends strengthened with increasing city size in urban fringes, yet weakened in cores and new towns. Using a LightGBM-SHAP algorithm, we found that macro-scale background climate dominated CE trends in urban areas (43 %), whereas micro-scale local surface properties were the primary contributors in rural areas (48 %). Our findings provide critical insights into the spatial heterogeneity of CE trends of urban tree cover at the global scale.
{"title":"Urban–rural gradients in cooling efficiency trends of tree covers across global cities","authors":"Chunli Wang , Kaiyuan Li , Wenfeng Zhan , Long Li , Chenguang Wang , Shasha Wang , Sida Jiang , Shuang Ge , Zihan Liu","doi":"10.1016/j.isprsjprs.2025.11.035","DOIUrl":"10.1016/j.isprsjprs.2025.11.035","url":null,"abstract":"<div><div>Understanding how cooling efficiency (CE) trends of tree covers respond to background climate change and local urbanization is essential for optimizing urban greening strategies. However, previous studies have largely focused on city-wide CE trends, often with a limited number of cities or with limited geographic scope. Consequently, the spatial heterogeneity in CE trends across urban–rural gradients, and their underlying drivers remain poorly understood, especially across cities worldwide. Here we quantified summer daytime CE trends from 2003 to 2020 across six urban–rural gradients (i.e., urban cores, new towns, urban fringes, suburbs, rural fringes, and rural backgrounds) in more than 5,000 global cities, utilizing MODIS-derived land surface temperature and tree cover data. We observed an inverse-V shape in CE trends along urban–rural gradient, with peak values in urban fringes (0.10 ± 0.01 °C/%/century, mean ± one standard error), followed by new towns, cores, and three rural gradients. The trends in CE were generally positive across most climate zones, yet arid regions exhibited a decline (−0.06 ± 0.06 °C/%/century). CE trends strengthened with increasing city size in urban fringes, yet they decreased in cores and new towns. Using a LightGBM-SHAP algorithm, we found that the macro-scale background climate dominated the CE trends in urban areas (43%), whereas micro-scale local surface properties emerged as the primary contributors in rural areas (48%). 
Our findings provide critical insights into the spatial heterogeneity of CE trends of urban tree covers at a global scale.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 210-222"},"PeriodicalIF":12.2,"publicationDate":"2025-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145732136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-10 | DOI: 10.1016/j.isprsjprs.2025.11.033
Emma De Clerck, Pablo Reyes-Muñoz, Egor Prikaziuk, Dávid D. Kovács, Jochem Verrelst
High-spatial-resolution gross primary production (GPP) estimation is critical for local carbon monitoring, especially in heterogeneous landscapes where global products lack spatial detail. We present a hybrid modeling framework that estimates GPP using Sentinel-2 (S2) reflectance and Bayesian Gaussian Process Regression (GPR), chosen for its robustness with limited data and its ability to quantify uncertainty. GPR models were trained on SCOPE (Soil Canopy Observation of Photosynthesis and Energy fluxes) radiative transfer model (RTM) simulations and optimized via active learning (AL) across 10 plant functional types (PFTs). These lightweight, PFT-specific S2-GPR models were implemented in Google Earth Engine (GEE) to enable scalable, reproducible, and accessible GPP estimation and mapping. The predictive performance of the S2-GPR models was evaluated using data from 67 eddy covariance flux towers across Europe. Data from 2017–2020 were used for training and training-database optimization, while 2021–2024 data served as independent validation. Strong predictive performance was achieved in wetlands (R=0.84, NRMSE=12.6%), savannas (R=0.81, NRMSE=12.2%), and deciduous broadleaf forests (R=0.81, NRMSE=14.3%). Moderate accuracy was observed for croplands, shrublands, grasslands, and mixed forests (R=0.67–0.77), with lower accuracy in evergreen broadleaf (R=0.07) and needleleaf forests (R=0.33). Compared to MODIS GPP (MOD17A2H V6.1), the S2-GPR models showed consistently lower bias and comparable or improved accuracy in most PFTs, except evergreen forests. Additional validation against AmeriFlux sites in North America demonstrated that the models retain predictive power beyond the ICOS network, though ecosystem-specific and regional differences can influence accuracy.
The inclusion of coarse-resolution meteorological variables (temperature, radiation, vapor pressure deficit, air pressure) was evaluated but generally did not improve predictive performance and introduced additional uncertainty, highlighting that in this study S2 spectral information alone provides the dominant signal for high-resolution GPP estimation. These findings underscore the value of integrating SCOPE modeling and AL-optimized GPR for accurate, local-scale GPP mapping using cloud-based S2 data, complementing coarse-resolution global products.
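The core of the hybrid approach, Gaussian Process Regression, can be sketched in a few lines for the 1D case: the posterior mean at a query point is a kernel-weighted combination of the training targets. The RBF kernel, noise level, and tiny linear solver below are illustrative assumptions; the study's models are trained on SCOPE simulations with AL-optimized training sets and operate on multi-band S2 reflectance, not 1D inputs.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential (RBF) covariance between two scalar inputs."""
    return math.exp(-(a - b) ** 2 / (2.0 * length ** 2))

def solve(A, b):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gpr_mean(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean of a zero-mean GP with an RBF kernel:
    m(x*) = k*^T (K + noise*I)^{-1} y."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, y_train)
    return sum(alpha[i] * rbf(x_query, x_train[i]) for i in range(n))
```

With near-zero noise the GP interpolates its training points, which is why such models can be compact yet accurate when the training set is chosen well (the role of active learning here).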
{"title":"High-spatial-resolution gross primary production estimation from Sentinel-2 reflectance using hybrid Gaussian processes modeling","authors":"Emma De Clerck , Pablo Reyes-Muñoz , Egor Prikaziuk , Dávid D.Kovács , Jochem Verrelst","doi":"10.1016/j.isprsjprs.2025.11.033","DOIUrl":"10.1016/j.isprsjprs.2025.11.033","url":null,"abstract":"<div><div>High-spatial-resolution gross primary production (GPP) estimation is critical for local carbon monitoring, especially in heterogeneous landscapes where global products lack spatial detail. We present a hybrid modeling framework that estimates GPP using Sentinel-2 (S2) reflectance and Bayesian Gaussian Process Regression (GPR), chosen for its robustness with limited data and its ability to quantify uncertainty. GPR models were trained using SCOPE (Soil Canopy Observation of Photosynthesis and Energy fluxes) radiative transfer model (RTM) simulations and optimized via active learning (AL) across 10 plant functional types (PFTs). These lightweight, PFT-specific S2-GPR models were implemented in Google Earth Engine (GEE) to enable scalable, reproducible, and accessible GPP estimation and mapping. S2-GPR models predictive performances were evaluated using data from 67 eddy covariance flux towers across Europe. Data from 2017–2020 were used for training and training database optimization, while 2021–2024 data served as independent validation. Strong predictive performance was achieved in wetlands (R=0.84, NRMSE=12.6%), savannas (R=0.81, NRMSE=12.2%), and deciduous broadleaf forests (R=0.81, NRMSE=14.3%). Moderate accuracy was observed for croplands, shrublands, grasslands, and mixed forests (R=0.67–0.77), with lower accuracy in evergreen broadleaf (R=0.07) and needleleaf forests (R=0.33). Compared to MODIS GPP (MOD17A2H V6.1), the S2-GPR models showed consistently lower bias and comparable or improved accuracy in most PFTs, except evergreen forests. 
Additional validation against AmeriFlux sites in North America demonstrated that the models retain predictive power beyond the ICOS network, though ecosystem-specific and regional differences can influence accuracy. The inclusion of coarse-resolution meteorological variables (temperature, radiation, vapor pressure deficit, air pressure) was evaluated but generally did not improve predictive performance and introduced additional uncertainty, highlighting that in this study S2 spectral information alone provides the dominant signal for high-resolution GPP estimation. These findings underscore the value of integrating SCOPE modeling and AL-optimized GPR for accurate, local-scale GPP mapping using cloud-based S2 data, complementing coarse-resolution global products.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 172-195"},"PeriodicalIF":12.2,"publicationDate":"2025-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145732671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-10 DOI: 10.1016/j.isprsjprs.2025.12.003
Rejane S. Paulino , Vitor S. Martins , Cassia B. Caballero , Thainara M.A. Lima , Bingqing Liu , Akash Ashapure , Jeremy Werdell
NASA’s PACE (Plankton, Aerosol, Cloud, ocean Ecosystem) is a satellite mission launched in February 2024, featuring the hyperspectral Ocean Color Instrument (OCI). One of the key products released to the scientific community from PACE is spectral remote sensing reflectance (Rrs(λ)). Rrs(λ) is critical for estimating bio-optical and biogeochemical properties in aquatic systems, particularly concerning the presence of algal pigments, suspended particulate matter, and colored dissolved organic matter. PACE-OCI’s hyperspectral capabilities address the limitations of prior sensors, providing enhanced spectral discrimination for these properties, especially in optically complex waters. The provisional PACE-OCI Rrs product is especially crucial for generating global aquatic products and supporting multifaceted research in ocean and coastal systems, which has generated significant interest in understanding its quality and potential applications. This study provides a preliminary validation of the provisional PACE-OCI Rrs product (V3.1) using 15 globally distributed AERONET-OC (Aerosol Robotic Network-Ocean Color) stations. A total of 895 match-up observations between PACE-OCI and AERONET-OC (March 2024 to September 2025) were analyzed across eight wavelengths (400–667 nm) and 20 distinct optical water types. Results indicate overall consistency of PACE-OCI Rrs, with a median symmetric accuracy (ε) of approximately 22.6 % and a symmetric signed percentage bias (β) of +6.5 %. For clear waters, the product performed well at wavelengths between 400–560 nm (average ε of 17.2 %) and achieved the best accuracy at longer wavelengths (490–667 nm) for waters with moderate to high optical complexity (average ε of 16.3 %). However, these spectral distortions were more pronounced in waters with high optical complexity compared to those with low or moderate optical complexity.
These findings highlight the quality of PACE-OCI’s provisional product to support aquatic applications and bring insights for future improvements of this Rrs(λ) product.
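The two validation statistics quoted above can be computed from log-ratios of matched satellite and in-situ values. The sketch below uses the standard definitions of median symmetric accuracy (ε) and symmetric signed percentage bias (β) from Morley et al. (2018), which the abstract's notation suggests; the toy match-up values are invented for illustration.

```python
import numpy as np

def median_symmetric_accuracy(obs, pred):
    """Median symmetric accuracy epsilon, in percent."""
    q = np.log(np.asarray(pred) / np.asarray(obs))
    return 100.0 * (np.exp(np.median(np.abs(q))) - 1.0)

def symmetric_signed_bias(obs, pred):
    """Symmetric signed percentage bias beta, in percent."""
    q = np.log(np.asarray(pred) / np.asarray(obs))
    m = np.median(q)
    return 100.0 * np.sign(m) * (np.exp(np.abs(m)) - 1.0)

# Toy match-ups: satellite Rrs ~10% above in-situ on the median
obs = np.array([0.004, 0.006, 0.010, 0.015])
pred = obs * np.array([1.10, 1.10, 1.10, 0.95])
eps = median_symmetric_accuracy(obs, pred)
beta = symmetric_signed_bias(obs, pred)
print(round(eps, 1), round(beta, 1))  # 10.0 10.0
```

Both metrics are symmetric under swapping observation and prediction (up to the sign of β), which makes them robust for skewed, strictly positive quantities like Rrs.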
{"title":"PACE (Plankton, Aerosol, Cloud, ocean Ecosystem): Preliminary analysis of the consistency of remote sensing reflectance product over aquatic systems","authors":"Rejane S. Paulino , Vitor S. Martins , Cassia B. Caballero , Thainara M.A. Lima , Bingqing Liu , Akash Ashapure , Jeremy Werdell","doi":"10.1016/j.isprsjprs.2025.12.003","DOIUrl":"10.1016/j.isprsjprs.2025.12.003","url":null,"abstract":"<div><div>NASA’s PACE (Plankton, Aerosol, Cloud, ocean Ecosystem) is a satellite mission launched in February 2024, featuring the hyperspectral Ocean Color Instrument (OCI). One of the key products released to the scientific community from PACE is spectral remote sensing reflectance (R<sub>rs</sub>(λ)). R<sub>rs</sub>(λ) is critical for estimating bio-optical and biogeochemical properties in aquatic systems, particularly concerning the presence of algal pigments, suspended particulate matter, and colored dissolved organic matter. PACE-OCI’s hyperspectral capabilities address the limitations of prior sensors, providing enhanced spectral discrimination for these properties, especially in optically complex waters. The provisional PACE-OCI R<sub>rs</sub> product is especially crucial for generating global aquatic products and supporting multifaceted research in ocean and coastal systems, which has generated significant interest in understanding its quality and potential applications. This study provides a preliminary validation of the provisional PACE-OCI R<sub>rs</sub> product (V3.1) using 15 globally distributed AERONET-OC (Aerosol Robotic Network-Ocean Color) stations. A total of 895 match-up observations between PACE-OCI and AERONET-OC (March 2024 to September 2025) were analyzed across eight wavelengths (400 – 667 nm) and 20 distinct optical water types. 
Results indicate overall consistency of PACE-OCI R<sub>rs</sub>, with a median symmetric accuracy (<em>ε</em>) of approximately 22.6 % and a symmetric signed percentage bias (<span><math><mrow><mi>β</mi></mrow></math></span>) of + 6.5 %. For clear waters, the product performed well at wavelengths between 400 – 560 nm (average <em>ε</em> of 17.2 %) and achieved the best accuracy at longer wavelengths (490 – 667 nm) for waters with moderate to high optical complexity (average <em>ε</em> of 16.3 %). However, these spectral distortions were more pronounced in waters with high optical complexity compared to those with low or moderate optical complexity. These findings highlight the quality of PACE-OCI’s provisional product to support aquatic applications and bring insights for future improvements of this R<sub>rs</sub>(λ) product.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 196-209"},"PeriodicalIF":12.2,"publicationDate":"2025-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145731950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-08 DOI: 10.1016/j.isprsjprs.2025.12.004
Jian Song , Hongruixuan Chen , Naoto Yokoya
Monocular height estimation (MHE) from very-high-resolution (VHR) remote sensing imagery via deep learning is notoriously challenging due to the lack of sufficient structural information. Conventional digital elevation models (DEMs), typically derived from airborne LiDAR or multi-view stereo, remain costly and geographically limited. While state-of-the-art monocular height estimation (MHE) and depth estimation (MDE) models show great promise, their robustness under varied illumination conditions remains a significant challenge. To address this, we introduce a novel and fully automated correction pipeline that integrates sparse, imperfect global LiDAR measurements (ICESat-2) with deep learning outputs to enhance local accuracy and robustness. Importantly, the entire workflow is fully automated and built solely on publicly available models and datasets, requiring only a single georeferenced optical image to generate corrected height maps, thereby ensuring unprecedented accessibility and global scalability. Furthermore, we establish the first comprehensive benchmark for this task, evaluating a suite of correction methods that includes two random forest-based approaches, four parameter-efficient fine-tuning techniques, and full fine-tuning. We conduct extensive experiments across six large-scale, diverse regions at 0.5 m resolution, totaling approximately 297 km², encompassing the urban cores of Tokyo, Paris, and São Paulo, as well as mixed suburban and forest landscapes. Experimental results demonstrate that the best-performing correction method reduces the MHE model’s mean absolute error (MAE) by an average of 30.9% and improves its F1-HE score by 44.2%. For the MDE model, the MAE is improved by 24.1% and the F1-HE score by 25.1%.
These findings validate the effectiveness of our correction pipeline, demonstrating how sparse real-world LiDAR data can systematically bolster the robustness of both MHE and MDE models and paving the way for scalable, low-cost, and globally applicable 3D mapping solutions.
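The random-forest branch of such a correction pipeline amounts to regressing the model's residual at the sparse LiDAR footprints and applying the learned correction everywhere. The sketch below is a toy version under stated assumptions: the features, the biased "MHE output", and the 5% "ICESat-2" sampling are all synthetic stand-ins, not the paper's data or feature set.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Illustrative stand-ins: per-pixel features and true heights (hypothetical).
n = 2000
feats = rng.uniform(0, 1, size=(n, 4))
true_h = 30.0 * feats[:, 0] + 5.0 * feats[:, 1]
pred_h = true_h * 0.8 + 3.0  # a systematically biased height-model output

# Sparse LiDAR supervision: residuals observable at ~5% of pixels.
idx = rng.choice(n, size=100, replace=False)
rf = RandomForestRegressor(n_estimators=100, random_state=0)
rf.fit(
    np.column_stack([pred_h[idx, None], feats[idx]]),  # raw height + features
    true_h[idx] - pred_h[idx],                         # residual target
)

# Apply the learned residual correction densely.
corrected = pred_h + rf.predict(np.column_stack([pred_h[:, None], feats]))
mae_before = np.abs(pred_h - true_h).mean()
mae_after = np.abs(corrected - true_h).mean()
print(mae_after < mae_before)  # the correction shrinks MAE on this toy scene
```

The same residual-learning framing carries over to the fine-tuning variants in the benchmark, with the regressor replaced by (partially) re-trained network weights.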
{"title":"Enhancing monocular height estimation via sparse LiDAR-guided correction","authors":"Jian Song , Hongruixuan Chen , Naoto Yokoya","doi":"10.1016/j.isprsjprs.2025.12.004","DOIUrl":"10.1016/j.isprsjprs.2025.12.004","url":null,"abstract":"<div><div>Monocular height estimation (MHE) from very-high-resolution (VHR) remote sensing imagery via deep learning is notoriously challenging due to the lack of sufficient structural information. Conventional digital elevation models (DEMs), typically derived from airborne LiDAR or multi-view stereo, remain costly and geographically limited. While state-of-the-art monocular height estimation (MHE) and depth estimation (MDE) models show great promise, their robustness under varied illumination conditions remains a significant challenge. To address this, we introduce a novel and fully automated correction pipeline that integrates sparse, imperfect global LiDAR measurements (ICESat-2) with deep learning outputs to enhance local accuracy and robustness. Importantly, the entire workflow is fully automated and built solely on publicly available models and datasets, requiring only a single georeferenced optical image to generate corrected height maps, thereby ensuring unprecedented accessibility and global scalability. Furthermore, we establish the first comprehensive benchmark for this task, evaluating a suite of correction methods that includes two random forest-based approaches, four parameter-efficient fine-tuning techniques, and full fine-tuning. We conduct extensive experiments across six large-scale, diverse regions at 0.5<!--> <!-->m resolution, totaling approximately 297<!--> <!-->km<span><math><msup><mrow></mrow><mrow><mn>2</mn></mrow></msup></math></span>, encompassing the urban cores of Tokyo, Paris, and São Paulo, as well as mixed suburban and forest landscapes. 
Experimental results demonstrate that the best-performing correction method reduces the MHE model’s mean absolute error (MAE) by an average of 30.9% and improves its <span><math><msubsup><mrow><mi>F</mi></mrow><mrow><mn>1</mn></mrow><mrow><mtext>HE</mtext></mrow></msubsup></math></span> score by 44.2%. For the MDE model, the MAE is improved by 24.1% and the <span><math><msubsup><mrow><mi>F</mi></mrow><mrow><mn>1</mn></mrow><mrow><mtext>HE</mtext></mrow></msubsup></math></span> score by 25.1%. These findings validate the effectiveness of our correction pipeline, demonstrating how sparse real-world LiDAR data can systematically bolster the robustness of both MHE and MDE models and paving the way for scalable, low-cost, and globally applicable 3D mapping solutions.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 155-171"},"PeriodicalIF":12.2,"publicationDate":"2025-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145704894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-08 DOI: 10.1016/j.isprsjprs.2025.12.001
Jinxin Guo , Weida Zhan , Yu Chen , Depeng Zhu , Yichun Jiang , Xiaoyu Xu , Deng Han
Thermal infrared dense UAV target detection and tracking present significant challenges at both data and algorithmic levels. At the data level, there exists a scarcity of accurately annotated real-world samples coupled with high acquisition costs. At the algorithmic level, the key difficulty lies in addressing frequent identity switches caused by highly dense target clustering, frequent occlusions, and reappearances. To overcome these challenges, this paper proposes an innovative infrared pseudo-sample generation paradigm by designing a physically-driven Heterogeneous Interactive Degradation Model (HIDM). This model simulates real infrared imaging through background-target cooperative degradation mechanisms that account for multiple coupled degradation factors, combined with a random trajectory generation strategy to produce large-scale physically realistic pseudo-sample data, significantly enhancing the domain adaptability of the generated data. Building upon this foundation, we propose a hierarchical fusion-association tracking framework—EnBoT-SORT. This framework employs YOLOv12 as a powerful detector and innovatively incorporates a dynamic target density regulator, a hybrid feature association engine, and a trajectory continuity enhancement module into BoT-SORT, effectively maintaining the continuity and stability of target IDs. Experimental results demonstrate that EnBoT-SORT significantly outperforms existing trackers in intensive UAV motion scenarios, achieving state-of-the-art performance on the IRT-B and IRC-B datasets with HOTA scores of 68.7% and 67.3%, and MOTA scores of 76.2% and 74.6%, respectively. Furthermore, cross-modal experiments on real infrared and visible-light datasets indicate that EnBoT-SORT possesses strong generalization capabilities. This work provides a comprehensive solution for infrared-intensive UAV tracking, spanning from data generation to algorithmic optimization. Our code and datasets are available at GitHub.
{"title":"EnBoT-SORT: Hierarchical fusion-association tracking with pseudo-sample generation for dense thermal infrared UAVs","authors":"Jinxin Guo , Weida Zhan , Yu Chen , Depeng Zhu , Yichun Jiang , Xiaoyu Xu , Deng Han","doi":"10.1016/j.isprsjprs.2025.12.001","DOIUrl":"10.1016/j.isprsjprs.2025.12.001","url":null,"abstract":"<div><div>Thermal infrared dense UAV target detection and tracking present significant challenges at both data and algorithmic levels. At the data level, there exists a scarcity of accurately annotated real-world samples coupled with high acquisition costs. At the algorithmic level, the key difficulty lies in addressing frequent identity switches caused by highly dense target clustering, frequent occlusions, and reappearances. To overcome these challenges, this paper proposes an innovative infrared pseudo-sample generation paradigm by designing a physically-driven Heterogeneous Interactive Degradation Model (HIDM). This model simulates real infrared imaging through background-target cooperative degradation mechanisms that account for multiple coupled degradation factors, combined with a random trajectory generation strategy to produce large-scale physically realistic pseudo-sample data, significantly enhancing the domain adaptability of the generated data. Building upon this foundation, we propose a hierarchical fusion-association tracking framework—EnBoT-SORT. This framework employs YOLOv12 as a powerful detector and innovatively incorporates a dynamic target density regulator, a hybrid feature association engine, and a trajectory continuity enhancement module into BoT-SORT, effectively maintaining the continuity and stability of target IDs. Experimental results demonstrate that EnBoT-SORT significantly outperforms existing trackers in intensive UAV motion scenarios, achieving state-of-the-art performance on the IRT-B and IRC-B datasets with HOTA scores of 68.7% and 67.3%, and MOTA scores of 76.2% and 74.6%, respectively. 
Furthermore, cross-modal experiments on real infrared and visible-light datasets indicate that EnBoT-SORT possesses strong generalization capabilities. This work provides a comprehensive solution for infrared-intensive UAV tracking, spanning from data generation to algorithmic optimization. Our code and datasets are available at <span><span>GitHub</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 138-154"},"PeriodicalIF":12.2,"publicationDate":"2025-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145704860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}