Pub Date: 2026-01-30. DOI: 10.1016/j.isprsjprs.2026.01.037
Qiong Wu , Panwang Xia , Lei Yu , Yi Liu , Mingtao Xiong , Liheng Zhong , Jingdong Chen , Ming Yang , Yongjun Zhang , Yi Wan
Cross-view geo-localization (CVGL) has been widely applied in fields such as robotic navigation and geographic information coupling. Existing approaches primarily use single images or fixed-view image sequences as queries, which limits perspective diversity. In contrast, when humans determine their location visually, they typically move around to gather multiple perspectives. This behavior suggests that integrating diverse visual cues can improve geo-localization reliability. Therefore, we propose a novel task: Cross-View Image Set Geo-Localization (Set-CVGL), which gathers multiple images with diverse perspectives as a query set for localization. To support this task, we introduce SetVL-480K, a benchmark comprising 480,000 ground images captured worldwide and their corresponding satellite images, with each satellite image corresponding to an average of 40 ground images from varied perspectives and locations. Furthermore, we propose FlexGeo, a flexible method designed for Set-CVGL that can also adapt to single-image and image-sequence inputs. FlexGeo includes two key modules: the Similarity-guided Feature Fuser (SFF), which adaptively fuses image features without prior content dependency, and the Individual-level Attributes Learner (IAL), which leverages the geo-attributes of each image for comprehensive scene perception. FlexGeo consistently outperforms existing methods on SetVL-480K and four public datasets (VIGOR, University-1652, SeqGeo, and KITTI-CVL), achieving a 2.34× improvement in localization accuracy on SetVL-480K. The codes and dataset will be available at https://github.com/Mabel0403/Set-CVGL.
{"title":"Set-CVGL: A new perspective on cross-view geo-localization with unordered ground-view image sets","authors":"Qiong Wu , Panwang Xia , Lei Yu , Yi Liu , Mingtao Xiong , Liheng Zhong , Jingdong Chen , Ming Yang , Yongjun Zhang , Yi Wan","doi":"10.1016/j.isprsjprs.2026.01.037","DOIUrl":"10.1016/j.isprsjprs.2026.01.037","url":null,"abstract":"<div><div>Cross-view geo-localization (CVGL) has been widely applied in fields such as robotic navigation and geographic information coupling. Existing approaches primarily use single images or fixed-view image sequences as queries, which limits perspective diversity. In contrast, when humans determine their location visually, they typically move around to gather multiple perspectives. This behavior suggests that integrating diverse visual cues can improve geo-localization reliability. Therefore, we propose a novel task: Cross-View Image Set Geo-Localization (Set-CVGL), which gathers multiple images with diverse perspectives as <strong>a query set</strong> for localization. To support this task, we introduce SetVL-480K, a benchmark comprising 480,000 ground images captured worldwide and their corresponding satellite images, with each satellite image corresponds to an average of 40 ground images from varied perspectives and locations. Furthermore, we propose FlexGeo, a flexible method designed for Set-CVGL that can also adapt to single-image and image-sequence inputs. FlexGeo includes two key modules: the Similarity-guided Feature Fuser (SFF), which adaptively fuses image features without prior content dependency, and the Individual-level Attributes Learner (IAL), leveraging geo-attributes of each image for comprehensive scene perception. FlexGeo consistently outperforms existing methods on SetVL-480K and four public datasets (VIGOR, University-1652, SeqGeo, and KITTI-CVL), achieving a 2.34<span><math><mo>×</mo></math></span> improvement in localization accuracy on SetVL-480K. The codes and dataset will be available at <span><span>https://github.com/Mabel0403/Set-CVGL</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 328-345"},"PeriodicalIF":12.2,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146079952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-30. DOI: 10.1016/j.isprsjprs.2026.01.035
Lizhi Liu, Lijie Huang, Yiding Wang, Pingping Lu, Bo Li, Liang Li, Robert Wang, Yirong Wu
During solar maximum, low-frequency spaceborne Polarimetric Synthetic Aperture Radar (PolSAR) systems suffer ionosphere-induced distortions that couple with system-induced polarimetric distortions. High-precision decoupled polarimetric calibration is therefore essential for obtaining high-fidelity PolSAR data. Existing point-target calibration methods lack a general approach for unbiased estimation of polarimetric distortion across multiple polarimetric modes and calibrator combinations, particularly under spatiotemporally varying ionospheric conditions. To address this, we derive the necessary conditions for unbiased estimation and propose a General Polarimetric Calibration Method (GPCM) applicable to various configurations. In addition, we introduce Enhanced Multi-Look Autofocus (EMLA), a modified inversion method for precise retrieval of Slant Total Electron Content (STEC), enabling estimation of the spatiotemporally varying Faraday rotation angle for system distortion decoupling and PolSAR data compensation. Applied to LuTan-1 HP and QP data, GPCM yields HH/VV amplitude and phase imbalances of 0.0433 dB (STD: 0.017 dB) and −0.60° (STD: 1.02°), respectively, measured on trihedral corner reflectors. Calibration results also indicate that QP-mode isolation exceeds 39 dB, while estimated axial ratios for the HP mode are below 0.115 dB. Under comparable conditions, the results of GPCM are consistent with the Freeman analytical method. Furthermore, EMLA outperforms existing STEC inversion methods (COA, MLA, and GIM-based mapping), achieving a mean absolute difference of 1.95 TECU compared with in-situ measurements while demonstrating applicability to general scenes. Overall, the effectiveness of GPCM and EMLA in the LuTan-1 calibration mission is confirmed, indicating their potential for future PolSAR calibration tasks. The primary calibrated experimental dataset is publicly available at https://radars.ac.cn/web/data/getData?dataType=HPSAREADEN&pageType=en.
{"title":"An advanced decoupled polarimetric calibration method for the LuTan-1 hybrid- and quadrature-polarimetric modes","authors":"Lizhi Liu, Lijie Huang, Yiding Wang, Pingping Lu, Bo Li, Liang Li, Robert Wang, Yirong Wu","doi":"10.1016/j.isprsjprs.2026.01.035","DOIUrl":"10.1016/j.isprsjprs.2026.01.035","url":null,"abstract":"<div><div>During solar maximum, low-frequency spaceborne Polarimetric Synthetic Aperture Radar (PolSAR) systems suffer ionosphere-induced distortions that couple with system-induced polarimetric distortions. High-precision decoupled polarimetric calibration is therefore essential for obtaining high-fidelity PolSAR data. Existing point-target calibration methods lack a general approach for unbiased estimation of polarimetric distortion across multiple polarimetric modes and calibrator combinations, particularly under spatiotemporally varying ionospheric conditions. To address this, we derive the necessary conditions for unbiased estimation and propose a General Polarimetric Calibration Method (GPCM) applicable to various configurations. In addition, Enhanced Multi-Look Autofocus (EMLA), a modified STEC inversion method, is introduced for precise inversion of Slant Total Electron Content (STEC), enabling estimation of the spatiotemporally varying Faraday rotation angle for system distortion decoupling and PolSAR data compensation. GPCM applied to LuTan-1 HP and QP data results in HH/VV amplitude and phase imbalances of 0.0433 dB (STD: 0.017) and − 0.60° (STD: 1.02°), respectively, measured on trihedral corner reflectors. Calibration results also indicate that QP mode isolation exceeds 39 dB, while estimated axial ratios for HP mode are lower than 0.115 dB. Under comparable conditions, the results of GPCM are consistent with the Freeman analytical method. Furthermore, EMLA outperforms existing STEC inversion methods (COA, MLA, and GIM-based mapping), achieving a mean absolute difference of 1.95 TECU compared with in-situ measurements while demonstrating applicability to general scenes. Overall, the effectiveness of GPCM and EMLA in the LuTan-1 calibration mission is confirmed, indicating their potential for future PolSAR calibration tasks. The primary calibrated experimental dataset is publicly available at <span><span>https://radars.ac.cn/web/data/getData?dataType=HPSAREADEN&pageType=en</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 310-327"},"PeriodicalIF":12.2,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146079953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-27. DOI: 10.1016/j.isprsjprs.2026.01.034
Xiaochen Yang , Haiping Wang , Yuan Liu , Bisheng Yang , Zhen Dong
We propose RegScorer, a model that learns to identify the optimal transformation for registering unaligned point clouds. Existing registration methods can generate a set of candidate transformations, which are then evaluated using conventional metrics such as Inlier Ratio (IR), Mean Squared Error (MSE), or Chamfer Distance (CD). The candidate achieving the best score is selected as the final result. However, we argue that these metrics often fail to select the correct transformation, especially in challenging scenarios involving symmetric objects, repetitive structures, or low-overlap regions. This leads to significant degradation in registration performance, a problem that has long been overlooked. The core issue lies in their limited focus on local geometric consistency and their inability to capture two key conflict cases of misalignment: (1) point pairs that are spatially close after alignment but have conflicting features, and (2) point pairs with high feature similarity but large spatial distances after alignment. To address this, we propose RegScorer, which models both the spatial and feature relationships of all point pairs. This allows RegScorer to learn to capture the above conflict cases and provide a more reliable score for transformation quality. On the 3DLoMatch and ScanNet datasets, RegScorer demonstrates 19.3% and 14.1% improvements in registration recall, leading to 4.7% and 5.1% accuracy gains in multiview registration. Moreover, when generalized to symmetric and low-texture outdoor scenes, RegScorer achieves a 25% increase in transformation recall over the IR metric, highlighting its robustness and generalizability. The pre-trained model and the complete code repository can be accessed at https://github.com/WHU-USI3DV/RegScorer.
{"title":"RegScorer: Learning to select the best transformation of point cloud registration","authors":"Xiaochen Yang , Haiping Wang , Yuan Liu , Bisheng Yang , Zhen Dong","doi":"10.1016/j.isprsjprs.2026.01.034","DOIUrl":"10.1016/j.isprsjprs.2026.01.034","url":null,"abstract":"<div><div>We propose RegScorer, a model learning to identify the optimal transformation to register unaligned point clouds. Existing registration advancements can generate a set of candidate transformations, which are then evaluated using conventional metrics such as Inlier Ratio (IR), Mean Squared Error (MSE) or Chamfer Distance (CD). The candidate achieving the best score is selected as the final result. However, we argue that these metrics often fail to select the correct transformation, especially in challenging scenarios involving symmetric objects, repetitive structures, or low-overlap regions. This leads to significant degradation in registration performance, a problem that has long been overlooked. The core issue lies in their limited focus on local geometric consistency and inability to capture two key conflict cases of misalignment: (1) point pairs that are spatially close after alignment but have conflicting features, and (2) point pairs with high feature similarity but large spatial distances after alignment. To address this, we propose RegScorer, which models both the spatial and feature relationships of all point pairs. This allows RegScorer to learn to capture the above conflict cases and provides a more reliable score for transformation quality. On the 3DLoMatch and ScanNet datasets, RegScorer demonstrate <strong>19.3</strong>% and <strong>14.1</strong>% improvements in registration recall, leading to <strong>4.7</strong>% and <strong>5.1</strong>% accuracy gains in multiview registration. Moreover, when generalized to symmetric and low-texture outdoor scenes, RegScorer achieves a <strong>25</strong>% increase in transformation recall over IR metric, highlighting its robustness and generalizability. The pre-trained model and the complete code repository can be accessed at <span><span>https://github.com/WHU-USI3DV/RegScorer</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 266-277"},"PeriodicalIF":12.2,"publicationDate":"2026-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146071855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-27. DOI: 10.1016/j.isprsjprs.2026.01.031
Josef Taher , Eric Hyyppä , Matti Hyyppä , Klaara Salolahti , Xiaowei Yu , Leena Matikainen , Antero Kukko , Matti Lehtomäki , Harri Kaartinen , Sopitta Thurachen , Paula Litkey , Ville Luoma , Markus Holopainen , Gefei Kong , Hongchao Fan , Petri Rönnholm , Matti Vaaja , Antti Polvivaara , Samuli Junttila , Mikko Vastaranta , Juha Hyyppä
<div><div>Climate-smart and biodiversity-preserving forestry demands precise information on forest resources, extending to the individual tree level. Multispectral airborne laser scanning (ALS) has shown promise in automated point cloud processing, but challenges remain in leveraging deep learning techniques and identifying rare tree species in class-imbalanced datasets. This study addresses these gaps by conducting a comprehensive benchmark of deep learning and traditional shallow machine learning methods for tree species classification. For the study, we collected high-density multispectral ALS data (<span><math><mrow><mo>></mo><mn>1000</mn></mrow></math></span> <span><math><mrow><mi>pts</mi><mo>/</mo><msup><mrow><mi>m</mi></mrow><mrow><mn>2</mn></mrow></msup></mrow></math></span>) at three wavelengths using the FGI-developed HeliALS system, complemented by existing Optech Titan data (35 <span><math><mrow><mi>pts</mi><mo>/</mo><msup><mrow><mi>m</mi></mrow><mrow><mn>2</mn></mrow></msup></mrow></math></span>), to evaluate the species classification accuracy of various algorithms in a peri-urban study area located in southern Finland. We established a field reference dataset of 6326 segments across nine species using a newly developed browser-based crowdsourcing tool, which facilitated efficient data annotation. The ALS data, including a training dataset of 1065 segments, was shared with the scientific community to foster collaborative research and diverse algorithmic contributions. Based on 5261 test segments, our findings demonstrate that point-based deep learning methods, particularly a point transformer model, outperformed traditional machine learning and image-based deep learning approaches on high-density multispectral point clouds. For the high-density ALS dataset, a point transformer model provided the best performance reaching an overall (macro-average) accuracy of 87.9% (74.5%) with a training set of 1065 segments and 92.0% (85.1%) with a larger training set of 5000 segments. With 1065 training segments, the best image-based deep learning method, DetailView, reached an overall (macro-average) accuracy of 84.3% (63.9%), whereas a shallow random forest (RF) classifier achieved an overall (macro-average) accuracy of 83.2% (61.3%). For the sparser ALS dataset, an RF model topped the list with an overall (macro-average) accuracy of 79.9% (57.6%), closely followed by the point transformer at 79.6% (56.0%). Importantly, the overall classification accuracy of the point transformer model on the HeliALS data increased from 73.0% with no spectral information to 84.7% with single-channel reflectance, and to 87.9% with spectral information of all the three channels. Furthermore, we studied the scaling of the classification accuracy as a function of point density and training set size using 5-fold cross-validation of our dataset. Based on our findings, multispectral information is especially beneficial for sparse point clouds with 1–50 <span><math>
{"title":"Multispectral airborne laser scanning for tree species classification: A benchmark of machine learning and deep learning algorithms","authors":"Josef Taher , Eric Hyyppä , Matti Hyyppä , Klaara Salolahti , Xiaowei Yu , Leena Matikainen , Antero Kukko , Matti Lehtomäki , Harri Kaartinen , Sopitta Thurachen , Paula Litkey , Ville Luoma , Markus Holopainen , Gefei Kong , Hongchao Fan , Petri Rönnholm , Matti Vaaja , Antti Polvivaara , Samuli Junttila , Mikko Vastaranta , Juha Hyyppä","doi":"10.1016/j.isprsjprs.2026.01.031","DOIUrl":"10.1016/j.isprsjprs.2026.01.031","url":null,"abstract":"<div><div>Climate-smart and biodiversity-preserving forestry demands precise information on forest resources, extending to the individual tree level. Multispectral airborne laser scanning (ALS) has shown promise in automated point cloud processing, but challenges remain in leveraging deep learning techniques and identifying rare tree species in class-imbalanced datasets. This study addresses these gaps by conducting a comprehensive benchmark of deep learning and traditional shallow machine learning methods for tree species classification. For the study, we collected high-density multispectral ALS data (<span><math><mrow><mo>></mo><mn>1000</mn></mrow></math></span> <span><math><mrow><mi>pts</mi><mo>/</mo><msup><mrow><mi>m</mi></mrow><mrow><mn>2</mn></mrow></msup></mrow></math></span>) at three wavelengths using the FGI-developed HeliALS system, complemented by existing Optech Titan data (35 <span><math><mrow><mi>pts</mi><mo>/</mo><msup><mrow><mi>m</mi></mrow><mrow><mn>2</mn></mrow></msup></mrow></math></span>), to evaluate the species classification accuracy of various algorithms in a peri-urban study area located in southern Finland. We established a field reference dataset of 6326 segments across nine species using a newly developed browser-based crowdsourcing tool, which facilitated efficient data annotation. The ALS data, including a training dataset of 1065 segments, was shared with the scientific community to foster collaborative research and diverse algorithmic contributions. Based on 5261 test segments, our findings demonstrate that point-based deep learning methods, particularly a point transformer model, outperformed traditional machine learning and image-based deep learning approaches on high-density multispectral point clouds. For the high-density ALS dataset, a point transformer model provided the best performance reaching an overall (macro-average) accuracy of 87.9% (74.5%) with a training set of 1065 segments and 92.0% (85.1%) with a larger training set of 5000 segments. With 1065 training segments, the best image-based deep learning method, DetailView, reached an overall (macro-average) accuracy of 84.3% (63.9%), whereas a shallow random forest (RF) classifier achieved an overall (macro-average) accuracy of 83.2% (61.3%). For the sparser ALS dataset, an RF model topped the list with an overall (macro-average) accuracy of 79.9% (57.6%), closely followed by the point transformer at 79.6% (56.0%). Importantly, the overall classification accuracy of the point transformer model on the HeliALS data increased from 73.0% with no spectral information to 84.7% with single-channel reflectance, and to 87.9% with spectral information of all the three channels. Furthermore, we studied the scaling of the classification accuracy as a function of point density and training set size using 5-fold cross-validation of our dataset. 
Based on our findings, multispectral information is especially beneficial for sparse point clouds with 1–50 <span><math>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 278-309"},"PeriodicalIF":12.2,"publicationDate":"2026-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146072730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
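The benchmark above reports every result as "overall (macro-average) accuracy". A small helper makes the distinction concrete: overall accuracy pools all test segments, while the macro average weights each species equally, which is why the two numbers diverge so strongly on class-imbalanced data. This is a generic metric definition, not code from the study.

```python
import numpy as np

def overall_and_macro_accuracy(y_true: np.ndarray, y_pred: np.ndarray):
    """Overall accuracy pools all segments; macro-average accuracy is the
    unweighted mean of per-class recalls, so rare species count equally."""
    overall = float((y_true == y_pred).mean())
    classes = np.unique(y_true)
    per_class = [float((y_pred[y_true == c] == c).mean()) for c in classes]
    return overall, float(np.mean(per_class))

# Imbalanced toy example: the rare class drags the macro average down.
y_true = np.array([0] * 90 + [1] * 10)
y_pred = np.array([0] * 90 + [0] * 8 + [1] * 2)     # rare class mostly misclassified
print(overall_and_macro_accuracy(y_true, y_pred))   # (0.92, 0.6)
```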
Pub Date: 2026-01-23. DOI: 10.1016/j.isprsjprs.2026.01.018
Seyed Babak Haji Seyed Asadollah, Giorgos Mountrakis, Stephen B. Shaw
The accelerating frequency, duration and intensity of extreme heat events demand accurate, spatially complete heat exposure metrics. Here, a modeling approach is presented for estimating the daily-maximum Heat Index (HI) at 1 km spatial resolution. Our study area covered the conterminous United States (CONUS) during the warm season (May to September) between 2003 and 2023. More than 4.6 million observations from approximately 2000 weather stations were paired with weather-related, geographical, land cover and historical climatic factors to develop the proposed Satellite-based Heat Index estimatioN modEl (SHINE). Selected explanatory variables at daily temporal intervals included reanalysis products from Modern-Era Retrospective analysis for Research and Applications (MERRA) and direct satellite products from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor.
The most influential variables for HI estimation were the MERRA surface layer height and specific humidity products and the dual-pass MODIS daily land surface temperatures. These were followed by land cover products capturing water and forest presence, historical norms of wind speed and maximum temperature, elevation information and the corresponding day of year. An Extreme Gradient Boosting (XGBoost) regressor trained with spatial cross-validation explained 93% of the variance (R² = 0.93) and attained a Root Mean Square Error (RMSE) of 1.9°C and a Mean Absolute Error (MAE) of 1.4°C. Comparison of alternative configurations showed that while a MERRA-only model provided slightly higher accuracy (RMSE of 1.8°C), its coarse resolution failed to capture fine-scale heat variations. Conversely, a MODIS-only model offered kilometer-scale spatial resolution but with higher estimation errors (RMSE of 2.9°C). Integrating both MERRA and MODIS sources enabled SHINE to maintain spatial detail while preserving accuracy, underscoring the complementary strengths of reanalysis and satellite products. SHINE also demonstrated resistance to missing MODIS LST observations due to clouds: the additional RMSE was approximately 0.5°C in the worst case of missing both morning and afternoon MODIS land surface temperature observations. Spatial error analysis revealed <1.7°C RMSE in arid and Mediterranean zones but larger, more heterogeneous errors in the humid Midwest and High Plains. From the policy perspective, and considering the HI operational range for public-health heat effects, the proposed SHINE approach outperformed typically used proxies, such as land surface and air temperature. The resulting 1 km daily HI estimations can potentially be used as the foundation of the first wall-to-wall, multi-decadal, high-resolution heat dataset for CONUS and offer actionable information for public-health heat studies, energy-demand forecasting and environmental-justice implications.
{"title":"Satellite-based heat Index estimatioN modEl (SHINE): An integrated machine learning approach for the conterminous United States","authors":"Seyed Babak Haji Seyed Asadollah, Giorgos Mountrakis, Stephen B. Shaw","doi":"10.1016/j.isprsjprs.2026.01.018","DOIUrl":"10.1016/j.isprsjprs.2026.01.018","url":null,"abstract":"<div><div>The accelerating frequency, duration and intensity of extreme heat events demand accurate, spatially complete heat exposure metrics. Here, a modeling approach is presented for estimating the daily-maximum Heat Index (HI) at 1 km spatial resolution. Our study area covered the conterminous United States (CONUS) during the warm season (May to September) between 2003 and 2023. More than 4.6 million observations from approximately 2000 weather stations were paired with weather-related, geographical, land cover and historical climatic factors to develop the proposed Satellite-based Heat Index estimatioN modEl (SHINE). Selected explanatory variables at daily temporal intervals included reanalysis products from Modern-Era Retrospective analysis for Research and Applications (MERRA) and direct satellite products from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor.</div><div>The most influential variables for HI estimation were the MERRA surface layer height and specific humidity products and the dual-pass MODIS daily land surface temperatures. These were followed by land cover products capturing water and forest presence, historical norms of wind speed and maximum temperature, elevation information and the corresponding day of year. An Extreme Gradient Boosting (XGBoost) regressor trained with spatial cross-validation explained 93 % of the variance (R<sup>2</sup> = 0.93) and attained a Root Mean Square Error (RMSE) of 1.9°C and a Mean Absolute Error (MAE) of 1.4°C. Comparison of alternative configurations showed that while a MERRA-only model provided slightly higher accuracy (RMSE of 1.8°C), its coarse resolution failed to capture fine-scale heat variations. Conversely, a MODIS-only model offered kilometer-scale spatial resolution but with higher estimation errors (RMSE of 2.9°C). Integrating both MERRA and MODIS sources enabled SHINE to maintain spatial detail and preserved accuracy, underscoring the complementary strengths of reanalysis and satellite products. SHINE also demonstrated resistance to missing MODIS LST observations due to clouds as the additional RMSE error was approximately 0.5°C in the worst case of missing both morning and afternoon MODIS land surface temperature observations. Spatial error analysis revealed <1.7°C RMSE in arid and Mediterranean zones but larger, more heterogeneous errors in the humid Midwest and High Plains. From the policy perspective and considering the HI operational range for public-health heat effects, the proposed SHINE approach outperformed typically used proxies, such as land surface and air temperature. 
The resulting 1 km daily HI estimations can potentially be used as the foundation of the first wall-to-wall, multi-decadal, high resolution heat dataset for CONUS and offer actionable information for public-health heat studies, energy-demand forecasting and environmental-justice implications.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 209-230"},"PeriodicalIF":12.2,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146039556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
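The headline numbers above come from spatial cross-validation, meaning whole stations (not random rows) are held out, so the score reflects performance at unseen locations rather than unseen days. A minimal sketch of that evaluation protocol, assuming station IDs are available as a grouping key; the XGBoost hyperparameters shown are placeholders, not the tuned SHINE configuration.

```python
import numpy as np
from sklearn.model_selection import GroupKFold
from xgboost import XGBRegressor

def spatial_cv_rmse(X: np.ndarray, y: np.ndarray, station_ids: np.ndarray, n_splits: int = 5):
    """Spatial cross-validation: GroupKFold keeps all rows from a given weather
    station in the same fold, so every test fold contains only unseen stations."""
    rmses = []
    for tr, te in GroupKFold(n_splits=n_splits).split(X, y, groups=station_ids):
        model = XGBRegressor(n_estimators=500, max_depth=8, learning_rate=0.05)
        model.fit(X[tr], y[tr])
        err = model.predict(X[te]) - y[te]
        rmses.append(float(np.sqrt(np.mean(err**2))))   # per-fold RMSE in degrees C
    return rmses
```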
Pub Date: 2026-01-23. DOI: 10.1016/j.isprsjprs.2026.01.030
Wenpeng Zhao , Shanchuan Guo , Xueliang Zhang , Pengfei Tang , Xiaoquan Pan , Haowei Mu , Chenghan Yang , Zilong Xia , Zheng Wang , Jun Du , Peijun Du
Large-scale and fine-grained extraction of agricultural parcels from very-high-resolution (VHR) imagery is essential for precision agriculture. However, traditional parcel segmentation methods and fully supervised deep learning approaches typically face scalability constraints due to costly manual annotations, while extraction accuracy is generally limited by the inadequate capacity of segmentation architectures to represent complex agricultural scenes. To address these challenges, this study proposes a Weakly Supervised approach for agricultural Parcel Extraction (WSPE), which leverages publicly available 10 m resolution images and labels to guide the delineation of 0.5 m agricultural parcels. The WSPE framework integrates a tabular foundation model (Tabular Prior-data Fitted Network, TabPFN) and a vision foundation model (Segment Anything Model 2, SAM2) to initially generate pseudo-labels with high geometric precision. These pseudo-labels are further refined for semantic accuracy through an adaptive noisy-label correction module based on curriculum learning. The refined knowledge is distilled into the proposed Triple-branch Kolmogorov-Arnold enhanced Boundary-aware Network (TKBNet), a prompt-free end-to-end architecture enabling rapid inference and scalable deployment, with outputs vectorized through post-processing. The effectiveness of WSPE was evaluated on a self-constructed dataset from nine agricultural zones in China, the public AI4Boundaries and FGFD datasets, and three large-scale regions: Zhoukou, Hengshui, and Fengcheng. Results demonstrate that WSPE and its integrated TKBNet achieve robust performance across datasets with diverse agricultural scenes, validated by extensive comparative and ablation experiments. The weakly supervised approach achieves 97.7% of the fully supervised performance, and large-scale deployment verifies its scalability and generalization, offering a practical solution for fine-grained, large-scale agricultural parcel mapping. Code is available at https://github.com/zhaowenpeng/WSPE.
{"title":"A weakly supervised approach for large-scale agricultural parcel extraction from VHR imagery via foundation models and adaptive noise correction","authors":"Wenpeng Zhao , Shanchuan Guo , Xueliang Zhang , Pengfei Tang , Xiaoquan Pan , Haowei Mu , Chenghan Yang , Zilong Xia , Zheng Wang , Jun Du , Peijun Du","doi":"10.1016/j.isprsjprs.2026.01.030","DOIUrl":"10.1016/j.isprsjprs.2026.01.030","url":null,"abstract":"<div><div>Large-scale and fine-grained extraction of agricultural parcels from very-high-resolution (VHR) imagery is essential for precision agriculture. However, traditional parcel segmentation methods and fully supervised deep learning approaches typically face scalability constraints due to costly manual annotations, while extraction accuracy is generally limited by the inadequate capacity of segmentation architectures to represent complex agricultural scenes. To address these challenges, this study proposes a Weakly Supervised approach for agricultural Parcel Extraction (WSPE), which leverages publicly available 10 m resolution images and labels to guide the delineation of 0.5 m agricultural parcels. The WSPE framework integrates the tabular (Tabular Prior-data Fitted Network, TabPFN) and the vision foundation model (Segment Anything Model 2, SAM2) to initially generate pseudo-labels with high geometric precision. These pseudo-labels are further refined for semantic accuracy through an adaptive noisy label correction module based on curriculum learning. The refined knowledge is distilled into the proposed Triple-branch Kolmogorov-Arnold enhanced Boundary-aware Network (TKBNet), a prompt-free end-to-end architecture enabling rapid inference and scalable deployment, with outputs vectorized through post-processing. The effectiveness of WSPE was evaluated on a self-constructed dataset from nine agricultural zones in China, the public AI4Boundaries and FGFD datasets, and three large-scale regions: Zhoukou, Hengshui, and Fengcheng. Results demonstrate that WSPE and its integrated TKBNet achieve robust performance across datasets with diverse agricultural scenes, validated by extensive comparative and ablation experiments. The weakly supervised approach achieves 97.7 % of fully supervised performance, and large-scale deployment verifies its scalability and generalization, offering a practical solution for fine-grained, large-scale agricultural parcel mapping. Code is available at <span><span>https://github.com/zhaowenpeng/WSPE</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 180-208"},"PeriodicalIF":12.2,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-23. DOI: 10.1016/j.isprsjprs.2026.01.028
Qing Tian , Hongxiao Jin , Rasmus Fensholt , Torbern Tagesson , Luwei Feng , Feng Tian
Widespread vegetation changes have been evidenced by multi-decadal trends in satellite-observed vegetation indices (VIs). However, many issues can affect the derived VI trends, among which the inherent differences between VIs calculated from the same input reflectance have not been investigated. Here, we compared global long-term trends during 2000–2023 in six widely used RED-NIR (near-infrared)-based VIs calculated from the MODIS nadir bidirectional reflectance distribution function (BRDF)-adjusted reflectance product (MCD43A4): the normalized difference vegetation index (NDVI), kernel NDVI (kNDVI), 2-band enhanced vegetation index (EVI2), near-infrared reflectance of vegetation (NIRv), difference vegetation index (DVI), and plant phenology index (PPI). We identified two distinct groups of VIs, i.e., (1) NDVI and kNDVI, and (2) EVI2, NIRv, DVI, and PPI, which shared similar trends within each group but showed significant directional differences between groups in 17.4% of the studied area. Only 20.5% of the global land surface showed consistent trends. Based on radiative transfer modeling and remote sensing observations, we demonstrated that the two groups of VIs differ in their sensitivities to RED and NIR reflectance. These differences lead to inconsistent long-term trends arising from variations in vegetation type, mixed-pixel effects, saturation, and asynchronous changes in vegetation chlorophyll content and structural attributes. Comparisons with ground-observed leaf area index (LAI), flux tower gross primary productivity (GPP), and PhenoCam green chromatic coordinate (GCC) further revealed that the EVI2, NIRv, DVI, and PPI trends corresponded more closely with LAI and GPP trends, whereas the NDVI and kNDVI trends were more strongly associated with GCC trends. Our results highlight that long-term vegetation trends derived from different RED-NIR-based VIs must be interpreted by considering their intrinsic sensitivities to biophysical properties, which is essential for reliable assessments of vegetation dynamics.
Title: "Varying sensitivities of RED-NIR-based vegetation indices to the input reflectance affect the detected long-term trends". ISPRS Journal of Photogrammetry and Remote Sensing, Volume 233, Pages 247-265.
Pub Date: 2026-01-23. DOI: 10.1016/j.isprsjprs.2026.01.024
Liuqian Wang, Jing Zhang, Guangming Mi, Li Zhuo
Fine-grained object recognition (FGOR) is gaining increasing attention in automated remote sensing analysis and interpretation (RSAI). However, the full potential of FGOR in remote sensing images (RSIs) is still constrained by several key issues: the reliance on high-quality labeled data, the difficulty of reconstructing fine details in low-resolution images, and the limited robustness of FGOR models in distinguishing similar object categories. In response, we propose an automatic fine-grained object recognition network (AutoFGOR) that follows a hierarchical dual-pipeline architecture for object analysis at the global and regional levels. Specifically, Pipeline I, a region detection network, leverages a geometric invariance module for weakly supervised learning to improve detection accuracy on sparsely labeled RSIs and extract category-free regions. On top of that, Pipeline II, regional diffusion with a vision language model (RD-VLM), pioneers the combination of Stable Diffusion XL (SDXL) and the Large Language and Vision Assistant (LLaVA) through a specially designed adaptive resolution adaptor (ARA) for object-region super-resolution reconstruction, fundamentally solving the difficulties of feature extraction from low-quality regions and fine-grained feature mining. In addition, we introduce a winner-takes-all (WTA) strategy that utilizes a voting mechanism to enhance the reliability of fine-grained classification in complex scenes. Experimental results on the FAIR1M-v2.0, VEDAI, and HRSC2016 datasets demonstrate our AutoFGOR achieving 31.72%, 80.25%, and 88.05% mAP, respectively, with highly competitive performance. In addition, the ×4 reconstruction results achieve scores of 0.5275 and 0.8173 on the MANIQA and CLIP-IQA indicators, respectively. The code will be available on GitHub: https://github.com/BJUT-AIVBD/AutoFGOR.
{"title":"Weak supervision makes strong details: fine-grained object recognition in remote sensing images via regional diffusion with VLM","authors":"Liuqian Wang, Jing Zhang, Guangming Mi, Li Zhuo","doi":"10.1016/j.isprsjprs.2026.01.024","DOIUrl":"10.1016/j.isprsjprs.2026.01.024","url":null,"abstract":"<div><div>Fine-grained object recognition (FGOR) is gaining increasing attention in automated remote sensing analysis and interpretation (RSAI). However, the full potential of FGOR in remote sensing images (RSIs) is still constrained by several key issues: the reliance on high-quality labeled data, the difficulty of reconstructing fine details in low-resolution images, and the limited robustness of FGOR model for distinguishing similar object categories. In response, we propose an automatic fine-grained object recognition network (AutoFGOR) that follows a hierarchical dual-pipeline architecture for object analysis at global and regional levels. Specifically, Pipeline I: region detection network, which leverages geometric invariance module for weakly-supervised learning to improve the detection accuracy of sparsely labeled RSIs and extract category-free regions; and on top of that, Pipeline II: regional diffusion with vision language model (RD-VLM), which pioneers the combination of stable diffusion XL (SDXL) and large language and vision assistant (LLaVA) through a specially designed adaptive resolution adaptor (ARA) for object region super-resolution reconstruction, fundamentally solving the difficulties of feature extraction from low-quality regions and fine-grained feature mining. In addition, we introduce a winner-takes-all (WTA) strategy that utilizes a voting mechanism to enhance the reliability of fine-grained classification in complex scenes. Experimental results on FAIR1M-v2.0, VEDAI, and HRSC2016 datasets demonstrate our AutoFGOR achieving 31.72%, 80.25%, and 88.05% mAP, respectively, with highly competitive performance. In addition, the × 4 reconstruction results achieve scores of 0.5275 and 0.8173 on the MANIQA and CLIP-IQA indicators, respectively. <u>The code will be available on GitHub:</u> <span><span><u>https://github.com/BJUT-AIVBD/AutoFGOR</u></span><svg><path></path></svg></span><u>.</u></div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 231-246"},"PeriodicalIF":12.2,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-22. DOI: 10.1016/j.isprsjprs.2026.01.017
Kai Hu , Jiaxin Li , Nan Ji , Xueshang Xiang , Kai Jiang , Xieping Gao
Knowledge distillation is extensively utilized in remote sensing object detection within resource-constrained environments. Among knowledge distillation methods, prediction imitation has garnered significant attention due to its ease of deployment. However, prevailing prediction imitation paradigms, which rely on an isolated, point-wise alignment of prediction scores, neglect crucial spatial semantic information. This oversight is particularly detrimental in remote sensing images due to the abundance of objects with weak feature responses. To this end, we propose a novel Spatial Semantic Enhanced Knowledge Distillation framework, called S²EKD, for remote sensing object detection. Through two complementary modules, S²EKD shifts the focus of prediction imitation from matching isolated values to learning structured spatial semantic information. First, for classification distillation, we introduce a Weak-feature Response Enhancement Module, which models the structured spatial relationships between objects and their background to establish an initial perception of objects with weak feature responses. Second, to further capture more refined spatial information, we propose a Teacher Boundary Refinement Module for localization distillation. It provides robust boundary guidance by constructing a regression target enriched with more comprehensive spatial information. Furthermore, we introduce a Feature Mapping mechanism to ensure this spatial semantic knowledge is effectively utilized. Through extensive experiments on the DIOR and DOTA-v1.0 datasets, our method's superiority is consistently demonstrated across diverse architectures, including both single-stage and two-stage detectors. The results show that our S²EKD achieves state-of-the-art results and, in some cases, even surpasses the performance of its teacher model. The code will be available soon.
{"title":"Knowledge distillation with spatial semantic enhancement for remote sensing object detection","authors":"Kai Hu , Jiaxin Li , Nan Ji , Xueshang Xiang , Kai Jiang , Xieping Gao","doi":"10.1016/j.isprsjprs.2026.01.017","DOIUrl":"10.1016/j.isprsjprs.2026.01.017","url":null,"abstract":"<div><div>Knowledge distillation is extensively utilized in remote sensing object detection within resource-constrained environments. Among knowledge distillation methods, prediction imitation has garnered significant attention due to its ease of deployment. However, prevailing prediction imitation paradigms, which rely on an isolated, point-wise alignment of prediction scores, neglect the crucial spatial semantic information. This oversight is particularly detrimental in remote sensing images due to the abundance of objects with weak feature responses. To this end, we propose a novel Spatial Semantic Enhanced Knowledge Distillation framework, called <span><math><msup><mrow><mi>S</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span><em>EKD</em>, for remote sensing object detection. Through two complementary modules, <span><math><msup><mrow><mi>S</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span><em>EKD</em> shifts the focus of prediction imitation from matching isolated values to learning structured spatial semantic information. First, for classification distillation, we introduce a Weak-feature Response Enhancement Module, which models the structured spatial relationships between objects and their background to establish an initial perception of objects with weak feature responses. Second, to further capture more refined spatial information, we propose a Teacher Boundary Refinement Module for localization distillation. It provides robust boundary guidance by constructing a regression target enriched with more comprehensive spatial information. Furthermore, we introduce a Feature Mapping mechanism to ensure this spatial semantic knowledge is effectively utilized. Through extensive experiments on the DIOR and DOTA-v1.0 datasets, our method’s superiority is consistently demonstrated across diverse architectures, including both single-stage and two-stage detectors. The results show that our <span><math><msup><mrow><mi>S</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span><em>EKD</em> achieves state-of-the-art results and, in some cases, even surpasses the performance of its teacher model. The code will be available soon.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 144-157"},"PeriodicalIF":12.2,"publicationDate":"2026-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-22. DOI: 10.1016/j.isprsjprs.2026.01.027
Baihong Pan , Xiangming Xiao , Li Pan , Andrew D Richardson , Yujie Liu , Yuan Yao , Cheng Meng , Yanhua Xie , Chenchen Zhang , Yuanwei Qin
Plant phenology serves as a vital indicator of plants' responses to climate variation and change. To date, our knowledge and data products of plant leaf phenology at the scales of large trees and forest stands are very limited, in part due to the lack of time series image data at very high spatial resolution (VHSR, meters). Here, we investigated surface reflectance (BLUE, GREEN, RED) and vegetation indices over a large cottonwood tree, using images from PlanetScope (daily, 3-m) and Sentinel-2A/B (5-day, 10-m) in 2023 together with in-situ field photos. At the leaf scale, a green leaf has a spectral signature of BLUE < GREEN > RED, as chlorophyll pigment absorbs more blue and red light than green light; we name this pattern the chlorophyll and green leaf indicator (CGLI). A dead leaf has BLUE < GREEN < RED. At the tree scale, a tree with only branches and trunk (no green leaves) has BLUE < GREEN < RED, while a tree with green leaves has BLUE < GREEN > RED. We evaluated the start of season (SOS) and end of season (EOS) of the cottonwood tree, derived from (1) vegetation index (VI) data with three methods (VI-slope-, VI-ratio-, and VI-threshold-based methods) and (2) surface reflectance data with the CGLI-based method. To evaluate the broader applicability of the CGLI-based method, we applied the same workflow to five deciduous broadleaf forest sites within the National Ecological Observatory Network, equipped with PhenoCams. At these five sites, we compared the phenology metrics (SOS, EOS) derived from the VI- and CGLI-based methods with reference dates derived from PhenoCam Green Chromatic Coordinate (GCC) data. Results show that the CGLI-based method, which classifies each observation as either green leaf or non-green leaf/canopy (binary), is simple and effective in delineating leaf/canopy dynamics and phenology metrics. These findings provide a foundation for monitoring leaf phenology of large trees using satellite data.
{"title":"Identifying green leaf and leaf phenology of large trees and forests by time series PlanetScope and Sentinel-2 images and the chlorophyll and green leaf indicator (CGLI)","authors":"Baihong Pan , Xiangming Xiao , Li Pan , Andrew D Richardson , Yujie Liu , Yuan Yao , Cheng Meng , Yanhua Xie , Chenchen Zhang , Yuanwei Qin","doi":"10.1016/j.isprsjprs.2026.01.027","DOIUrl":"10.1016/j.isprsjprs.2026.01.027","url":null,"abstract":"<div><div>Plant phenology serves as a vital indicator of plant’s response to climate variation and change. To date, our knowledge and data products of plant leaf phenology at the scales of large trees and forest stand are very limited, in part due to the lack of time series image data at very high spatial resolution (VHSR, meters). Here, we investigated surface reflectance (BLUE, GREEN, RED) and vegetation indices over a large cottonwood tree, using images from PlanetScope (daily, 3-m) and Sentinel-2A/B (5-day, 10-m) in 2023 and in-situ field photos. At the leaf scale, a green leaf has a spectral signature of BLUE < GREEN > RED, as chlorophyll pigment absorbs more blue and red light than green light, which is named as chlorophyll and green leaf indicator (CGLI); and a dead leaf has BLUE < GREEN < RED. At the tree scale, tree with only branches and trunk (no green leaves) has BLUE < GREEN < RED, while tree with green leaves has BLUE < GREEN > RED. We evaluated the start of season (SOS) and end of season (EOS) of the cottonwood tree, derived from (1) vegetation index (VI) data with three methods (VI-slope-, VI-ratio-, and VI-threshold-based methods) and (2) surface reflectance data with CGLI-based method. To evaluate broader applicability of the CGLI-based method, we applied the same workflow to five deciduous broadleaf forest sites within the National Ecological Observatory Network, equipped with PhenoCam. At these five sites, we compared phenology metrics (SOS, EOS) derived from VI- and CGLI-based methods with reference dates derived from PhenoCam Green Chromatic Coordinate (GCC) data. Results show that the CGLI-based method, which classifies each observation as either green leaf or non-green leaf/canopy (binary), is simple and effective in delineating leaf/canopy dynamics and phenology metrics. These findings provide a foundation for monitoring leaf phenology of large trees using satellite data.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 104-125"},"PeriodicalIF":12.2,"publicationDate":"2026-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}