
ISPRS Journal of Photogrammetry and Remote Sensing: Latest Publications

B3-CDG: A pseudo-sample diffusion generator for bi-temporal building binary change detection
IF 10.6 | CAS Tier 1, Earth Science | Q1 (GEOGRAPHY, PHYSICAL) | Pub Date: 2024-11-14 | DOI: 10.1016/j.isprsjprs.2024.10.021
Peng Chen, Peixian Li, Bing Wang, Sihai Zhao, Yongliang Zhang, Tao Zhang, Xingcheng Ding
Building change detection (CD) plays a crucial role in urban planning, land resource management, and disaster monitoring. Currently, deep learning has become a key approach in building CD, but challenges persist. Obtaining large-scale, accurately registered bi-temporal images is difficult, and annotation is time-consuming. Therefore, we propose B3-CDG, a bi-temporal building binary CD pseudo-sample generator based on the principle of latent diffusion. This generator treats building change processes as local semantic state transformations. It utilizes textual instructions and mask prompts to generate specific class changes in designated regions of single-temporal images, creating different temporal images with clear semantic transitions. B3-CDG is driven by large-scale pretrained models and utilizes external adapters to guide the model in learning remote sensing image distributions. To generate seamless building boundaries, B3-CDG adopts a simple and effective approach (dilation masks) to compel the model to learn boundary details. In addition, B3-CDG incorporates diffusion guidance and data augmentation to enhance image realism. In the generation experiments, B3-CDG achieved the best performance with the lowest FID (26.40) and the highest IS (4.60) compared to previous baseline methods (such as Inpaint and IAug). This method effectively addresses challenges such as boundary continuity, shadow generation, and vegetation occlusion while ensuring that the generated building roof structures and colors are realistic and diverse. In the application experiments, B3-CDG improved the IOU of the validation model (SFFNet) by 6.34 % and 7.10 % on the LEVIR and WHUCD datasets, respectively. When the real data are extremely limited (using only 5 % of the original data), the improvement further reaches 33.68 % and 32.40 %. Moreover, B3-CDG can enhance the baseline performance of advanced CD models, such as SNUNet and ChangeFormer. Ablation studies further confirm the effectiveness of the B3-CDG design. This study introduces a novel research paradigm for building CD, potentially advancing the field. Source code and datasets will be available at https://github.com/ABCnutter/B3-CDG.
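The dilation-mask idea can be pictured with a minimal sketch (not the authors' code): the binary building mask used to prompt the generator is morphologically dilated, so the model must also synthesize the pixels just outside the footprint and is thereby pushed to learn seamless boundary details. The function name and the dilation radius below are illustrative assumptions.

```python
# Minimal sketch of a dilation mask: expand the building prompt mask outward
# so the generator has to reconstruct the boundary region as well.
import numpy as np
from scipy.ndimage import binary_dilation

def dilated_prompt_mask(building_mask: np.ndarray, radius: int = 3) -> np.ndarray:
    """Dilate a binary building mask by `radius` pixels (hypothetical parameter)."""
    structure = np.ones((2 * radius + 1, 2 * radius + 1), dtype=bool)
    return binary_dilation(building_mask.astype(bool), structure=structure)

# Toy usage: a 64x64 mask with one 10x10 "building".
mask = np.zeros((64, 64), dtype=np.uint8)
mask[20:30, 20:30] = 1
prompt = dilated_prompt_mask(mask)
print(mask.sum(), prompt.sum())  # the dilated prompt covers more pixels than the mask
```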
Citations: 0
Mesh refinement method for multi-view stereo with unary operations
IF 10.6 | CAS Tier 1, Earth Science | Q1 (GEOGRAPHY, PHYSICAL) | Pub Date: 2024-11-12 | DOI: 10.1016/j.isprsjprs.2024.10.023
Jianchen Liu, Shuang Han, Jin Li
3D reconstruction is an important part of the digital city, and high-accuracy 3D modeling methods have been widely studied as an important pathway to visualizing 3D city scenes. However, limited image resolution, noise, and occlusion result in low-quality, overly smoothed features in the mesh model. Therefore, the model needs to be refined to improve the mesh quality and enhance the visual effect. This paper proposes a mesh refinement algorithm that fine-tunes the vertices of the mesh and constrains their evolution direction to the normal vector, reducing their degrees of freedom to one. The evolution of each vertex then involves only one motion-distance parameter along the normal vector, simplifying the derivation of the energy function. Meanwhile, Gaussian curvature is used as a regularization term, which is anisotropic and preserves edge features during the reconstruction process. The mesh refinement algorithm with unary operations fully utilizes the original image information and effectively enriches the local detail features of the mesh model. Comparative experiments on five public datasets show that, for the same number of iterations, the proposed algorithm restores the detailed features of the model better and achieves a better refinement effect than the OpenMVS library refinement algorithm. When fewer iterations are used, the proposed algorithm likewise achieves more desirable results.
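A minimal sketch of the unary parameterisation described above, assuming per-vertex unit normals are available: each vertex may only slide along its normal, so the refinement optimises a single signed distance per vertex instead of three coordinates. The names below are illustrative, not taken from the paper.

```python
# Minimal sketch: vertices move only along their normals, one scalar per vertex.
import numpy as np

def refine_vertices(vertices: np.ndarray, normals: np.ndarray, t: np.ndarray) -> np.ndarray:
    """vertices: (N, 3); normals: (N, 3) unit normals; t: (N,) signed offsets along the normals."""
    return vertices + t[:, None] * normals

# Toy usage: push two vertices of a flat patch 1 mm along +z.
V = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
N = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
print(refine_vertices(V, N, np.array([0.001, 0.001])))
```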
Citations: 0
Fast and accurate SAR geocoding with a plane approximation
IF 10.6 | CAS Tier 1, Earth Science | Q1 (GEOGRAPHY, PHYSICAL) | Pub Date: 2024-11-11 | DOI: 10.1016/j.isprsjprs.2024.10.031
Shaokun Guo, Jie Dong, Yian Wang, Mingsheng Liao
Geocoding is the procedure of finding the mapping between the Synthetic Aperture Radar (SAR) image and the imaged scene. The inverse form of the Range-Doppler (RD) model has been adopted to approximate the geocoding results. However, with advances in SAR imaging geodesy, its imprecise nature becomes more perceptible. The forward RD model gives reliable solutions but is time-consuming and unable to detect geometric distortions. This study proposes a highly optimized forward geocoding method to find the precise ground position of each image sample with a Digital Elevation Model (DEM). By following the intersection of the terrain and the so-called solution surface of an azimuth line, which can be locally approximated by a plane, it produces geo-location results almost identical to the analytical solutions of the RD model. At the same time, the non-unique geocoding solutions and the geometric distortions are determined. Deviations from the employed approximations are assessed, showing that they are highly predictable and lead to negligible range/azimuth residuals. The general robustness is verified by experiments on SAR images of different resolutions covering diversified terrains in the native or zero Doppler geometry. Comparisons with other forward algorithms demonstrate that, with an additional ability to detect geometric distortions, its accuracy and efficiency are comparable to theirs. For a Sentinel-1 IW burst of high topographic relief, the algorithm finishes in 3 s using 16 parallel cores, with an average residual smaller than one millimeter. Its impressive blend of efficiency, accuracy, and geometric distortion detection capabilities makes it ideal for large-scale remote sensing applications.
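For readers unfamiliar with the forward Range-Doppler relations the abstract builds on, the sketch below shows the zero-Doppler condition and range computation for a ground point over a sampled orbit. The toy orbit, timing, and coarse search are simplifying assumptions; the paper's optimized plane-approximation algorithm is not reproduced here.

```python
# Minimal sketch of forward Range-Doppler geocoding in zero-Doppler geometry:
# find the azimuth time where (P - S(t)) is perpendicular to the velocity V(t);
# the slant range is then |P - S(t)|.
import numpy as np

def zero_doppler_solution(P, times, positions, velocities):
    dots = np.einsum('ij,ij->i', P[None, :] - positions, velocities)
    i = int(np.argmin(np.abs(dots)))          # coarse search over the sampled orbit
    return times[i], float(np.linalg.norm(P - positions[i]))

# Toy straight-line "orbit" along x at 7.5 km/s and 700 km height.
t = np.linspace(0.0, 10.0, 1001)
S = np.stack([7500.0 * t, np.zeros_like(t), np.full_like(t, 700e3)], axis=1)
V = np.tile(np.array([7500.0, 0.0, 0.0]), (t.size, 1))
print(zero_doppler_solution(np.array([37500.0, 20e3, 0.0]), t, S, V))  # azimuth time, slant range
```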
Citations: 0
3D point cloud regularization method for uniform mesh generation of mining excavations
IF 10.6 | CAS Tier 1, Earth Science | Q1 (GEOGRAPHY, PHYSICAL) | Pub Date: 2024-11-09 | DOI: 10.1016/j.isprsjprs.2024.10.024
Przemysław Dąbek, Jacek Wodecki, Paulina Kujawa, Adam Wróblewski, Arkadiusz Macek, Radosław Zimroz
Mine excavation systems are usually dozens of kilometers long with varying geometry on a small scale (roughness and shape of the walls) and on a large scale (varying widths of the tunnels, turns, and crossings). In this article, the authors address the problem of analyzing laser scanning data from large mining structures that can be used for various purposes, with a focus on ventilation simulations. Together with the quality of the measurement data (diverse point-cloud density, missing samples, holes induced by obstructions in the field of view, measurement noise), this creates problems that require multi-stage processing of the obtained data. The authors propose a robust methodology to process a single segmented section of the mining system. The presented approach focuses on obtaining a point cloud ready for application in the computational fluid dynamics (CFD) analysis of airflow, with minimal need for additional manual corrections on the generated mesh model. This requires the point cloud to have evenly distributed points and reduced noise (together with removal of objects inside) while keeping the unique geometrical properties and shape of the scanned tunnels. The proposed methodology uses the trajectory of the excavation, obtained either during the measurements or by the skeletonization process explained in the article. Cross-sections obtained on planes perpendicular to the trajectory are processed to equalize the point distribution and to remove measurement noise, holes in the point cloud, and objects inside the excavation. The effects of the proposed algorithm are validated by comparing the processed cloud with the original cloud and by testing within the CFD environment. The algorithm proved highly effective in improving the skewness rate of the obtained mesh and the geometry mapping accuracy (standard deviation below 5 centimeters in the cloud-to-mesh comparison).
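The cross-section step can be pictured with the short sketch below, which keeps only the points lying in a thin slab perpendicular to the local trajectory direction. The slab thickness and names are assumptions, and the per-section resampling and denoising described in the abstract are not shown.

```python
# Minimal sketch: select points within a thin slab perpendicular to the tunnel trajectory.
import numpy as np

def cross_section(points: np.ndarray, center: np.ndarray, direction: np.ndarray,
                  half_thickness: float = 0.05) -> np.ndarray:
    """Keep points whose signed distance along `direction` from `center` is within the slab."""
    d = direction / np.linalg.norm(direction)
    s = (points - center) @ d
    return points[np.abs(s) <= half_thickness]

# Toy tunnel along x: take the section at x = 2.0 m.
pts = np.random.rand(10000, 3) * np.array([10.0, 3.0, 3.0])
print(cross_section(pts, np.array([2.0, 1.5, 1.5]), np.array([1.0, 0.0, 0.0])).shape)
```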
Citations: 0
Generalization in deep learning-based aircraft classification for SAR imagery
IF 10.6 | CAS Tier 1, Earth Science | Q1 (GEOGRAPHY, PHYSICAL) | Pub Date: 2024-11-08 | DOI: 10.1016/j.isprsjprs.2024.10.030
Andrea Pulella, Francescopaolo Sica, Carlos Villamil Lopez, Harald Anglberger, Ronny Hänsch
Automatic Target Recognition (ATR) from Synthetic Aperture Radar (SAR) data covers a wide range of applications. SAR ATR helps to detect and track vehicles and other objects, e.g. in disaster relief and surveillance operations. Aircraft classification covers a significant part of this research area. It differs from other SAR-based ATR tasks, such as ship and ground vehicle detection and classification, in that aircraft are usually static targets, often remaining at the same location and in a given orientation for longer time frames. Today, there is a significant mismatch between the abundance of deep learning-based aircraft classification models and the availability of corresponding datasets. This mismatch has led to models with improved classification performance on specific datasets, but the challenge of generalizing to conditions not present in the training data (which are expected to occur in operational conditions) has not yet been satisfactorily analyzed. This paper aims to evaluate how the classification performance and generalization capabilities of deep learning models are influenced by the diversity of the training dataset. Our goal is to understand the model’s competence and the conditions under which it can achieve proficiency in aircraft classification tasks for high-resolution SAR images while demonstrating generalization capabilities when confronted with novel data that include different geographic locations, environmental conditions, and geometric variations. We address this gap by using manually annotated high-resolution SAR data from TerraSAR-X and TanDEM-X and show how the classification performance changes for different application scenarios requiring different training and evaluation setups. We find that, as expected, the type of aircraft plays a crucial role in the classification problem, since it varies in shape and dimension. However, these aspects are secondary to how the SAR image is acquired, with the acquisition geometry playing the primary role. Therefore, we find that the characteristics of the acquisition are much more relevant for generalization than the complex geometry of the target. We show this for various models selected among the standard classification algorithms.
Citations: 0
Advancing mangrove species mapping: An innovative approach using Google Earth images and a U-shaped network for individual-level Sonneratia apetala detection
IF 10.6 | CAS Tier 1, Earth Science | Q1 (GEOGRAPHY, PHYSICAL) | Pub Date: 2024-11-07 | DOI: 10.1016/j.isprsjprs.2024.10.016
Chuanpeng Zhao, Yubin Li, Mingming Jia, Chengbin Wu, Rong Zhang, Chunying Ren, Zongming Wang
The exotic mangrove species Sonneratia apetala has been colonizing coastal China for several decades, sparking attention and debates from the public and policy-makers about its reproduction, dispersal, and spread. Existing local-scale studies have relied on fine but expensive data sources to map mangrove species, limiting their applicability for detecting S. apetala in large areas due to cost constraints. A previous study utilized freely available Sentinel-2 images to construct a 10-m-resolution S. apetala map in China but did not capture small clusters of S. apetala due to resolution limitations. To precisely detect S. apetala in coastal China, we proposed an approach that integrates freely accessible submeter-resolution Google Earth images to control expenses, a 10-m-resolution S. apetala map to retrieve well-distributed samples, and several U-shaped networks to capture S. apetala in the form of clusters and individuals. Comparisons revealed that the lite U-squared network was most suitable for detecting S. apetala among the five U-shaped networks. The resulting map achieved an overall accuracy of 98.2 % using testing samples and an accuracy of 91.0 % using field sample plots. Statistics indicated that the total area covered by S. apetala in China was 4000.4 ha in 2022, which was 33.4 % greater than that of the 10-m-resolution map. The excess area suggested the presence of a large number of small clusters beyond the discrimination capacity of medium-resolution images. Furthermore, the mechanism of the approach was interpreted using an example-based method that altered image color, shape, orientation, and textures. Comparisons showed that textures were the key feature for identifying S. apetala based on submeter-resolution Google Earth images. The detection accuracy rapidly decreased with the blurring of textures, and images at zoom levels of 20, 19, and 18 were applicable to the trained network. Utilizing the first individual-level map, we estimated the number of mature S. apetala trees to be approximately 2.35 million with a 95 % confidence interval between 2.30 and 2.40 million, providing a basis for managing this exotic mangrove species. This study deepens existing research on S. apetala by providing an approach with a clear mechanism, an individual-level distribution with a much larger area, and an estimation of the number of mature trees. This study advances mangrove species mapping by combining the advantages of freely accessible medium- and high-resolution images: the former provides abundant spectral information to integrate discrete local-scale maps to generate a large-scale map, while the latter offers textural information from submeter-resolution Google Earth images to detect mangrove species in detail.
Citations: 0
HDRSA-Net: Hybrid dynamic residual self-attention network for SAR-assisted optical image cloud and shadow removal
IF 10.6 | CAS Tier 1, Earth Science | Q1 (GEOGRAPHY, PHYSICAL) | Pub Date: 2024-11-07 | DOI: 10.1016/j.isprsjprs.2024.10.026
Jun Pan, Jiangong Xu, Xiaoyu Yu, Guo Ye, Mi Wang, Yumin Chen, Jianshen Ma
Clouds and shadows often contaminate optical remote sensing images, resulting in missing information. Consequently, continuous spatiotemporal monitoring of the Earth’s surface requires the efficient removal of clouds and shadows. Unlike optical satellites, synthetic aperture radar (SAR) has active imaging capabilities in all weather conditions, supplying valuable supplementary information for reconstructing missing regions. Nevertheless, the reconstruction of high-fidelity cloud-free images based on SAR-optical data fusion remains challenging due to differences in imaging mechanisms and the considerable contamination from speckle noise inherent in SAR imagery. To solve the aforementioned challenges, this paper presents a novel hybrid dynamic residual self-attention network (HDRSA-Net), aiming to fully exploit the potential of SAR images in reconstructing missing regions. The proposed HDRSA-Net comprises multiple dynamic interaction residual (DIR) groups organized into an end-to-end trainable deep hierarchical stacked architecture. Specifically, the omni-dimensional dynamic local exploration (ODDLE) module and the sparse global context aggregation (SGCA) module are used to form a local–global feature adaptive extraction and implicit enhancement. A multi-task cooperative optimization loss function is designed to ensure that the results exhibit high spectral fidelity and coherent spatial structures. Additionally, this paper releases a large dataset that can comprehensively evaluate the reconstruction quality under different cloud coverages and various types of ground cover, providing a solid foundation for restoring satisfactory sensory effects and reliable semantic application value. In comparison to the current representative algorithms, the presented approach exhibits effectiveness and advancement in reconstructing missing regions with stability. The project is accessible at: https://github.com/RSIIPAC/LuojiaSET-OSFCR.
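As a generic illustration of the multi-task idea (not the paper's actual loss function), the sketch below combines a pixel-wise term for spectral fidelity with a gradient term for coherent spatial structure; the weights and term choices are hypothetical.

```python
# Minimal sketch of a two-term reconstruction objective: spectral fidelity (L1)
# plus spatial structure (L1 on horizontal/vertical image gradients).
import torch

def reconstruction_loss(pred: torch.Tensor, target: torch.Tensor,
                        w_spec: float = 1.0, w_struct: float = 0.5) -> torch.Tensor:
    """pred, target: (B, C, H, W) predicted cloud-free image and reference."""
    spectral = torch.mean(torch.abs(pred - target))
    dx = lambda x: x[..., :, 1:] - x[..., :, :-1]
    dy = lambda x: x[..., 1:, :] - x[..., :-1, :]
    structural = torch.mean(torch.abs(dx(pred) - dx(target))) + \
                 torch.mean(torch.abs(dy(pred) - dy(target)))
    return w_spec * spectral + w_struct * structural

print(reconstruction_loss(torch.rand(2, 4, 64, 64), torch.rand(2, 4, 64, 64)))
```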
Citations: 0
A multi-view graph neural network for building age prediction
IF 10.6 | CAS Tier 1, Earth Science | Q1 (GEOGRAPHY, PHYSICAL) | Pub Date: 2024-11-07 | DOI: 10.1016/j.isprsjprs.2024.10.011
Yi Wang, Yizhi Zhang, Quanhua Dong, Hao Guo, Yingchun Tao, Fan Zhang
Building age is crucial for inferring building energy consumption and understanding the interactions between human behavior and urban infrastructure. Limited by the challenges of surveys, some machine learning methods have been utilized to predict and fill in missing building age data using building footprint. However, the existing methods lack explicit modeling of spatial effects and semantic relationships between buildings. To alleviate these challenges, we propose a novel multi-view graph neural network called Building Age Prediction Network (BAPN). The features of spatial autocorrelation, spatial heterogeneity and semantic similarity were extracted and integrated using multiple graph convolutional networks. Inspired by the spatial regime model, a heterogeneity-aware graph convolutional network (HGCN) based on spatial grouping is designed to capture the spatial heterogeneity. Systematic experiments on three large-scale building footprint datasets demonstrate that BAPN outperforms existing machine learning and graph learning models, achieving high accuracy ranging from 61% to 80%. Moreover, missing building age data within the Fifth Ring Road of Beijing was filled, validating the feasibility of BAPN. This research offers new insights for filling the intra-city building age gaps and understanding multiple spatial effects essential for sustainable urban planning.
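The multi-view design can be pictured with the sketch below, which applies one graph convolution per view (spatial autocorrelation, spatial heterogeneity, semantic similarity) and concatenates the per-view embeddings. The adjacency matrices, layer sizes, and fusion step are placeholders for illustration, not the BAPN architecture itself.

```python
# Minimal sketch of a multi-view graph convolution: one propagation per view,
# then concatenation of the per-view building embeddings.
import torch

def gcn_layer(A_hat: torch.Tensor, X: torch.Tensor, W: torch.Tensor) -> torch.Tensor:
    """A_hat: (N, N) row-normalised adjacency; X: (N, F) features; W: (F, H) weights."""
    return torch.relu(A_hat @ X @ W)

N, F, H = 100, 16, 32                        # buildings, input features, hidden size
X = torch.rand(N, F)
views = [torch.eye(N) for _ in range(3)]     # stand-ins for the three view adjacencies
weights = [torch.rand(F, H) for _ in views]
fused = torch.cat([gcn_layer(A, X, W) for A, W in zip(views, weights)], dim=1)
print(fused.shape)                           # (N, 3*H) multi-view embedding per building
```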
Citations: 0
Integrating synthetic datasets with CLIP semantic insights for single image localization advancements
IF 10.6 | CAS Tier 1, Earth Science | Q1 (GEOGRAPHY, PHYSICAL) | Pub Date: 2024-11-06 | DOI: 10.1016/j.isprsjprs.2024.10.027
Dansheng Yao, Mengqi Zhu, Hehua Zhu, Wuqiang Cai, Long Zhou
Accurate localization of pedestrians and mobile robots is critical for navigation, emergency response, and autonomous driving. Traditional localization methods, such as satellite signals, often prove ineffective in certain environments, and acquiring sufficient positional data can be challenging. Single image localization techniques have been developed to address these issues. However, current deep learning frameworks for single image localization that rely on domain adaptation fail to effectively utilize semantically rich high-level features obtained from large-scale pretraining. This paper introduces a novel framework that leverages the Contrastive Language-Image Pre-training model and prompts to enhance feature extraction and domain adaptation through semantic information. The proposed framework generates an integrated score map from scene-specific prompts to guide feature extraction and employs adversarial components to facilitate domain adaptation. Furthermore, a reslink component is incorporated to mitigate the precision loss in high-level features compared to the original data. Experimental results demonstrate that the use of prompts reduces localization errors by 26.4 % in indoor environments and 24.3 % in outdoor settings. The model achieves localization errors as low as 0.75 m and 8.09 degrees indoors, and 4.56 m and 7.68 degrees outdoors. Analysis of prompts from labeled datasets confirms the model’s capability to effectively interpret scene information. The weights of the integrated score map enhance the model’s transparency, thereby improving interpretability. This study underscores the efficacy of integrating semantic information into image localization tasks.
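To make the prompt-scoring idea concrete, the sketch below rates an image against a few scene-specific prompts with a pretrained CLIP model, assuming the Hugging Face `transformers` interface. The prompts, and how such scores would be turned into the integrated score map described above, are illustrative assumptions rather than the paper's implementation.

```python
# Minimal sketch: score scene prompts against an image with a pretrained CLIP model.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompts = ["a photo of an indoor corridor", "a photo of an outdoor street"]  # hypothetical prompts
image = Image.new("RGB", (224, 224))  # placeholder image; use a real query image in practice
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)
print(dict(zip(prompts, probs[0].tolist())))  # relative relevance of each scene prompt
```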
Citations: 0
Selective weighted least square and piecewise bilinear transformation for accurate satellite DSM generation
IF 10.6 | CAS Tier 1, Earth Science | Q1 (GEOGRAPHY, PHYSICAL) | Pub Date: 2024-11-06 | DOI: 10.1016/j.isprsjprs.2024.11.001
Nazila Mohammadi, Amin Sedaghat
One of the main products of multi-view stereo (MVS) high-resolution satellite (HRS) images in photogrammetry and remote sensing is the digital surface model (DSM). Producing DSMs from MVS HRS images still faces serious challenges for various reasons, such as the complexity of the imaging geometry and exterior orientation model in HRS, as well as the large image dimensions and various geometric and illumination variations. The main motivation for conducting this research is to provide a novel and efficient method that enhances the accuracy and completeness of DSM extraction from HRS images compared to existing recent methods. The proposed method, called Sat-DSM, consists of five main stages. Initially, a very dense set of tie-points is extracted from the images using a tile-based matching method, phase congruency-based feature detectors and descriptors, and a local geometric consistency correspondence method. Then, Rational Polynomial Coefficients (RPC) block adjustment is performed to compensate for the RPC bias errors. After that, a dense matching process is performed to generate 3D point clouds for each pair of input HRS images using a new geometric transformation called PWB (piecewise bilinear) and an accurate area-based matching method called SWLSM (selective weighted least square matching). The key innovations of this research include the introduction of the SWLSM and PWB methods for an accurate dense matching process. The PWB is a novel and simple piecewise geometric transformation model based on superpixel over-segmentation, proposed for accurate registration of each pair of HRS images. The SWLSM matching method is based on a phase congruency measure and a selection strategy to improve the well-known LSM (least square matching) performance. After the dense matching process, the final stage is spatial intersection to generate 3D point clouds, followed by elevation interpolation to produce the DSM. To evaluate the Sat-DSM method, 12 sets of MVS-HRS data from IRS-P5, ZY3-1, ZY3-2, and Worldview-3 sensors were selected from areas with different landscapes, such as urban, mountainous, and agricultural areas. The results indicate the superiority of the proposed Sat-DSM method over four other methods (CATALYST, SGM (semi-global matching), SS-DSM (structural similarity based DSM extraction), and Sat-MVSF) in terms of completeness, RMSE, and MEE. The demo code is available at https://www.researchgate.net/publication/377721674_SatDSM.
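The piecewise bilinear (PWB) idea can be sketched as a per-superpixel least-squares fit of a bilinear mapping to the tie-points falling inside that superpixel. The sketch below shows the fit for a single superpixel with hypothetical names; it is not the authors' Sat-DSM code.

```python
# Minimal sketch: fit x' = a0 + a1*x + a2*y + a3*x*y (and likewise for y')
# to tie-point correspondences by least squares.
import numpy as np

def fit_bilinear(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """src, dst: (N, 2) tie-point coordinates in the two images; returns (2, 4) coefficients."""
    x, y = src[:, 0], src[:, 1]
    A = np.stack([np.ones_like(x), x, y, x * y], axis=1)  # design matrix
    coeffs, *_ = np.linalg.lstsq(A, dst, rcond=None)      # solves both target columns at once
    return coeffs.T

# Toy check: a pure shift of (+2, -1) pixels is recovered exactly.
src = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [5, 5]], dtype=float)
dst = src + np.array([2.0, -1.0])
print(np.round(fit_bilinear(src, dst), 6))
```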
Citations: 0