Current digital elevation model super-resolution (DEM SR) methods are unstable in regions with significant spatial heterogeneity. To address this issue, this study proposes a regional DEM SR method based on an ensemble learning strategy (ELSR). Specifically, we first classified geographical regions into 10 zones based on their terrestrial geomorphologic conditions to reduce spatial heterogeneity; we then integrated global terrain features with local geographical zoning for terrain modeling; finally, drawing on ensemble learning theory, we combined the strengths of different networks to improve the stability of the generated results. The approach was tested over 46,242 km² of Sichuan, China. The total accuracy of the regional DEM (stage 3) improved by 2.791 % compared with that of the super-resolution convolutional neural network (SRCNN); the accuracy of the geographical zoning strategy results (stage 2) increased by 1.966 %, and that of the baseline network results (stage 1) by 0.950 %. The stage-over-stage improvements were 110.105 % (stage 2 over stage 1) and 41.963 % (stage 3 over stage 2). Additionally, accuracy improved by at least 2.000 % for each of the 10 terrestrial geomorphologic classes. In summary, the proposed strategy is effective for improving regional DEM resolution, with the relative accuracy gain related to terrain relief. This study integrates geographical zoning and ensemble learning to generate a stable, high-resolution regional DEM.
{"title":"An ensemble learning framework for generating high-resolution regional DEMs considering geographical zoning","authors":"Xiaoyi Han, Chen Zhou, Saisai Sun, Chiying Lyu, Mingzhu Gao, Xiangyuan He","doi":"10.1016/j.isprsjprs.2025.02.007","DOIUrl":"10.1016/j.isprsjprs.2025.02.007","url":null,"abstract":"<div><div>The current digital elevation model super-resolution (DEM SR) methods are unstable in regions with significant spatial heterogeneity. To address this issue, this study proposes a regional DEM SR method based on an ensemble learning strategy (ELSR). Specifically, we first classified geographical regions into 10 zones based on their terrestrial geomorphologic conditions to reduce spatial heterogeneity; we then integrated the global terrain features with local geographical zoning for terrain modeling; finally, based on ensemble learning theory, we integrated the advantages of different networks to improve the stability of the generated results. The approach was tested for 46,242 km<sup>2</sup> in Sichuan, China. The total accuracy of the regional DEM (stage 3) improved by 2.791 % compared with that of the super-resolution convolutional neural network (SRCNN); the accuracy of the geographical zoning strategy results (stage 2) increased by 1.966 %, and that of the baseline network results (stage 1) increased by 0.950 %. Specifically, the improvement in each stage compared with the previous stage was 110.105 % (in stage 2) and 41.963 % (in stage 3). Additionally, the accuracy of the 10 terrestrial geomorphologic classes improved by at least 2.000 %. In summary, the strategy proposed herein is effective for improving regional DEM resolution, with an improvement in relative accuracy related to terrain relief. This study creatively integrated geographical zoning and ensemble learning ideas to generate a stable, high-resolution regional DEM.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"221 ","pages":"Pages 363-383"},"PeriodicalIF":10.6,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143455146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hyperspectral reflectance and thermal infrared emittance imagery acquired from unmanned aerial vehicles (UAVs) are widely used to determine plant status. However, they are of limited use for distinguishing crops subjected to combined environmental stresses such as nitrogen and water deficiencies. Studying combined stresses requires a multimodal analysis that integrates remotely sensed information from multiple sensors. This research identified the combined nitrogen and water status of field-grown sesame plants by exploiting the potential of a multimodal remotely sensed dataset. Sesame (Sesamum indicum L.; an indeterminate crop) was grown under three nitrogen regimes (low, medium, and high) combined with two irrigation treatments (well-watered and water-limited). After removing the high-nitrogen sesame plots because of adverse effects on crop development, the effects of the combined treatments were analyzed using a remotely acquired dataset: UAV-borne canopy hyperspectral imagery at 400–1020 nm, red–green–blue (RGB) and thermal infrared imagery, and contact full-range hyperspectral reflectance (400–2350 nm) of the youngest fully developed leaves during the growing season. Selected leaf traits (leaf nitrogen content, chlorophyll a and b, leaf mass per area, leaf water content, and leaf area index) were measured on the ground and estimated from the UAV-borne hyperspectral dataset using genetic-algorithm-inspired partial least squares regression models (R² ranging from 0.5 to 0.9). The estimated trait maps classified the sesame plots by combined treatment with only 40–55 % accuracy, indicating the limitation of this approach. The reduced separability among the combined treatments was resolved by a multimodal convolutional neural network classification approach integrating UAV-borne hyperspectral, RGB, and normalized thermal infrared imagery, which raised the accuracy to 65–90 %. The ability to remotely distinguish combined nitrogen and irrigation treatments was thus demonstrated for field-grown sesame, building on the available ground truth data, the combined treatments, and the developed ensembled multimodal timeline modeling approach.
{"title":"Multimodal ensemble of UAV-borne hyperspectral, thermal, and RGB imagery to identify combined nitrogen and water deficiencies in field-grown sesame","authors":"Maitreya Mohan Sahoo , Rom Tarshish , Yaniv Tubul , Idan Sabag , Yaron Gadri , Gota Morota , Zvi Peleg , Victor Alchanatis , Ittai Herrmann","doi":"10.1016/j.isprsjprs.2025.02.011","DOIUrl":"10.1016/j.isprsjprs.2025.02.011","url":null,"abstract":"<div><div>Hyperspectral reflectance as well as thermal infrared emittance unmanned aerial vehicle (UAV)-borne imagery are widely used for determining plant status. However, they have certain limitations to distinguish crops subjected to combined environmental stresses such as nitrogen and water deficiencies. Studies on combined stresses would require a multimodal analysis integrating remotely sensed information from a multitude of sensors. This research identified field-grown sesame plants’ combined nitrogen and water status when subjected to these treatment combinations by exploiting the potential of multimodal remotely sensed dataset. Sesame (<em>Sesamum indicum</em> L.; indeterminate crop) was grown under three nitrogen regimes: low, medium, and high, combined with two irrigation treatments: well-watered and water limited. With the removal of high nitrogen treated sesame plots due to adverse effects on crop development, the effects of combined treatments were analyzed using remotely acquired dataset- UAV-borne sesame canopy hyperspectral at 400 – 1020 nm, red–green–blue, thermal infrared imagery, and contact full range hyperspectral reflectance (400 – 2350 nm) of youngest fully developed leaves in the growing season. Selected leaf traits- leaf nitrogen content, chlorophyll <em>a</em> and b, leaf mass per area, leaf water content, and leaf area index were measured on ground and estimated from UAV-borne hyperspectral dataset using genetic algorithm inspired partial least squares regression models (R<sup>2</sup> ranging from 0.5 to 0.9). These estimated trait maps were used to classify the sesame plots for combined treatments with a 40 – 55 % accuracy, indicating its limitation. The reduced separability among the combined treatments was resolved by implementing a multimodal convolutional neural network classification approach integrating UAV-borne hyperspectral, RGB, and normalized thermal infrared imagery that enhanced the accuracy to 65 – 90 %. The ability to remotely distinguish between combined nitrogen and irrigation treatments was demonstrated for field-grown sesame based on the availability of ground truth data, combined treatments, and the developed ensembled multimodal timeline modeling approach.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"222 ","pages":"Pages 33-53"},"PeriodicalIF":10.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143454201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-20 | DOI: 10.1016/j.isprsjprs.2025.02.002
Vahid Nasiri, Paweł Hawryło, Piotr Tompalski, Bogdan Wertz, Jarosław Socha
Tree ring width (TRW) is crucial for assessing biomass increments, carbon uptake, forest productivity, and forest health. Due to the limitations involved in measuring TRW, utilizing canopy attributes based on vegetation indices (VIs) offers a promising alternative. This study investigated the species-specific relationship between the VIs derived from the Sentinel optical (Sentinel-2) and SAR (Sentinel-1) time series and TRW. For each of the seven dominant Central European tree species, we aimed to identify the most suitable VI that shows the strongest relationship with the interannual variation in TRW. We also developed species-specific models using the random forest (RF) approach and a variety of VIs to predict TRW. Additionally, the impact of detrending TRW on its correlation with VIs and on the accuracy of TRW modeling was assessed. The results showed that the VIs that had the strongest correlation with TRW differed among the analyzed tree species. The results confirmed our hypothesis that the use of novel VIs, such as the green normalized difference vegetation index (GNDVI), or red-edge-based VIs can increase our ability to detect growth-related canopy attributes. Among all the models constructed based on raw and detrended TRWs, 12–39 % of the annual variance in TRW was explained by the integrated optical and SAR-based features. Comparing the raw and detrended TRWs indicated that detrending is necessary for certain species, even in short-term studies (i.e., less than 6 years). We concluded that Sentinel-based VIs can be used to improve the understanding of species-specific variation in forest growth over large areas. These results are useful for modeling and upscaling forest growth, as well as for assessing the effect of extreme climate events, such as droughts, on forest productivity.
{"title":"Linking remotely sensed growth-related canopy attributes to interannual tree-ring width variations: A species-specific study using Sentinel optical and SAR time series","authors":"Vahid Nasiri , Paweł Hawryło , Piotr Tompalski , Bogdan Wertz , Jarosław Socha","doi":"10.1016/j.isprsjprs.2025.02.002","DOIUrl":"10.1016/j.isprsjprs.2025.02.002","url":null,"abstract":"<div><div>Tree ring width (TRW) is crucial for assessing biomass increments, carbon uptake, forest productivity, and forest health. Due to the limitations involved in measuring TRW, utilizing canopy attributes based on vegetation indices (VIs) offers a promising alternative. This study investigated the species-specific relationship between the VIs derived from the Sentinel optical (Sentinel-2) and SAR (Sentinel-1) time series and TRW. For each of the seven dominant Central European tree species, we aimed to identify the most suitable VI that shows the strongest relationship with the interannual variation in TRW. We also developed species-specific models using the random forest (RF) approach and a variety of VIs to predict TRW. Additionally, the impact of detrending TRW on its correlation with VIs and on the accuracy of TRW modeling was assessed. The results showed that the VIs that had the strongest correlation with TRW differed among the analyzed tree species. The results confirmed our hypothesis that the use of novel VIs, such as the green normalized difference vegetation index (GNDVI), or red-edge-based VIs can increase our ability to detect growth-related canopy attributes. Among all the models constructed based on raw and detrended TRWs, 12–39 % of the annual variance in TRW was explained by the integrated optical and SAR-based features. Comparing the raw and detrended TRWs indicated that detrending is necessary for certain species, even in short-term studies (i.e., less than 6 years). We concluded that Sentinel-based VIs can be used to improve the understanding of species-specific variation in forest growth over large areas. These results are useful for modeling and upscaling forest growth, as well as for assessing the effect of extreme climate events, such as droughts, on forest productivity.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"221 ","pages":"Pages 347-362"},"PeriodicalIF":10.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143455065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-20 | DOI: 10.1016/j.isprsjprs.2025.02.005
Jianhao Miao, Shuang Li, Xuechen Bai, Wenxia Gan, Jianwei Wu, Xinghua Li
Radiometric normalization (RN), also known as relative radiometric correction, is commonly applied to multi-temporal optical remote sensing image pairs. It is crucial to applications such as change detection (CD) and other time-series analyses. Nevertheless, the variations across multi-temporal remote sensing image pairs are complex, containing both true land-cover changes and fake changes caused by observation conditions, which poses significant difficulties for CD and other applications. For CD, the goal of RN is to eliminate the unwanted fake changes. However, neither traditional methods nor current deep learning methods offer a satisfactory solution for RN of multi-temporal remote sensing images under such complicated circumstances. To this end, a novel pseudo-invariant feature (PIF)-inspired weakly supervised generative adversarial network (GAN) for remote sensing image RN, named RS-NormGAN, is proposed to improve CD efficiently. Motivated by PIFs, a sub-generator structure with different constraints is introduced to handle variant and invariant features separately. In addition, a global–local attention mechanism is proposed to further refine performance by compensating for spatial distortion and alleviating over- and under-normalization. To verify the effectiveness of RS-NormGAN, extensive experiments on CD and semantic CD across diverse scenarios were conducted on the Google Earth Bi-temporal Dataset and a newly constructed benchmark, the Sentinel-2 Hefei Change Detection Dataset. Compared with state-of-the-art methods, the proposed RS-NormGAN is highly competitive, even when a simple CD network is used. The data and code will be available at https://github.com/lixinghua5540/RS-NormGAN.
{"title":"RS-NormGAN: Enhancing change detection of multi-temporal optical remote sensing images through effective radiometric normalization","authors":"Jianhao Miao , Shuang Li , Xuechen Bai , Wenxia Gan , Jianwei Wu , Xinghua Li","doi":"10.1016/j.isprsjprs.2025.02.005","DOIUrl":"10.1016/j.isprsjprs.2025.02.005","url":null,"abstract":"<div><div>Radiometric normalization (RN), also known as relative radiometric correction, is usually utilized for multi-temporal optical remote sensing image pairs. It is crucial to applications including change detection (CD) and other time-series analyses. Nevertheless, the variations across multi-temporal remote sensing image pairs are complex, containing true changes of landcover and fake changes caused by observation conditions, which poses significant difficulties for CD and other applications. For CD, the goal of RN is to well eliminate the unwanted fake changes. However, neither traditional methods nor current deep learning methods offer satisfactory solution for multi-temporal remote sensing images RN when dealing with such complicated circumstances. Towards this end, a novel pseudo invariant feature (PIF)-inspired weakly supervised generative adversarial network (GAN) for remote sensing images RN, named RS-NormGAN, is proposed to improve CD efficiently. Motivated by PIF, a sub-generator structure with different constraints is introduced to adequately deal with variant and invariant features, respectively. Besides, a global–local attention mechanism is proposed to further refine the performance by compensating spatial distortion and alleviating over-normalization and under-normalization. To verify the effectiveness of RS-NormGAN, massive experiments for CD and semantic CD across diverse scenarios have been conducted on Google Earth Bi-temporal Dataset and a constructed benchmark Sentinel-2 Hefei Change Detection Dataset. Compared with state-of-the-art methods, the proposed RS-NormGAN is very competitive, even if a simple CD network is utilized. The data and code will be available at <span><span>https://gitbub.com/lixinghua5540/RS-NormGAN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"221 ","pages":"Pages 324-346"},"PeriodicalIF":10.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143445732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-20 | DOI: 10.1016/j.isprsjprs.2025.02.003
Leilei Jiao, Peng Luo, Rong Huang, Yusheng Xu, Zhen Ye, Sicong Liu, Shijie Liu, Xiaohua Tong
Identifying minerals on Mars is crucial for finding evidence of water on the planet. Currently, spectral inversion methods based on remote sensing data are primarily used; however, they provide only sparse and scattered maps of mineral exposures. To address this limitation, we propose a multi-scale spatial association modeling framework (MSAM) that exploits spatial dependence to couple the geographical distribution of Martian hydrous minerals with environmental factors, achieving dense and continuous mapping of hydrous minerals. Our approach leverages explanatory variables (such as elevation, slope, and aspect) to establish spatial associations with potential hydrous mineral areas, selected via multiscale search ranges around existing hydrous mineral exposures. These associations are used to identify potential hydrous mineral locations and to estimate an occurrence probability for each candidate point. High-probability points are then combined with known exposures, and kriging interpolation is applied to produce a continuous surface map. Finally, the interpolation results are evaluated against geomorphological maps, along with correlation analysis. The proposed MSAM enhances prediction accuracy and addresses the incomplete detection and undetected areas inherent in remote sensing-based spectral inversion. Results reveal that incorporating environmental factors reduces the RMSE by 25% and improves spatial correlation by 30% compared with traditional interpolation techniques. An overlay analysis intersecting the interpolated results with geomorphologic features obtained through semantic segmentation further demonstrates a coupling relationship between hydrous minerals and geomorphologic features within a specific spatial range.
{"title":"Modeling hydrous mineral distribution on Mars with extremely sparse data: A multi-scale spatial association modeling framework","authors":"Leilei Jiao , Peng Luo , Rong Huang , Yusheng Xu , Zhen Ye , Sicong Liu , Shijie Liu , Xiaohua Tong","doi":"10.1016/j.isprsjprs.2025.02.003","DOIUrl":"10.1016/j.isprsjprs.2025.02.003","url":null,"abstract":"<div><div>Identifying minerals on Mars is crucial for finding evidence of water on the planet. Currently, spectral inversion methods based on remote sensing data are primarily used; however, they only provide sparse and scattered maps of mineral exposures. To address this limitation, we propose a multi-scale spatial association modeling framework (MSAM) that couples the geographical distribution of Martian hydrous minerals with environmental factors based on the existence of spatial dependence, to achieve dense and continuous mapping of hydrous minerals. Our approach leverages explanatory variables – such as elevation, slope, and aspect – to establish spatial associations with potential areas of hydrous minerals, selected via multiscale search ranges from existing hydrous mineral exposures. These association results are used to identify potential hydrous mineral locations and estimate probabilities for potential hydrous mineral points. High-probability points are then combined with known exposures, and Kriging interpolation is applied to produce a continuous surface map. Finally, the interpolation results are evaluated using geomorphological maps, along with correlation analysis. The proposed MSAM enhances prediction accuracy and addresses the challenges of incomplete detection and undetected areas inherent in remote sensing-based spectral inversion. Results reveal that incorporating environmental factors reduces the RMSE by 25% and improves spatial correlation by 30% compared to traditional interpolation techniques. An overlay analysis intersecting the interpolated results with geomorphologic features obtained through semantic segmentation further demonstrates a coupling relationship between hydrous minerals and geomorphologic features within a specific spatial range.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"222 ","pages":"Pages 16-32"},"PeriodicalIF":10.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143455080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-19 | DOI: 10.1016/j.isprsjprs.2025.01.027
Qishan He, Lingjun Zhao, Kefeng Ji, Li Liu, Gangyao Kuang
Synthetic Aperture Radar (SAR) image characteristics are highly susceptible to variations in the radar operation condition. Meanwhile, acquiring large amounts of SAR data under various imaging conditions remains a challenge in real application scenarios. This sensitivity and scarcity hinder robust feature representation learning in recent data-hungry deep learning-based SAR Automatic Target Recognition (ATR) approaches. Because physics-based electromagnetic simulation can reproduce the differences in image characteristics under various imaging conditions, we propose a simulation-aided domain adaptation technique that improves generalization without extra measured SAR data. Specifically, we first build a surrogate feature alignment task using only simulated data, based on a domain adaptation network. To mitigate the distribution shift between simulated and real data, we propose a category-level weighting mechanism based on SAR-SIFT similarity. This approach enhances surrogate feature alignment by re-weighting the features of simulated samples in a category-level manner according to their similarity to the measured data. In addition, a meta-adaption optimization is designed to further reduce sensitivity to operation condition variation. We treat the recognition of targets in simulated data across imaging conditions as individual meta-tasks and adopt the multi-gradient descent algorithm to adapt the features to different operation condition domains. We conduct experiments on two military vehicle datasets, MSTAR and SAMPLE-M, with the aid of a simulated civilian vehicle dataset, SarSIM. The proposed method achieves state-of-the-art performance under extended operation conditions, with 88.58% and 86.15% accuracy for variations in depression angle and resolution, respectively, outperforming our previous simulation-aided domain adaptation work, TDDA. The code is available at https://github.com/ShShann/SA2FA-MAO.
{"title":"Simulation-aided similarity-aware feature alignment with meta-adaption optimization for SAR ATR under extended operation conditions","authors":"Qishan He, Lingjun Zhao, Kefeng Ji, Li Liu, Gangyao Kuang","doi":"10.1016/j.isprsjprs.2025.01.027","DOIUrl":"10.1016/j.isprsjprs.2025.01.027","url":null,"abstract":"<div><div>Synthetic Aperture Radar (SAR) image characteristics are highly susceptible to variations in the radar operation condition. Meanwhile, acquiring large amounts of SAR data under various imaging conditions is still a challenge in real application scenarios. Such sensitivity and scarcity bring an inadequately robust feature representation learning to recent data-hungry deep learning-based SAR Automatic Target Recognition (ATR) approaches. Considering the fact that physics-based electromagnetic simulated images could reproduce the image characteristics difference under various imaging conditions, we propose a simulation-aided domain adaptation technique to improve the generalization ability without extra measured SAR data. To be specific, We first build a surrogate feature alignment task using only simulated data based on a domain adaptation network. To mitigate the distribution shift problem between simulated and real data, we propose a category-level weighting mechanism based on SAR-SIFT similarity. This approach enhances surrogate feature alignment ability by re-weighting the simulated samples’ features in a category-level manner according to their similarities to the measured data. In addition, a meta-adaption optimization is designed to further reduce the sensitivity to the operation condition variation. We consider the recognition of the targets in simulated data across imaging conditions as an individual meta-task and adopt the multi-gradient descent algorithm to adapt the feature to different operation condition domains. We conduct experiments on two military vehicle datasets, MSTAR and SAMPLE-M with the aid of a simulated civilian vehicle dataset, SarSIM. The proposed method achieves state-of-the-art performance in extended operation conditions with 88.58% and 86.15% accuracy for variations in depression angle and resolution, outperforming our previous simulation-aided domain adaptation work TDDA. The code is available at <span><span>https://github.com/ShShann/SA2FA-MAO</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"222 ","pages":"Pages 1-15"},"PeriodicalIF":10.6,"publicationDate":"2025-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143438093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate state estimation is crucial for autonomous navigation in unmanned systems. While traditional visual and lidar systems struggle in adverse conditions such as rain, fog, or smoke, millimeter-wave radar provides robust all-weather localization and mapping capabilities. However, sparse and noisy radar point clouds often compromise localization accuracy and lead to inherent odometry drift. This paper presents GV-iRIOM, a novel millimeter-wave radar localization and mapping system built on a two-layer estimation framework that simultaneously integrates visual, inertial, and GNSS data to improve localization accuracy. The system employs radar inertial odometry and visual inertial odometry as the SLAM front-end. To address the varying observation accuracy of 3-axis motion at different azimuth and elevation angles in 4D radar data, we propose an angle-adaptive weighted robust estimation method for radar ego-velocity. Furthermore, we developed a back-end for multi-source information fusion that integrates odometry pose constraints, GNSS observations, and loop closure constraints to ensure globally consistent positioning and mapping. By dynamically initializing GNSS measurements through observability analysis, our system automatically achieves positioning and mapping in an absolute geographic coordinate frame, and facilitates multi-phase map fusion and multi-robot positioning. Experiments conducted on both in-house data and publicly available datasets validate the system's robustness and effectiveness. In large-scale scenarios, the absolute localization accuracy is improved by more than 50%, ensuring globally consistent mapping across a variety of challenging environments.
{"title":"GV-iRIOM: GNSS-visual-aided 4D radar inertial odometry and mapping in large-scale environments","authors":"Binliang Wang , Yuan Zhuang , Jianzhu Huai , Yiwen Chen , Jiagang Chen , Nashwa El-Bendary","doi":"10.1016/j.isprsjprs.2025.01.039","DOIUrl":"10.1016/j.isprsjprs.2025.01.039","url":null,"abstract":"<div><div>Accurate state estimation is crucial for autonomous navigation in unmanned systems. While traditional visual and lidar systems struggle in adverse conditions such as rain, fog, or smoke, millimeter-wave radar provides robust all-weather localization and mapping capabilities. However, sparse and noisy radar point clouds often compromise localization accuracy and lead to odometry immanent drift. This paper presents GV-iRIOM, a novel millimeter-wave radar localization and mapping system that utilizes a two layer estimation framework, which simultaneously integrates visual, inertial, and GNSS data to improve localization accuracy. The system employs radar inertial odometry and visual inertial odometry as the SLAM front-end. Addressing the varying observation accuracy of 3-axis motion for different azimuth/vertical angles in 4D radar data, we propose an angle-adaptive weighted robust estimation method for radar ego-velocity estimation. Furthermore, we developed a back-end for multi-source information fusion, integrating odometry pose constraints, GNSS observations, and loop closure constraints to ensure globally consistent positioning and mapping. By dynamically initializing GNSS measurements through observability analysis, our system automatically achieves positioning and mapping based on an absolute geographic coordinate framework, and facilitates multi-phase map fusion and multi-robot positioning. Experiments conducted on both in-house data and publicly available datasets validate the system’s robustness and effectiveness. In large-scale scenarios, the absolute localization accuracy is improved by more than 50%, ensuring globally consistent mapping across a variety of challenging environments.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"221 ","pages":"Pages 310-323"},"PeriodicalIF":10.6,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143429706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-17 | DOI: 10.1016/j.isprsjprs.2025.02.001
Wei Zhang, Xinyu Zhang, Junyu Dong, Xiaojiang Song, Renbo Pang
Addressing data gaps in meteorological radar scan regions remains a significant challenge. Existing radar data recovery methods tend to perform poorly across different types of missing-data scenarios, often due to over-smoothing, and the actual scenes represented by radar data are complex and diverse, making missing data difficult to simulate. Recent developments in generative models offer new solutions for missing data in complex scenarios. Here, we propose a comprehensive inpainting diffusion model (CIDM) for weather radar data that improves the sampling approach of the original diffusion model. The method utilises prior knowledge from known regions to guide the generation of missing information. The CIDM formalises domain knowledge into a generative model, treating weather radar completion as a generative task and eliminating the need for complex data preprocessing. During inference, prior knowledge from known regions guides the process and is combined with the domain knowledge learned by the model to generate information for missing regions, thereby supporting radar data recovery under arbitrary missing-data patterns. Experiments were conducted on various missing-data scenarios using Multi-Radar/Multi-Sensor System data from the National Oceanic and Atmospheric Administration, and the results were compared with those of traditional and deep learning radar restoration methods. The CIDM demonstrated superior recovery performance across the tested scenarios, particularly those with extreme amounts of missing data, for which restoration accuracy improved by 5%–35%. These results indicate the significant potential of the CIDM for quantitative applications and showcase the capability of generative models to create fine-grained data for remote sensing applications.
{"title":"CIDM: A comprehensive inpainting diffusion model for missing weather radar data with knowledge guidance","authors":"Wei Zhang , Xinyu Zhang , Junyu Dong , Xiaojiang Song , Renbo Pang","doi":"10.1016/j.isprsjprs.2025.02.001","DOIUrl":"10.1016/j.isprsjprs.2025.02.001","url":null,"abstract":"<div><div>Addressing data gaps in meteorological radar scan regions remains a significant challenge. Existing radar data recovery methods tend to perform poorly under different types of missing data scenarios, often due to over-smoothing. The actual scenarios represented by radar data are complex and diverse, making it difficult to simulate missing data. Recent developments in generative models have yielded new solutions for the problem of missing data in complex scenarios. Here, we propose a comprehensive inpainting diffusion model (CIDM) for weather radar data, which improves the sampling approach of the original diffusion model. This method utilises prior knowledge from known regions to guide the generation of missing information. The CIDM formalises domain knowledge into generative models, treating the problem of weather radar completion as a generative task, eliminating the need for complex data preprocessing. During the inference phase, prior knowledge of known regions guides the process and incorporates domain knowledge learned by the model to generate information for missing regions, thus supporting radar data recovery in scenarios with arbitrary missing data. Experiments were conducted on various missing data scenarios using Multi-Radar/MultiSensor System data sourced from the National Oceanic and Atmospheric Administration, and the results were compared with those of traditional and deep learning radar restoration methods. Compared with these methods, the CIDM demonstrated superior recovery performance for various missing data scenarios, particularly those with extreme amounts of missing data, in which the restoration accuracy was improved by 5%–35%. These results indicate the significant potential of the CIDM for quantitative applications. The proposed method showcases the capability of generative models in creating fine-grained data for remote sensing applications.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"221 ","pages":"Pages 299-309"},"PeriodicalIF":10.6,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143420448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-16 | DOI: 10.1016/j.isprsjprs.2025.02.008
Tao Zhou, Guoqing Zhang, Jida Wang, Zhe Zhu, R. Iestyn Woolway, Xiaoran Han, Fenglin Xu, Jun Peng
Accurate, consistent, and long-term monitoring of global lake dynamics is essential for understanding the impacts of climate change and human activities on water resources and ecosystems. However, existing methods often require extensive manually collected training samples and expert knowledge to delineate accurate water extents for various lake types under different environmental conditions, limiting their applicability in data-poor regions and in scenarios requiring rapid mapping responses (e.g., lake outburst floods) or frequent monitoring (e.g., highly dynamic reservoir operations). This study presents a novel remote sensing framework for automated global lake mapping from optical imagery, combining single-date and time-series algorithms to address these challenges. The single-date algorithm leverages a multi-objects superposition approach to automatically generate high-quality training samples, enabling robust machine learning-based lake boundary delineation with minimal manual intervention. This approach overcomes the challenge of obtaining representative training samples across diverse environmental contexts and adapts flexibly to the images to be classified. Building on this, the time-series algorithm incorporates dynamic mapping area adjustment, robust cloud and snow filtering, and time-series analysis, maximizing the use of available clear imagery (>80 %) and optimizing the temporal frequency and spatial accuracy of the produced lake area time series. The framework's effectiveness is validated with Landsat imagery on globally representative and locally focused test datasets. The automatically generated training samples achieve commission and omission rates of ∼1 % relative to manually collected samples. The resulting single-date lake maps demonstrate an overall accuracy exceeding 96 % and a mean percentage error below 4 % relative to manually delineated lake areas. Additionally, the proposed framework improves the mapping of smaller and fractionally ice-covered lakes over existing lake products. The mapped lake time series are consistent with reconstructed products over the long term, while effectively avoiding spurious short-term changes due to data source and processing uncertainties. This robust, automated framework is valuable for generating accurate, large-scale, and temporally dynamic lake maps to support global lake inventories and monitoring. The framework's modular design also allows future adaptation to other optical sensors, such as Sentinel-2 and Moderate Resolution Imaging Spectroradiometer (MODIS) imagery, facilitating multi-source data fusion and enhanced surface water mapping.
{"title":"A novel framework for accurate, automated and dynamic global lake mapping based on optical imagery","authors":"Tao Zhou , Guoqing Zhang , Jida Wang , Zhe Zhu , R.Iestyn Woolway , Xiaoran Han , Fenglin Xu , Jun Peng","doi":"10.1016/j.isprsjprs.2025.02.008","DOIUrl":"10.1016/j.isprsjprs.2025.02.008","url":null,"abstract":"<div><div>Accurate, consistent, and long-term monitoring of global lake dynamics is essential for understanding the impacts of climate change and human activities on water resources and ecosystems. However, existing methods often require extensive manually collected training data and expert knowledge to delineate accurate water extents of various lake types under different environmental conditions, limiting their applicability in data-poor regions and scenarios requiring rapid mapping responses (e.g., lake outburst floods) and frequent monitoring (e.g., highly dynamic reservoir operations). This study presents a novel remote sensing framework for automated global lake mapping using optical imagery, combining single-date and time-series algorithms to address these challenges. The single-date algorithm leverages a multi-objects superposition approach to automatically generate high-quality training sample, enabling robust machine learning-based lake boundary delineation with minimal manual intervention. This innovative approach overcomes the challenge of obtaining representative training sample across diverse environmental contexts and flexibly adapts to the images to be classified. Building upon this, the time-series algorithm incorporates dynamic mapping area adjustment, robust cloud and snow filtering, and time-series analysis, maximizing available clear imagery (>80 %) and optimizing the temporal frequency and spatial accuracy of the produced lake area time series. The framework’s effectiveness is validated by Landsat imagery using globally representative and locally focused test datasets. The automatically generated training sample achieves commission and omission rates of ∼1 % compared to manually collected sample. The resulting single-date lake mapping demonstrates overall accuracy exceeding 96 % and a Mean Percentage Error of <4 % relative to manually delineated lake areas. Additionally, the proposed framework shows improvement in mapping smaller and fractional ice-covered lakes over existing lake products. The mapped lake time series are consistent with the reconstructed products over the long term, while effectively avoiding spurious changes due to data source and processing uncertainties in the short term. This robust, automated framework is valuable for generating accurate, large-scale, and temporally dynamic lake maps to support global lake inventories and monitoring. 
The framework’s modular design also allows for future adaptation to other optical sensors such as Sentinel-2 and Moderate Resolution Imaging Spectroradiometer (MODIS) imagery, facilitating multi-source data fusion and enhanced surface water mapping capabilities.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"221 ","pages":"Pages 280-298"},"PeriodicalIF":10.6,"publicationDate":"2025-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143418597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
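The automatic training sample generation can be approximated with a water index and a scene-adaptive threshold; the NDWI/Otsu/margin rule below is a simple stand-in for the paper's multi-objects superposition approach.

```python
import numpy as np
from skimage.filters import threshold_otsu
from sklearn.ensemble import RandomForestClassifier

def auto_water_samples(green, nir, margin=0.1):
    """Automatically label confident water / non-water training pixels.

    NDWI = (green - nir) / (green + nir); Otsu picks a scene-adaptive split,
    and only pixels far from the threshold are kept as high-quality samples.
    """
    ndwi = (green - nir) / (green + nir + 1e-9)
    t = threshold_otsu(ndwi)
    water = ndwi > t + margin           # confidently wet
    land = ndwi < t - margin            # confidently dry
    return ndwi, water, land

rng = np.random.default_rng(4)
green = rng.random((128, 128))
nir = rng.random((128, 128))
ndwi, water, land = auto_water_samples(green, nir)

# train on the auto-generated samples; NDWI is the single feature here,
# whereas in practice all spectral bands would be stacked
X = np.r_[ndwi[water], ndwi[land]].reshape(-1, 1)
y = np.r_[np.ones(water.sum()), np.zeros(land.sum())]
clf = RandomForestClassifier(n_estimators=200).fit(X, y)
water_map = clf.predict(ndwi.reshape(-1, 1)).reshape(ndwi.shape)
```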
Pub Date: 2025-02-14 | DOI: 10.1016/j.isprsjprs.2025.01.024
Zhen Cao, Xiaoxin Mi, Bo Qiu, Zhipeng Cao, Chen Long, Xinrui Yan, Chao Zheng, Zhen Dong, Bisheng Yang
3D street scene semantic segmentation is essential for urban understanding. However, supervised point cloud semantic segmentation networks rely heavily on expensive manual annotations and show limited generalization across datasets, which constrains a range of downstream tasks. In contrast, image segmentation networks exhibit stronger generalization. Fortunately, mobile laser scanning systems can collect images and point clouds simultaneously, offering a potential solution for 2D-3D semantic transfer. In this paper, we introduce a cross-modal label transfer framework for point cloud semantic segmentation that requires no 3D semantic annotation. Specifically, the proposed method takes the point clouds and associated posed images of a scene as inputs and produces pointwise semantic segmentation of the point clouds. We first obtain image semantic pseudo-labels from a pre-trained image semantic segmentation model. Building on this, we construct implicit neural radiance fields (NeRF) to achieve multi-view consistent label mapping by jointly building color and semantic fields. We then design a superpoint semantic module to capture local geometric features of the point clouds, which substantially helps correct semantic errors in the implicit field. Moreover, we introduce a dynamic object filter and a pose adjustment module to address the spatio-temporal misalignment between point clouds and images, further enhancing the consistency of the transferred semantic labels. The proposed approach shows promising results on two street scene datasets, KITTI-360 and WHU-Urban3D, highlighting its effectiveness and reliability. Compared with the state-of-the-art point cloud semantic segmentation method SPT, the proposed method improves mIoU by approximately 15% on the WHU-Urban3D dataset. Our code and data are available at https://github.com/a4152684/StreetSeg.
{"title":"Cross-modal semantic transfer for point cloud semantic segmentation","authors":"Zhen Cao , Xiaoxin Mi , Bo Qiu , Zhipeng Cao , Chen Long , Xinrui Yan , Chao Zheng , Zhen Dong , Bisheng Yang","doi":"10.1016/j.isprsjprs.2025.01.024","DOIUrl":"10.1016/j.isprsjprs.2025.01.024","url":null,"abstract":"<div><div>3D street scene semantic segmentation is essential for urban understanding. However, supervised point cloud semantic segmentation networks heavily rely on expensive manual annotations and demonstrate limited generalization capabilities across datasets, which poses limitations in a range of downstream tasks. In contrast, image segmentation networks exhibit stronger generalization. Fortunately, mobile laser scanning systems can collect images and point clouds simultaneously, offering a potential solution for 2D-3D semantic transfer. In this paper, we introduce a cross-modal label transfer framework for point cloud semantic segmentation, without the supervision of 3D semantic annotation. Specifically, the proposed method takes point clouds and the associated posed images of a scene as inputs, and accomplishes the pointwise semantic segmentation for point clouds. We first get the image semantic pseudo-labels through a pre-trained image semantic segmentation model. Building on this, we construct implicit neural radiance fields (NeRF) to achieve multi-view consistent label mapping by jointly constructing color and semantic fields. Then, we design a superpoint semantic module to capture the local geometric features on point clouds, which contributes a lot to correcting semantic errors in the implicit field. Moreover, we introduce a dynamic object filter and a pose adjustment module to address the spatio-temporal misalignment between point clouds and images, further enhancing the consistency of the transferred semantic labels. The proposed approach has shown promising outcomes on two street scene datasets, namely KITTI-360 and WHU-Urban3D, highlighting the effectiveness and reliability of our method. Compared to the SoTA point cloud semantic segmentation method, namely SPT, the proposed method improves mIoU by approximately 15% on the WHU-Urban3D dataset. Our code and data are available at <span><span>https://github.com/a4152684/StreetSeg</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"221 ","pages":"Pages 265-279"},"PeriodicalIF":10.6,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143418598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}