3LATNet: Attention based deep learning model for global Chlorophyll-a retrieval from GCOM-C satellite
Pub Date: 2025-01-10 | DOI: 10.1016/j.isprsjprs.2024.12.019
Muhammad Salah, Salem Ibrahim Salem, Nobuyuki Utsumi, Hiroto Higa, Joji Ishizaka, Kazuo Oki
Chlorophyll-a (Chla) retrieval from satellite observations is crucial for assessing water quality and the health of aquatic ecosystems. Satellite data, while invaluable, pose challenges including inherent sensor biases, the need for precise atmospheric correction (AC), and the optical complexity of water bodies, all of which complicate establishing a reliable relationship between remote sensing reflectance (Rrs) and Chla concentrations. The Global Change Observation Mission - Climate (GCOM-C) satellite, operated by the Japan Aerospace Exploration Agency (JAXA), represents a significant leap forward in ocean color monitoring, featuring a 250 m spatial resolution and a 380 nm band that enhance detection capabilities for aquatic environments. JAXA's standard Chla product, grounded in empirical algorithms, together with the limited research on the impact of AC on Rrs products, underscores the need for further analysis of these factors. This study introduces the three bidirectional Long Short-Term Memory and ATtention mechanism Network (3LATNet) model, trained on a large dataset of 5610 in-situ Rrs measurements and corresponding Chla concentrations collected from locations worldwide to cover a broad range of trophic states. The Rrs spectra were resampled to the bands of the Second-Generation Global Imager (SGLI) aboard GCOM-C. The model was also trained on satellite matchup data, aiming for a generalized deep-learning model. 3LATNet was evaluated against conventional Chla algorithms and machine learning (ML) algorithms, including JAXA's standard Chla product. Our findings reveal a remarkable reduction in Chla estimation error: a 42.5 % reduction in mean absolute error (MAE, from 17 to 9.77 mg/m³) and a 57.3 % reduction in root mean square error (RMSE, from 43.12 to 18.43 mg/m³) compared to JAXA's standard Chla algorithm on in-situ data, and nearly a twofold improvement in absolute errors when evaluated on matchup SGLI Rrs. Furthermore, we conducted an in-depth assessment of the impact of AC on model performance. SeaDAS predominantly exhibited invalid reflectance values at the 412 nm band, while OC-SMART displayed greater variability in percentage errors; in comparison, JAXA's AC proved more precise in retrieving Rrs. We also comprehensively evaluated the spatial consistency of the Chla models under clear-water conditions and harmful algal bloom events. 3LATNet effectively captured Chla patterns across various ranges, whereas the RF algorithm frequently overestimated Chla concentrations in the low to mid-range, and JAXA's Chla algorithm consistently underestimated Chla concentrations, a trend particularly pronounced in the high Chla range.
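The abstract names the building blocks (three bidirectional LSTM layers with an attention mechanism over the Rrs spectrum) but not the exact architecture. The PyTorch sketch below is a minimal illustration under that reading; the layer sizes, band count, additive-attention form, and log-Chla regression head are assumptions, not the published 3LATNet configuration.

```python
# Minimal sketch: three stacked BiLSTM layers over the per-band Rrs
# sequence, attention-weighted pooling, and a scalar regression head.
import torch
import torch.nn as nn

class BiLSTMAttentionNet(nn.Module):
    def __init__(self, n_bands=11, hidden=64):
        super().__init__()
        # Three stacked bidirectional LSTM layers over the spectral sequence.
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, num_layers=3,
                            bidirectional=True, batch_first=True)
        # Additive attention: score each band's hidden state, then pool.
        self.score = nn.Linear(2 * hidden, 1)
        self.head = nn.Linear(2 * hidden, 1)  # e.g. log10(Chla) target

    def forward(self, rrs):                      # rrs: (batch, n_bands)
        h, _ = self.lstm(rrs.unsqueeze(-1))      # (batch, n_bands, 2*hidden)
        w = torch.softmax(self.score(h), dim=1)  # per-band attention weights
        ctx = (w * h).sum(dim=1)                 # attention-weighted pooling
        return self.head(ctx).squeeze(-1)

model = BiLSTMAttentionNet()
print(model(torch.rand(4, 11)).shape)  # torch.Size([4])
```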
{"title":"3LATNet: Attention based deep learning model for global Chlorophyll-a retrieval from GCOM-C satellite","authors":"Muhammad Salah, Salem Ibrahim Salem, Nobuyuki Utsumi, Hiroto Higa, Joji Ishizaka, Kazuo Oki","doi":"10.1016/j.isprsjprs.2024.12.019","DOIUrl":"https://doi.org/10.1016/j.isprsjprs.2024.12.019","url":null,"abstract":"Chlorophyll-a (Chla) retrieval from satellite observations is crucial for assessing water quality and the health of aquatic ecosystems. Utilizing satellite data, while invaluable, poses challenges including inherent satellite biases, the necessity for precise atmospheric correction (AC), and the complexity of water bodies, all of which complicate establishing a reliable relationship between remote sensing reflectance (R<ce:inf loc=\"post\">rs</ce:inf>) and Chla concentrations. Furthermore, the Global Change Observation Mission − Climate (GCOM-C) satellite operated by Japan Aerospace Exploration Agency (JAXA) has brought a significant leap forward in ocean color monitoring, featuring a 250 m spatial resolution and integrating the 380 nm band, enhancing the detection capabilities for aquatic environments. JAXA’s standard Chla product grounded in empirical algorithms, coupled with the limited research on the impact of atmospheric correction (AC) on R<ce:inf loc=\"post\">rs</ce:inf> products, underscores the need for further analysis of these factors. This study introduces the three bidirectional Long short–term memory and ATtention mechanism Network (3LATNet) model that was trained on a large dataset incorporating 5610 in-situ R<ce:inf loc=\"post\">rs</ce:inf> measurements and their corresponding Chla concentrations collected from global locations to cover broad trophic status. The R<ce:inf loc=\"post\">rs</ce:inf> spectra have been resampled to the Second-Generation Global Imager (SGLI) aboard GCOM-C. The model was also trained using satellite matchup data, aiming to achieve a generalized deep-learning model. 3LATNet was evaluated compared to conventional Chla algorithms and ML algorithms, including JAXA’s standard Chla product. Our findings reveal a remarkable reduction in Chla estimation error, marked by a 42.5 % (from 17 to 9.77 mg/m<ce:sup loc=\"post\">3</ce:sup>) reduction in mean absolute error (MAE) and a 57.3 % (from 43.12 to 18.43 mg/m<ce:sup loc=\"post\">3</ce:sup>) reduction in root mean square error (RMSE) compared to JAXA’s standard Chla algorithm using in-situ data, and nearly a twofold improvement in absolute errors when evaluating using matchup SGLI R<ce:inf loc=\"post\">rs</ce:inf>. Furthermore, we conduct an in-depth assessment of the impact of AC on the models’ performance. SeaDAS predominantly exhibited invalid reflectance values at the 412 nm band, while OC-SMART displayed more significant variability in percentage errors. In comparison, JAXA’s AC proved more precise in retrieving R<ce:inf loc=\"post\">rs</ce:inf>. We comprehensively evaluated the spatial consistency of Chla models under clear and harmful algal bloom events. 3LATNet effectively captured Chla patterns across various ranges. Conversely, the RF algorithm frequently overestimates Chla concentrations in the low to mid-range. 
JAXA’s Chla algorithm, on the other hand, consistently tends to underestimate Chla concentrations, a trend that is particularly pronounced in high-range Chla a","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"67 1","pages":""},"PeriodicalIF":12.7,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142967837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A universal method to recognize global big rivers estuarine turbidity maximum from remote sensing
Pub Date: 2025-01-10 | DOI: 10.1016/j.isprsjprs.2025.01.002
Chongyang Wang, Chenghu Zhou, Xia Zhou, Mingjie Duan, Yingwei Yan, Jiaxue Wang, Li Wang, Kai Jia, Yishan Sun, Danni Wang, Yangxiaoyue Liu, Dan Li, Jinyue Chen, Hao Jiang, Shuisen Chen
The study of the estuarine turbidity maximum (ETM) has a long history. However, the algorithms and criteria for ETM identification vary significantly across estuaries and hydrological regimes. Moreover, almost all of these methods depend on derived water parameters, such as suspended sediment concentration and turbidity, which inevitably introduce inherent errors into the ETM results. To overcome these disadvantages and develop a standard ETM recognition method with good applicability in most estuaries, this study analyzed the spectral characteristics of 23 big river estuaries worldwide using Landsat and Sentinel imagery. Based on the difference in band reflectance between the ETM and normal water bodies, we propose a universal index, Red Green Blue Turbidity (RGBT), defined as the product of the ratios of the blue, green, and red bands to their respective average values over the entire estuary. Combined with the corresponding remote sensing images, the ETM distributions in the 23 estuaries were extracted and analyzed. The ETM recognition results for the Pearl River Estuary on different dates (2004, 2015) were consistent with those of previous studies, with validation accuracies (Q) of 0.8335 and 0.8800, respectively, illustrating the effectiveness of the RGBT method there. For the other 22 estuaries, the RGBT-based ETM recognition results were evaluated against visual interpretation. Comparisons and details of the ETM boundaries indicate that the method works well for all types of estuaries; it also accurately identified slightly turbid plumes induced by offshore wind turbines and bridge piers. The validation accuracy exceeded 0.9 (0.9025–0.9733) in seven estuaries and surpassed 0.7898 in the remaining 15. The RGBT method generally achieved higher accuracy for estuaries in Asia and Europe, followed by those in America and Oceania, with relatively lower accuracy in Africa, although the regional variation was small. The average validation accuracy across all estuaries and seasons was as high as 0.9027. This demonstrates that a unified method with the same criterion can directly and effectively recognize ETM distributions from multi-source remote sensing data in different estuaries worldwide.
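The RGBT index follows directly from the definition above: divide each band by its mean over the estuary, then multiply the three ratios. A minimal NumPy sketch, where the array names, masking convention, and toy reflectance values are illustrative assumptions:

```python
# RGBT = product over (blue, green, red) of band / mean(band over estuary).
import numpy as np

def rgbt(blue, green, red, estuary_mask):
    """blue/green/red: 2D reflectance arrays; estuary_mask: bool array."""
    index = np.ones_like(blue, dtype=float)
    for band in (blue, green, red):
        band_mean = band[estuary_mask].mean()   # average over the estuary
        index *= band / band_mean               # ratio to the mean
    return np.where(estuary_mask, index, np.nan)

# Toy example: the turbid pixel (high reflectance in all bands) scores > 1.
b = np.array([[0.02, 0.08], [0.03, 0.09]])
g = np.array([[0.03, 0.10], [0.04, 0.12]])
r = np.array([[0.01, 0.09], [0.02, 0.11]])
mask = np.ones_like(b, dtype=bool)
print(rgbt(b, g, r, mask))
```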
{"title":"A universal method to recognize global big rivers estuarine turbidity maximum from remote sensing","authors":"Chongyang Wang, Chenghu Zhou, Xia Zhou, Mingjie Duan, Yingwei Yan, Jiaxue Wang, Li Wang, Kai Jia, Yishan Sun, Danni Wang, Yangxiaoyue Liu, Dan Li, Jinyue Chen, Hao Jiang, Shuisen Chen","doi":"10.1016/j.isprsjprs.2025.01.002","DOIUrl":"https://doi.org/10.1016/j.isprsjprs.2025.01.002","url":null,"abstract":"The study of estuarine turbidity maximum (ETM) has a long history. However, the algorithms and criteria for ETM identification vary significantly across estuaries and hydrological regimes. Moreover, almost all of these methods depend on derived water parameters, such as suspended sediment concentration and turbidity, which inevitably result in inherent errors in the ETM results. To overcome these disadvantages and develop a standard ETM recognition method that has good applicability in most estuaries, this study analyzed the spectral characteristics of 23 big river estuaries worldwide using Landsat and Sentinel sensor images. Based on the difference in band reflectance between the ETM and normal water bodies, we first proposed a universal method, defined as the product of the ratio of blue, green and red bands to their average value over the entire estuary, namely, Red Green Blue Turbidity (RGBT). Combined with the corresponding remote sensing images, the ETM distributions in the 23 estuaries were extracted and analyzed. It was found that the ETM recognition results for the Pearl River Estuary on different dates (2004, 2015) were consistent with those of previous studies. The validation accuracies (Q) reached 0.8335 and 0.8800, respectively, illustrating the effectiveness of the RGBT method in the Pearl River Estuary. For the other 22 estuaries, the RGBT-based ETM recognition results were evaluated using the corresponding visual interpretation. Comparisons and details of the ETM boundaries indicate that the method works well for all types of estuaries. It also included accurately identifying slightly turbid plumes from maritime wind turbines and bridge piers. The validation accuracy exceeded 0.9 (0.9025–0.9733) in seven estuaries, and surpassed 0.7898 in the remaining 15 estuaries. The RGBT method generally achieved higher accuracy for estuaries in Asia and Europe, followed by estuaries in America and Oceania, with a relatively lower accuracy for estuaries in Africa. But the variation in the accuracy in different regions was small. The average validation accuracy of all estuaries and different seasons was as high as 0.9027. This demonstrates that the unified method with same criterion can directly and effectively recognize ETM distributions from multi-source remote sensing data in different estuaries worldwide.","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"21 1","pages":""},"PeriodicalIF":12.7,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142967835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CARE-SST: Context-Aware reconstruction diffusion model for Sea surface temperature
Pub Date: 2025-01-09 | DOI: 10.1016/j.isprsjprs.2025.01.001
Minki Choo, Sihun Jung, Jungho Im, Daehyeon Han
Weather and climate forecasts rely on the distribution of sea surface temperature (SST) as a critical factor in atmosphere-ocean interactions. High spatial resolution SST data are typically produced by infrared sensors, which operate at wavelengths of approximately 3.7 to 12 µm. However, SST retrieved from infrared satellite sensors often contains noise and missing areas due to cloud contamination, so reconstructing SST under clouds must account for observational noise. In this study, we present the context-aware reconstruction diffusion model for SST (CARE-SST), a denoising diffusion probabilistic model designed to reconstruct SST in cloud-covered regions while reducing observational noise. By conditioning the reverse diffusion process, CARE-SST can integrate historical satellite data and reduce observational noise. The methodology uses Visible Infrared Imaging Radiometer Suite (VIIRS) data with the optimum interpolation SST product as a background. To evaluate the effectiveness of our method, a reconstruction with a fixed mask was performed on 10,578 VIIRS SST scenes from 2022. The mean absolute error and root mean squared error (RMSE) were 0.23 °C and 0.31 °C, respectively, while small-scale features were preserved. In real cloud reconstruction scenarios, the proposed model incorporated historical VIIRS SST data and buoy observations, enhancing the quality of the reconstructed SST, particularly in regions with large cloud cover. Relative to other analysis products, such as the operational SST and sea ice analysis and the multi-scale ultra-high-resolution SST, our model produced a more refined gradient field without blurring effects. In the power spectral density comparison for the Agulhas Current (35–45° S, 10–40° E), only CARE-SST demonstrated feature resolution within 10 km, highlighting superior feature resolution compared to other SST analysis products. Validation against buoy data indicated high performance, with RMSEs (and MAEs) of 0.22 °C (0.16 °C) for the Gulf Stream, 0.27 °C (0.20 °C) for the Kuroshio Current, 0.34 °C (0.25 °C) for the Agulhas Current, and 0.25 °C (0.10 °C) for the Mediterranean Sea. Furthermore, the model maintained robust spatial patterns in global mapping results for selected dates. This study highlights the potential of deep learning models to generate high-resolution, gap-filled SST data on a global scale, offering a foundation for improving deep learning-based data assimilation.
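The key mechanism, conditioning the reverse diffusion process on cloud-free observations, can be sketched generically: at each reverse step, pixels with valid observations are re-imposed from a forward-diffused copy of the data, while cloudy pixels are filled by the model. This is a common inpainting-style conditioning scheme assumed here for illustration, not the published CARE-SST code; `denoise_fn` and `alpha_bar` stand in for a trained reverse-step function and the cumulative noise schedule.

```python
# One conditioned reverse diffusion step for SST reconstruction under clouds.
import torch

def conditioned_reverse_step(x_t, t, denoise_fn, obs, obs_mask, alpha_bar):
    """x_t: current sample; obs: observed SST field; obs_mask: 1 where clear."""
    x_prev = denoise_fn(x_t, t)  # model's reverse step for the whole field
    # Forward-diffuse the observation to the matching noise level t-1.
    noise = torch.randn_like(obs)
    obs_t = alpha_bar[t - 1].sqrt() * obs + (1 - alpha_bar[t - 1]).sqrt() * noise
    # Keep observed pixels from the data, reconstructed pixels from the model.
    return obs_mask * obs_t + (1 - obs_mask) * x_prev
```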
{"title":"CARE-SST: Context-Aware reconstruction diffusion model for Sea surface temperature","authors":"Minki Choo, Sihun Jung, Jungho Im, Daehyeon Han","doi":"10.1016/j.isprsjprs.2025.01.001","DOIUrl":"https://doi.org/10.1016/j.isprsjprs.2025.01.001","url":null,"abstract":"Weather and climate forecasts use the distribution of sea surface temperature (SST) as a critical factor in atmosphere–ocean interactions. High spatial resolution SST data are typically produced using infrared sensors, which use channels with wavelengths ranging from approximately 3.7 to 12 µm. However, SST data retrieved from infrared sensor-based satellites often contain noise and missing areas due to cloud contamination. Therefore, while reconstructing SST under clouds, it is necessary to consider observational noise. In this study, we present the context-aware reconstruction diffusion model for SST (CARE-SST), a denoising diffusion probabilistic model designed to reconstruct SST in cloud-covered regions and reduce observational noise. By conditioning on a reverse diffusion process, CARE-SST can integrate historical satellite data and reduce observational noise. The methodology involves using visible infrared imaging radiometer suite (VIIRS) data and the optimum interpolation SST product as a background. To evaluate the effectiveness of our method, a reconstruction using a fixed mask was performed with 10,578 VIIRS SST data from 2022. The results showed that the mean absolute error and the root mean squared error (RMSE) were 0.23 °C and 0.31 °C, respectively, preserving small-scale features. In real cloud reconstruction scenarios, the proposed model incorporated historical VIIRS SST data and buoy observations, enhancing the quality of reconstructed SST data, particularly in regions with large cloud cover. Relative to other analysis products, such as the operational SST and sea ice analysis, as well as the multi-scale ultra-high-resolution SST, our model showcased a more refined gradient field without blurring effects. In the power spectral density comparison for the Agulhas Current (35–45° S and 10–40° E), only CARE-SST demonstrated feature resolution within 10 km, highlighting superior feature resolution compared to other SST analysis products. Validation against buoy data indicated high performance, with RMSEs (and MAEs) of 0.22 °C (0.16 °C) for the Gulf Stream, 0.27 °C (0.20 °C) for the Kuroshio Current, 0.34 °C (0.25 °C) for the Agulhas Current, and 0.25 °C (0.10 °C) for the Mediterranean Sea. Furthermore, the model maintained robust spatial patterns in global mapping results for selected dates. This study highlights the potential of deep learning models in generating high-resolution, gap-filled SST data on a global scale, offering a foundation for improving deep learning-based data assimilation.","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"25 1","pages":""},"PeriodicalIF":12.7,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142967836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Intelligent segmentation of wildfire region and interpretation of fire front in visible light images from the viewpoint of an unmanned aerial vehicle (UAV)
Pub Date: 2025-01-09 | DOI: 10.1016/j.isprsjprs.2024.12.025
Jianwei Li, Jiali Wan, Long Sun, Tongxin Hu, Xingdong Li, Huiru Zheng
The acceleration of global warming and intensifying climate anomalies have led to a rise in the frequency of wildfires. However, most existing wildfire research focuses on fire identification and prediction, with limited attention to the intelligent interpretation of detailed information, such as the fire front within a fire region. To address this gap, advance the analysis of fire fronts in UAV-captured visible images, and facilitate future calculation of fire behavior parameters, a new method is proposed for the intelligent segmentation of wildfire regions and interpretation of the fire front. The proposed method comprises three key steps: deep learning-based fire segmentation, boundary tracking of wildfire regions, and fire front interpretation. Specifically, the YOLOv7-tiny model is enhanced with a Convolutional Block Attention Module (CBAM), which integrates channel and spatial attention mechanisms to sharpen the model's focus on wildfire regions and boost segmentation precision. Experimental results show that the proposed method improved detection and segmentation precision by 3.8 % and 3.6 %, respectively, compared to existing approaches, and achieved an average segmentation frame rate of 64.72 Hz, well above the 30 Hz threshold required for real-time fire segmentation. The method's effectiveness in boundary tracking and fire front interpretation was further validated on real fire imagery from an outdoor grassland fire experiment. Additional tests with data from southern New South Wales, Australia, confirmed the robustness of the method in accurately interpreting the fire front. The findings have potential applications in dynamic data-driven forest fire spread modeling and fire digital twinning. The code and dataset are publicly available at https://github.com/makemoneyokk/fire-segmentation-interpretation.git.
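CBAM itself is a published, well-defined module; the sketch below is a compact PyTorch version with the usual defaults (reduction ratio 16, 7×7 spatial kernel), which may differ from the exact settings the authors used inside YOLOv7-tiny.

```python
# CBAM: channel attention (shared MLP over avg/max pooled descriptors)
# followed by spatial attention (conv over channel-wise mean/max maps).
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.mlp = nn.Sequential(                       # shared channel MLP
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel attention from global average- and max-pooled descriptors.
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # Spatial attention from channel-wise mean and max maps.
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(s))

print(CBAM(64)(torch.rand(2, 64, 32, 32)).shape)  # torch.Size([2, 64, 32, 32])
```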
{"title":"Intelligent segmentation of wildfire region and interpretation of fire front in visible light images from the viewpoint of an unmanned aerial vehicle (UAV)","authors":"Jianwei Li, Jiali Wan, Long Sun, Tongxin Hu, Xingdong Li, Huiru Zheng","doi":"10.1016/j.isprsjprs.2024.12.025","DOIUrl":"https://doi.org/10.1016/j.isprsjprs.2024.12.025","url":null,"abstract":"The acceleration of global warming and intensifying global climate anomalies have led to a rise in the frequency of wildfires. However, most existing research on wildfire fields focuses primarily on wildfire identification and prediction, with limited attention given to the intelligent interpretation of detailed information, such as fire front within fire region. To address this gap, advance the analysis of fire front in UAV-captured visible images, and facilitate future calculations of fire behavior parameters, a new method is proposed for the intelligent segmentation and fire front interpretation of wildfire regions. This proposed method comprises three key steps: deep learning-based fire segmentation, boundary tracking of wildfire regions, and fire front interpretation. Specifically, the YOLOv7-tiny model is enhanced with a Convolutional Block Attention Module (CBAM), which integrates channel and spatial attention mechanisms to improve the model’s focus on wildfire regions and boost the segmentation precision. Experimental results show that the proposed method improved detection and segmentation precision by 3.8 % and 3.6 %, respectively, compared to existing approaches, and achieved an average segmentation frame rate of 64.72 Hz, which is well above the 30 Hz threshold required for real-time fire segmentation. Furthermore, the method’s effectiveness in boundary tracking and fire front interpreting was validated using an outdoor grassland fire fusion experiment’s real fire image data. Additional tests were conducted in southern New South Wales, Australia, using data that confirmed the robustness of the method in accurately interpreting the fire front. The findings of this research have potential applications in dynamic data-driven forest fire spread modeling and fire digital twinning areas. The code and dataset are publicly available at <ce:inter-ref xlink:href=\"https://github.com/makemoneyokk/fire-segmentation-interpretation.git\" xlink:type=\"simple\">https://github.com/makemoneyokk/fire-segmentation-interpretation.git</ce:inter-ref>.","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"6 1","pages":""},"PeriodicalIF":12.7,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142967838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scattering mechanism-guided zero-shot PolSAR target recognition
Pub Date: 2025-01-03 | DOI: 10.1016/j.isprsjprs.2024.12.022
Feng Li, Xiaojing Yang, Liang Zhang, Yanhua Wang, Yuqi Han, Xin Zhang, Yang Li
Because polarimetric synthetic aperture radar (PolSAR) data are difficult to obtain for certain categories of targets, we present a zero-shot target recognition method for PolSAR images. Built on a generative model, the method leverages the unique characteristics of PolSAR imagery and incorporates two key modules: a scattering characteristics-guided semantic embedding generation module (SE) and a polarization characteristics-guided distributional correction module (DC). The former ensures the stability of synthetic features for unseen classes by controlling scattering characteristics, while the latter enhances the quality of synthetic features by utilizing polarimetric features, thereby improving the accuracy of zero-shot recognition. The proposed method is evaluated on the GOTCHA dataset to assess its performance in recognizing unseen classes. The experimental results demonstrate that the proposed method achieves state-of-the-art performance in zero-shot PolSAR target recognition (e.g., improving the recognition accuracy of unseen categories by nearly 20 %). Our codes are available at https://github.com/chuyihuan/Zero-shot-PolSAR-target-recognition.
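The generative zero-shot pipeline the abstract outlines (synthesize features for unseen classes from semantic embeddings, then train a classifier on them) can be illustrated generically. Everything below is a toy stand-in: the "generator" is a fixed random projection rather than the trained SE/DC modules, and the embeddings, dimensions, and class names are invented.

```python
# Generic zero-shot recognition: synthetic unseen-class features -> classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def generate_features(embedding, n=200, dim=32):
    # Toy stand-in for a trained conditional generator: project the
    # semantic embedding and add noise to produce n synthetic features.
    proj = rng.normal(size=(embedding.size, dim))
    return embedding @ proj + rng.normal(scale=0.3, size=(n, dim))

# One semantic embedding per unseen class (assumed to be given).
unseen = {"class_a": rng.normal(size=8), "class_b": rng.normal(size=8)}
X = np.vstack([generate_features(e) for e in unseen.values()])
y = np.repeat(list(unseen.keys()), 200)
clf = LogisticRegression(max_iter=1000).fit(X, y)  # unseen-class classifier
print(clf.predict(X[:3]))
```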
{"title":"Scattering mechanism-guided zero-shot PolSAR target recognition","authors":"Feng Li, Xiaojing Yang, Liang Zhang, Yanhua Wang, Yuqi Han, Xin Zhang, Yang Li","doi":"10.1016/j.isprsjprs.2024.12.022","DOIUrl":"https://doi.org/10.1016/j.isprsjprs.2024.12.022","url":null,"abstract":"In response to the challenges posed by the difficulty in obtaining polarimetric synthetic aperture radar (PolSAR) data for certain specific categories of targets, we present a zero-shot target recognition method for PolSAR images. Based on a generative model, the method leverages the unique characteristics of polarimetric SAR images and incorporates two key modules: the scattering characteristics-guided semantic embedding generation module (SE) and the polarization characteristics-guided distributional correction module (DC). The former ensures the stability of synthetic features for unseen classes by controlling scattering characteristics. At the same time, the latter enhances the quality of synthetic features by utilizing polarimetric features, thereby improving the accuracy of zero-shot recognition. The proposed method is evaluated on the GOTCHA dataset to assess its performance in recognizing unseen classes. The experiment results demonstrate that the proposed method achieves SOTA performance in zero-shot PolSAR target recognition (<ce:italic>e.g.,</ce:italic> improving the recognition accuracy of unseen categories by nearly 20%). Our codes are available at <ce:inter-ref xlink:href=\"https://github.com/chuyihuan/Zero-shot-PolSAR-target-recognition\" xlink:type=\"simple\">https://github.com/chuyihuan/Zero-shot-PolSAR-target-recognition</ce:inter-ref>.","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"37 1","pages":""},"PeriodicalIF":12.7,"publicationDate":"2025-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142925267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Underwater image captioning: Challenges, models, and datasets
Pub Date: 2025-01-03 | DOI: 10.1016/j.isprsjprs.2024.12.002
Huanyu Li, Hao Wang, Ying Zhang, Li Li, Peng Ren
We delve into the nascent field of underwater image captioning from three perspectives: challenges, models, and datasets. One challenge arises from the disparities between natural images and underwater images, which hinder the use of the former to train models for the latter. Another challenge exists in the limited feature extraction capabilities of current image captioning models, impeding the generation of accurate underwater image captions. The final challenge, albeit not the least significant, revolves around the insufficiency of data available for underwater image captioning. This insufficiency not only complicates the training of models but also poses challenges for evaluating their performance effectively. To address these challenges, we make three novel contributions. First, we employ a physics-based degradation technique to transform natural images into degraded images that closely resemble realistic underwater images. Based on the degraded images, we develop a meta-learning strategy specifically tailored for underwater tasks. Second, we develop an underwater image captioning model based on scene-object feature fusion. It fuses underwater scene features extracted by ResNeXt and object features localized by YOLOv8, yielding comprehensive features for underwater image captioning. Last but not least, we construct an underwater image captioning dataset covering various underwater scenes, with each underwater image annotated with five accurate captions for the purpose of comprehensive training and validation. Experimental results on the new dataset validate the effectiveness of our novel models. The code and datasets are released at https://gitee.com/LHY-CODE/UICM-SOFF.
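The scene-object feature fusion step (ResNeXt scene features combined with YOLOv8 object features) admits a simple sketch. Fusing by mean-pooling the per-object vectors and concatenating with the scene vector is our assumption about the design, and all dimensions are illustrative.

```python
# Fuse a global scene vector with pooled per-object vectors for a captioner.
import torch
import torch.nn as nn

class SceneObjectFusion(nn.Module):
    def __init__(self, scene_dim=2048, obj_dim=256, out_dim=512):
        super().__init__()
        self.fuse = nn.Linear(scene_dim + obj_dim, out_dim)

    def forward(self, scene_feat, obj_feats):
        # scene_feat: (batch, scene_dim), e.g. ResNeXt global pooling output
        # obj_feats: (batch, n_objects, obj_dim), e.g. detected-region features
        pooled = obj_feats.mean(dim=1)           # aggregate object features
        return torch.relu(self.fuse(torch.cat([scene_feat, pooled], dim=-1)))

fusion = SceneObjectFusion()
print(fusion(torch.rand(2, 2048), torch.rand(2, 5, 256)).shape)  # (2, 512)
```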
{"title":"Underwater image captioning: Challenges, models, and datasets","authors":"Huanyu Li, Hao Wang, Ying Zhang, Li Li, Peng Ren","doi":"10.1016/j.isprsjprs.2024.12.002","DOIUrl":"https://doi.org/10.1016/j.isprsjprs.2024.12.002","url":null,"abstract":"We delve into the nascent field of underwater image captioning from three perspectives: challenges, models, and datasets. One challenge arises from the disparities between natural images and underwater images, which hinder the use of the former to train models for the latter. Another challenge exists in the limited feature extraction capabilities of current image captioning models, impeding the generation of accurate underwater image captions. The final challenge, albeit not the least significant, revolves around the insufficiency of data available for underwater image captioning. This insufficiency not only complicates the training of models but also poses challenges for evaluating their performance effectively. To address these challenges, we make three novel contributions. First, we employ a physics-based degradation technique to transform natural images into degraded images that closely resemble realistic underwater images. Based on the degraded images, we develop a meta-learning strategy specifically tailored for underwater tasks. Second, we develop an underwater image captioning model based on scene-object feature fusion. It fuses underwater scene features extracted by ResNeXt and object features localized by YOLOv8, yielding comprehensive features for underwater image captioning. Last but not least, we construct an underwater image captioning dataset covering various underwater scenes, with each underwater image annotated with five accurate captions for the purpose of comprehensive training and validation. Experimental results on the new dataset validate the effectiveness of our novel models. The code and datasets are released at <ce:inter-ref xlink:href=\"https://gitee.com/LHY-CODE/UICM-SOFF\" xlink:type=\"simple\">https://gitee.com/LHY-CODE/UICM-SOFF</ce:inter-ref>.","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"9 1","pages":""},"PeriodicalIF":12.7,"publicationDate":"2025-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142925268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Developing a spatiotemporal fusion framework for generating daily UAV images in agricultural areas using publicly available satellite data
Pub Date: 2025-01-02 | DOI: 10.1016/j.isprsjprs.2024.12.024
Hamid Ebrahimy, Tong Yu, Zhou Zhang
Monitoring agricultural areas, given their rapid transformation and small-scale spatial changes, requires dense time series of high-resolution remote sensing data. Unmanned aerial vehicles (UAVs), which provide high-resolution images, are therefore indispensable for monitoring and assessing agricultural areas, especially rapidly changing crops like alfalfa. Given the practical limitations of acquiring daily UAV images, spatiotemporal fusion (STF) approaches that integrate publicly available satellite images of high temporal resolution with UAV images of high spatial resolution are an effective alternative. This study proposes GLM-STF, an STF algorithm that uses the Generalized Linear Model (GLM) as its mapping function. The algorithm uses coarse difference images to map fine difference images via the GLM, then combines the fine difference images with the original fine images to synthesize daily UAV images at the prediction time. We deployed a two-step STF process: (1) MODIS MCD43A4 and Harmonized Landsat and Sentinel-2 (HLS) data were fused to produce daily HLS images; and (2) daily HLS data and UAV images were fused to produce daily UAV images. We evaluated the reliability of the framework at three distinct experimental sites covered by alfalfa crops. The performance of GLM-STF was compared with five benchmark STF algorithms (STARFM, ESTARFM, Fit-FC, FSDAF, and VSDF) using three quantitative accuracy metrics: root mean squared error (RMSE), correlation coefficient (CC), and structural similarity index (SSIM). The proposed algorithm yielded the most accurate synthesized UAV images, followed by VSDF, the most accurate benchmark. Specifically, GLM-STF achieved an average RMSE of 0.029 (versus VSDF's 0.043), an average CC of 0.725 (versus VSDF's 0.669), and an average SSIM of 0.840 (versus VSDF's 0.811). The superiority of GLM-STF was also evident in visual comparisons. Additionally, GLM-STF was less sensitive to increases in the acquisition time difference between the reference image pair and the prediction date, indicating its suitability for STF tasks with limited input reference pairs. The developed framework is thus expected to provide high-quality UAV images with high spatial resolution and frequent observations for various applications.
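The mapping step translates almost directly into code: fit a GLM from coarse-resolution differences to fine-resolution differences, then add the predicted difference to the reference fine image. The sketch below assumes an identity-link Gaussian GLM (ordinary least squares), co-registered arrays on a common grid, and a reference date pair where both fine and coarse images exist; all three are our assumptions.

```python
# GLM-STF core idea: fine(t_pred) ≈ fine(t_ref) + GLM(coarse difference).
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_difference_glm(fine_t1, fine_t2, coarse_t1, coarse_t2):
    """Fit fine difference ~ coarse difference on a reference date pair."""
    X = (coarse_t2 - coarse_t1).reshape(-1, 1)
    y = (fine_t2 - fine_t1).ravel()
    return LinearRegression().fit(X, y)

def synthesize_fine(glm, fine_ref, coarse_ref, coarse_pred):
    """Add the GLM-mapped difference to the reference fine image."""
    diff = glm.predict((coarse_pred - coarse_ref).reshape(-1, 1))
    return fine_ref + diff.reshape(fine_ref.shape)

# Toy 4x4 images: the coarse sensor sees a scaled version of the fine scene.
rng = np.random.default_rng(0)
fine_t1 = rng.random((4, 4))
fine_t2 = fine_t1 + rng.random((4, 4)) * 0.2
coarse_t1, coarse_t2 = 0.9 * fine_t1, 0.9 * fine_t2
glm = fit_difference_glm(fine_t1, fine_t2, coarse_t1, coarse_t2)
print(synthesize_fine(glm, fine_t2, coarse_t2, coarse_t2 + 0.05))
```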
{"title":"Developing a spatiotemporal fusion framework for generating daily UAV images in agricultural areas using publicly available satellite data","authors":"Hamid Ebrahimy, Tong Yu, Zhou Zhang","doi":"10.1016/j.isprsjprs.2024.12.024","DOIUrl":"https://doi.org/10.1016/j.isprsjprs.2024.12.024","url":null,"abstract":"Monitoring agricultural areas, given their rapid transformation and small-scale spatial changes, necessitates obtaining dense time series of high-resolution remote sensing data. In this manner, the unmanned aerial vehicle (UAV) that can provide high-resolution images is indispensable for monitoring and assessing agricultural areas, especially for rapidly changing crops like alfalfa. Considering the practical limitations of acquiring daily UAV images, the utilization of spatiotemporal fusion (STF) approaches to integrate publicly available satellite images with high temporal resolution and UAV images with high spatial resolution can be considered an effective alternative. This study proposed an effective STF algorithm that utilizes the Generalized Linear Model (GLM) as the mapping function and is called GLM-STF. The algorithm is designed to use coarse difference images to map fine difference images via the GLM algorithm. It then combines these fine difference images with the original fine images to synthesize daily UAV image at the prediction time. In this study, we deployed a two-step STF process: (1) MODIS MCD43A4 and Harmonized Landsat and Sentinel-2 (HLS) data were fused to produce daily HLS images; and (2) daily HLS data and UAV images were fused to produce daily UAV images. We evaluated the reliability of the deployed framework at three distinct experimental sites that were covered by alfalfa crops. The performance of the GLM-STF algorithm was compared with five benchmark STF algorithms: STARFM, ESTARFM, Fit-FC, FSDAF, and VSDF, by using three quantitative accuracy evaluation metrics, including root mean squared error (RMSE), correlation coefficient (CC), and structure similarity index (SSIM). The proposed STF algorithm yielded the most accurate synthesized UAV images, followed by VSDF, which proved to be the most accurate benchmark algorithm. Specifically, GML-STF achieved an average RMSE of 0.029 (compared to VSDF’s 0.043), an average CC of 0.725 (compared to VSDF’s 0.669), and an average SSIM of 0.840 (compared to VSDF’s 0.811). The superiority of GLM-STF was also observed with the visual comparisons as well. Additionally, GLM-STF was less sensitive to the increase in the acquisition time difference between the reference image pairs and prediction date, indicating its suitability for STF tasks with limited input reference pairs. The developed framework in this study is thus expected to provide high-quality UAV images with high spatial resolution and frequent observations for various applications.","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"15 1","pages":""},"PeriodicalIF":12.7,"publicationDate":"2025-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142925269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Large-scale rice mapping under spatiotemporal heterogeneity using multi-temporal SAR images and explainable deep learning
Pub Date: 2024-12-31 | DOI: 10.1016/j.isprsjprs.2024.12.021
Ji Ge, Hong Zhang, Lijun Zuo, Lu Xu, Jingling Jiang, Mingyang Song, Yinhaibin Ding, Yazhe Xie, Fan Wu, Chao Wang, Wenjiang Huang
Timely and accurate mapping of rice cultivation is crucial for ensuring global food security and achieving SDG2. From a global perspective, rice areas display high heterogeneity in spatial pattern and SAR time-series characteristics, posing substantial challenges to the performance, efficiency, and transferability of deep learning (DL) models. Moreover, due to their “black box” nature, DL models often lack interpretability and credibility. To address these challenges, this paper constructs the first SAR rice dataset with spatiotemporal heterogeneity and proposes an explainable, lightweight model for rice area extraction, the eXplainable Mamba UNet (XM-UNet). The dataset is based on 2023 multi-temporal Sentinel-1 data and covers diverse rice samples from the United States, Kenya, and Vietnam. A Temporal Feature Importance Explainer (TFI-Explainer) based on the Selective State Space Model is designed to improve adaptability to the temporal heterogeneity of rice and the model's interpretability. This explainer, coupled with the DL model, interprets the importance of SAR temporal features and facilitates the screening of crucial time phases. To overcome the spatial heterogeneity of rice, an Attention Sandglass Layer (ASL) combining CNN and self-attention mechanisms is designed to enhance local spatial feature extraction. Additionally, the Parallel Visual State Space Layer (PVSSL) uses 2D-Selective-Scan (SS2D) cross-scanning to capture the global spatial features of rice multi-directionally, significantly reducing computational complexity through parallelization. Experimental results demonstrate that XM-UNet adapts well to the spatiotemporal heterogeneity of rice globally, with an OA of 94.26 % and an F1-score of 90.73 %. The model is extremely lightweight, with only 0.190 M parameters and 0.279 GFLOPs. Mamba's selective scanning facilitates feature screening, and its integration with CNN effectively balances the local and global spatial characteristics of rice. The interpretability experiments prove that the explanations of temporal feature importance provided by the model are crucial for guiding rice distribution mapping, filling a gap in the field. The code is available at https://github.com/SAR-RICE/XM-UNet.
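For reference, the two headline metrics reduce to simple confusion-matrix arithmetic. The helper below shows the computation; the counts are invented to land near the reported 94.26 % OA and 90.73 % F1, and are not the paper's test data.

```python
# Overall accuracy and F1-score from binary confusion-matrix counts.
def oa_f1(tp, fp, fn, tn):
    oa = (tp + tn) / (tp + fp + fn + tn)   # overall accuracy
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return oa, f1

print(oa_f1(tp=880, fp=60, fn=120, tn=2100))  # ≈ (0.943, 0.907)
```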
{"title":"Large-scale rice mapping under spatiotemporal heterogeneity using multi-temporal SAR images and explainable deep learning","authors":"Ji Ge, Hong Zhang, Lijun Zuo, Lu Xu, Jingling Jiang, Mingyang Song, Yinhaibin Ding, Yazhe Xie, Fan Wu, Chao Wang, Wenjiang Huang","doi":"10.1016/j.isprsjprs.2024.12.021","DOIUrl":"https://doi.org/10.1016/j.isprsjprs.2024.12.021","url":null,"abstract":"Timely and accurate mapping of rice cultivation distribution is crucial for ensuring global food security and achieving SDG2. From a global perspective, rice areas display high heterogeneity in spatial pattern and SAR time-series characteristics, posing substantial challenges to deep learning (DL) models’ performance, efficiency, and transferability. Moreover, due to their “black box” nature, DL often lack interpretability and credibility. To address these challenges, this paper constructs the first SAR rice dataset with spatiotemporal heterogeneity and proposes an explainable, lightweight model for rice area extraction, the eXplainable Mamba UNet (XM-UNet). The dataset is based on the 2023 multi-temporal Sentinel-1 data, covering diverse rice samples from the United States, Kenya, and Vietnam. A Temporal Feature Importance Explainer (TFI-Explainer) based on the Selective State Space Model is designed to enhance adaptability to the temporal heterogeneity of rice and the model’s interpretability. This explainer, coupled with the DL model, provides interpretations of the importance of SAR temporal features and facilitates crucial time phase screening. To overcome the spatial heterogeneity of rice, an Attention Sandglass Layer (ASL) combining CNN and self-attention mechanisms is designed to enhance the local spatial feature extraction capabilities. Additionally, the Parallel Visual State Space Layer (PVSSL) utilizes 2D-Selective-Scan (SS2D) cross-scanning to capture the global spatial features of rice multi-directionally, significantly reducing computational complexity through parallelization. Experimental results demonstrate that the XM-UNet adapts well to the spatiotemporal heterogeneity of rice globally, with OA and F1-score of 94.26 % and 90.73 %, respectively. The model is extremely lightweight, with only 0.190 M parameters and 0.279 GFLOPs. Mamba’s selective scanning facilitates feature screening, and its integration with CNN effectively balances rice’s local and global spatial characteristics. The interpretability experiments prove that the explanations of the importance of the temporal features provided by the model are crucial for guiding rice distribution mapping and filling a gap in the related field. The code is available in <ce:inter-ref xlink:href=\"https://github.com/SAR-RICE/XM-UNet\" xlink:type=\"simple\">https://github.com/SAR-RICE/XM-UNet</ce:inter-ref>.","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"40 1","pages":""},"PeriodicalIF":12.7,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142925270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A full time series imagery and full cycle monitoring (FTSI-FCM) algorithm for tracking rubber plantation dynamics in the Vietnam from 1986 to 2022
Pub Date: 2024-12-30 | DOI: 10.1016/j.isprsjprs.2024.12.018
Bangqian Chen, Jinwei Dong, Tran Thi Thu Hien, Tin Yun, Weili Kou, Zhixiang Wu, Chuan Yang, Guizhen Wang, Hongyan Lai, Ruijin Liu, Feng An
Accurate mapping of rubber plantations in Southeast Asia is critical for sustainable plantation management and for assessing ecological and environmental impacts. Despite extensive research on rubber plantation mapping, studies have largely been confined to provincial scales, and the few country-scale assessments show significant disagreement in both spatial distribution and area estimates. These discrepancies stem primarily from persistent cloud cover in tropical regions and from the limited temporal resolution of datasets that inadequately capture the full phenological cycle of rubber trees. To address these issues, we propose the Full Time Series Satellite Imagery and Full-Cycle Monitoring (FTSI-FCM) algorithm for mapping the spatial distribution and establishment year of rubber plantations in Vietnam, a country that has experienced significant rubber expansion over recent decades. The FTSI-FCM algorithm first employs the LandTrendr approach, an established forest disturbance detection algorithm, to identify land use changes during the plantation establishment phase; a spatiotemporal correction scheme then refines the establishment years and maturity phases of the plantations. Subsequently, the algorithm identifies rubber plantations with a random forest classifier by integrating features from three temporal phases: canopy transitions from rubber seedlings to mature plantations, phenological changes during the mature stage, and phenological-spectral characteristics during the mapping year. This approach leverages an extensive time series of Landsat images dating back to the late 1980s, complemented by Sentinel-2 images since 2015; for the mapping year, these data are further augmented with PALSAR-2 L-band Synthetic Aperture Radar (SAR) and very high-resolution Planet optical imagery. When applied in Vietnam, a leading rubber producer with complex cultivation conditions, the FTSI-FCM algorithm yielded highly reliable maps of rubber distribution (overall accuracy, OA = 93.75 %, F1-score = 0.93) and establishment years (R² = 0.99, RMSE = 0.25 years) for 2022 (referred to as FTSI-FCM_2022). These results outperformed previous mappings, such as WangR_2021 (OA = 75.00 %, F1-score = 0.71), in both spatial distribution and area estimates. The FTSI-FCM_2022 map gave a total rubber plantation area of 754,482 ha, closely matching the reported statistic of 727,900 ha and correlating strongly with provincial statistics (R² = 0.99). Spatial analysis indicated that over 90 % of rubber plantations are located within 15°N latitude, below 600 m in elevation, on slopes under 15°, and were established after 2000. Notably, there has been no significant expansion of rubber plantations into higher elevations or steeper slopes since the 1990s, suggesting the effectiveness of sustainable rubber cultivation management practices in Vietnam.
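The classification stage, a random forest over features drawn from the three temporal phases listed above, can be sketched as follows. The feature groups, dimensions, and labels are placeholders: the actual feature extraction (LandTrendr change metrics, phenology, PALSAR-2/Planet features) is beyond a short example.

```python
# Random forest over stacked multi-phase features for rubber classification.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 1000
establishment = rng.random((n, 6))     # e.g. establishment-phase change metrics
mature_phenology = rng.random((n, 8))  # mature-stage phenological features
mapping_year = rng.random((n, 10))     # mapping-year optical + SAR features
X = np.hstack([establishment, mature_phenology, mapping_year])
y = rng.integers(0, 2, n)              # 1 = rubber, 0 = other (toy labels)

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
print(rf.predict_proba(X[:2]))
```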
{"title":"A full time series imagery and full cycle monitoring (FTSI-FCM) algorithm for tracking rubber plantation dynamics in the Vietnam from 1986 to 2022","authors":"Bangqian Chen, Jinwei Dong, Tran Thi Thu Hien, Tin Yun, Weili Kou, Zhixiang Wu, Chuan Yang, Guizhen Wang, Hongyan Lai, Ruijin Liu, Feng An","doi":"10.1016/j.isprsjprs.2024.12.018","DOIUrl":"https://doi.org/10.1016/j.isprsjprs.2024.12.018","url":null,"abstract":"Accurate mapping of rubber plantations in Southeast Asia is critical for sustainable plantation management and ecological and environmental impact assessment. Despite extensive research on rubber plantation mapping, studies have largely been confined to provincial scales, with the few country-scale assessments showing significant disagreement in both spatial distribution and area estimates. These discrepancies primarily stem from persistent cloud cover in tropical regions and limited temporal resolution of datasets that inadequately capture the full phenological cycles of rubber trees. To address these issues, we propose the Full Time Series Satellite Imagery and Full-Cycle Monitoring (FTSI-FCM) algorithm for mapping spatial distribution and establishment year of rubber plantations in Vietnam, a country experienced significant rubber expansion over the past decades. The FTSI-FCM algorithm initially employs the LandTrendr approach—an established forest disturbance detection algorithm—to identify the land use changes during the plantation establishment phase. We enhance this process through a spatiotemporal correction scheme to accurately determine the establishment years and maturity phases of the plantations. Subsequently, the algorithm identifies rubber plantations through a random forest algorithm by integrating features from three temporal phases: canopy transitions from rubber seedlings to mature plantations, phenological changes during mature stages, and phenological-spectral characteristic during the mapping year. This approach leverages an extensive time series of Landsat images dating back to the late 1980s, complemented by Sentinel-2 images since 2015. For the mapping year, these data are further enhanced by the inclusion of PALSAR-2 L-band Synthetic-Aperture Radar (SAR) and very high-resolution Planet optical imagery. When applied in Vietnam—a leading rubber producer with complex cultivation conditions— the FTSI-FCM algorithm yielded highly reliable maps of rubber distribution (Overall Accuracy, OA = 93.75%, F1-score = 0.93) and establishment years (R<ce:sup loc=\"post\">2</ce:sup> = 0.99, RMSE = 0.25 years) for 2022 (referred to as FTSI-FCM_2022). These results outperformed previous mappings, such as WangR_2021 (OA = 75.00%, F1-score = 0.71), in both spatial distribution and area estimates. The FTSI-FCM_2022 map revealed a total rubber plantation area of 754,482 ha, closely matching reported statistics of 727,900 ha and showing strong correlation provincial statistics (R<ce:sup loc=\"post\">2</ce:sup> = 0.99). Spatial analysis indicated that over 90% of rubber plantations are located within 15°N latitude, below 600 m in elevation, on slopes under 15°, and were established after 2000. Notably, there has been no significant expansion of rubber plantations into higher elevations or steeper slopes since 1990s, suggesting the effectiveness of sustainable rubber cultivation management practices in Vietnam. 
The FTSI-FCM algorithm demonstrates sub","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"27 8 1","pages":""},"PeriodicalIF":12.7,"publicationDate":"2024-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142925285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving 30-meter global impervious surface area (GISA) mapping: New method and dataset
Pub Date: 2024-12-30 | DOI: 10.1016/j.isprsjprs.2024.12.023
Huiqun Ren, Xin Huang, Jie Yang, Guoqing Zhou
Timely and accurate monitoring of impervious surface areas (ISA) is crucial for effective urban planning and sustainable development. Recent advances in remote sensing technologies have enabled global ISA mapping at fine spatial resolution (<30 m) over long time spans (>30 years), offering the opportunity to track global ISA dynamics. However, existing 30 m global long-term ISA datasets suffer from omission and commission issues, affecting their accuracy in practical applications. To address these challenges, we proposed a novel global long-term ISA mapping method and generated a new 30 m global ISA dataset from 1985 to 2021, namely GISA-new. Specifically, to reduce ISA omissions, a multi-temporal Continuous Change Detection and Classification (CCDC) algorithm that accounts for newly added ISA regions (NA-CCDC) was proposed to enhance the diversity and representativeness of the training samples. Meanwhile, a multi-scale iterative (MIA) method was proposed to automatically remove global commissions of various sizes and types. Finally, we collected two independent test datasets with over 100,000 test samples globally for accuracy assessment. Results showed that GISA-new outperformed other existing global ISA datasets, such as GISA, WSF-evo, GAIA, and GAUD, achieving the highest overall accuracy (93.12 %), the lowest omission errors (10.50 %), and the lowest commission errors (3.52 %). Furthermore, the spatial distribution of global ISA omissions and commissions was analyzed, revealing more mapping uncertainties in the Northern Hemisphere. In general, the proposed method in this study effectively addressed global ISA omissions and removed commissions at different scales. The generated high-quality GISA-new can serve as a fundamental parameter for a more comprehensive understanding of global urbanization.
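As a reading aid, the reported omission and commission errors are per-class confusion-matrix ratios. The counts in this snippet are invented to land near the reported 10.50 % and 3.52 %, not the actual test samples.

```python
# Omission error: reference ISA mapped as non-ISA; commission error:
# mapped ISA that is not ISA in the reference.
def isa_errors(tp, fp, fn):
    omission = fn / (tp + fn)
    commission = fp / (tp + fp)
    return omission, commission

print(isa_errors(tp=9000, fp=330, fn=1050))  # ≈ (0.104, 0.035)
```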
{"title":"Improving 30-meter global impervious surface area (GISA) mapping: New method and dataset","authors":"Huiqun Ren, Xin Huang, Jie Yang, Guoqing Zhou","doi":"10.1016/j.isprsjprs.2024.12.023","DOIUrl":"https://doi.org/10.1016/j.isprsjprs.2024.12.023","url":null,"abstract":"Timely and accurate monitoring of impervious surface areas (ISA) is crucial for effective urban planning and sustainable development. Recent advances in remote sensing technologies have enabled global ISA mapping at fine spatial resolution (<30 m) over long time spans (>30 years), offering the opportunity to track global ISA dynamics. However, existing 30 m global long-term ISA datasets suffer from omission and commission issues, affecting their accuracy in practical applications. To address these challenges, we proposed a novel global long-term ISA mapping method and generated a new 30 m global ISA dataset from 1985 to 2021, namely GISA-new. Specifically, to reduce ISA omissions, a multi-temporal Continuous Change Detection and Classification (CCDC) algorithm that accounts for newly added ISA regions (NA-CCDC) was proposed to enhance the diversity and representativeness of the training samples. Meanwhile, a multi-scale iterative (MIA) method was proposed to automatically remove global commissions of various sizes and types. Finally, we collected two independent test datasets with over 100,000 test samples globally for accuracy assessment. Results showed that GISA-new outperformed other existing global ISA datasets, such as GISA, WSF-evo, GAIA, and GAUD, achieving the highest overall accuracy (93.12 %), the lowest omission errors (10.50 %), and the lowest commission errors (3.52 %). Furthermore, the spatial distribution of global ISA omissions and commissions was analyzed, revealing more mapping uncertainties in the Northern Hemisphere. In general, the proposed method in this study effectively addressed global ISA omissions and removed commissions at different scales. The generated high-quality GISA-new can serve as a fundamental parameter for a more comprehensive understanding of global urbanization.","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"27 1","pages":""},"PeriodicalIF":12.7,"publicationDate":"2024-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142925284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}