Elevation and Vegetation Cover Dominate Inter-Basin Water Use Efficiency Patterns in China
Pub Date: 2025-12-05 | DOI: 10.1109/JSTARS.2025.3640403
Jun Hu;Hongjun Su;Yiping Chen;Yuanwei Qin;Zhaohui Xue;Qian Du
Water use efficiency (WUE) is a fundamental indicator of the balance between ecosystem carbon assimilation and water consumption. However, its spatial variability and dominant environmental drivers across China's river basins remain unclear, posing challenges for basin-scale management. In this study, a comprehensive WUE analysis framework was established by integrating multisource remote sensing and auxiliary datasets. Within this framework, multisource vegetation, climate, topography, and land-use data were combined to estimate WUE as the GPP-to-ET ratio, and a novel basin-scale dataset covering 25 major river basins in China from 2002 to 2021 was generated (CBS-WUE, https://doi.org/10.5281/zenodo.17402779) and validated against FLUXNET2015 observations. With this new dataset, inter-basin comparisons were conducted to characterize spatial heterogeneity and temporal dynamics, while multivariate statistical and machine learning analyses were employed to identify the relative contributions of climatic, biotic, and land-use drivers. Results indicated that elevation and vegetation structure were the primary factors driving basin-scale WUE differences. The national average WUE was 1.13 g C kg−1 H2O, with basin-level values ranging from 0.11 to 1.80 g C kg−1 H2O; WUE was higher in basins with moderate elevation and dense vegetation and lower in high-elevation or arid basins. This integrative analysis highlights the dominant role of topography and vegetation in shaping WUE patterns and provides a scientific basis for enhancing water resource efficiency and ecological sustainability under changing environmental conditions.
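The core quantity is just the ratio of gross primary productivity (GPP) to evapotranspiration (ET). A minimal numpy sketch of that pixelwise computation and a basin average, assuming co-registered annual rasters; the function names, masking threshold, and unit choices are illustrative assumptions, and the paper's actual aggregation over the 25 basins is not reproduced:

```python
import numpy as np

def annual_wue(gpp, et, et_min=1e-6):
    """Pixelwise water use efficiency in g C kg^-1 H2O.

    gpp : annual gross primary productivity, g C m^-2 yr^-1
    et  : annual evapotranspiration, kg H2O m^-2 yr^-1 (equivalently mm yr^-1)
    Pixels with negligible ET are masked (NaN) to avoid division blow-ups.
    """
    gpp = np.asarray(gpp, dtype=float)
    et = np.asarray(et, dtype=float)
    return np.where(et > et_min, gpp / np.maximum(et, et_min), np.nan)

def basin_mean_wue(wue, basin_mask):
    """Average WUE over one basin, ignoring masked (NaN) pixels."""
    return np.nanmean(wue[basin_mask])
```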
{"title":"Elevation and Vegetation Cover Dominate Inter-Basin Water Use Efficiency Patterns in China","authors":"Jun Hu;Hongjun Su;Yiping Chen;Yuanwei Qin;Zhaohui Xue;Qian Du","doi":"10.1109/JSTARS.2025.3640403","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3640403","url":null,"abstract":"Water use efficiency (WUE) is a fundamental indicator of the balance between ecosystem carbon assimilation and water consumption. However, its spatial variability and dominant environmental drivers across China's river basins remain unclear, posing challenges for basin-scale management. In this study, a comprehensive WUE analysis framework was established through the integration of multisource remote sensing and auxiliary datasets. In this framework, multisource vegetation, climate, topography, and land-use data were integrated to estimate WUE from the GPP-to-ET ratio, and a novel basin-scale dataset covering 25 major river basins in China from 2002 to 2021 was generated (CBS-WUE, <uri>https://doi.org/10.5281/zenodo.17402779</uri>), which was validated against FLUXNET2015 observations. With this new dataset, inter-basin comparisons were conducted to characterize spatial heterogeneity and temporal dynamics, while multivariate statistical and machine learning analyses were employed to identify the relative contributions of climatic, biotic, and land-use drivers. Results indicated that elevation and vegetation structure were the primary factors influencing basin-scale WUE differences. The national average WUE was 1.13 g C kg<sup>−1</sup> H<sub>2</sub>O, with basin-level values ranging from 0.11 to 1.80 g C kg<sup>−1</sup> H<sub>2</sub>O. Among them, higher WUE was in basins of moderate elevation and dense vegetation, and lower WUE was in high-elevation or arid basins. This integrative analysis highlights the dominant role of topography and vegetation in shaping WUE patterns and provides a scientific basis for enhancing water resource efficiency and ecological sustainability under changing environmental conditions.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1533-1548"},"PeriodicalIF":5.3,"publicationDate":"2025-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11278658","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TranSTD: A Wavelet-Driven Transformer-Based SAR Target Detection Framework With Adaptive Feature Enhancement and Fusion
Pub Date: 2025-12-03 | DOI: 10.1109/JSTARS.2025.3639785
Bobo Xi;Jiaqi Chen;Yan Huang;Jiaojiao Li;Yunsong Li;Zan Li;Xiang-Gen Xia
Target detection in synthetic aperture radar (SAR) images is of great importance in civilian monitoring and military reconnaissance. However, the speckle noise inherent in SAR images causes semantic information loss, and traditional convolutional neural network downsampling exacerbates this issue, degrading detection accuracy and robustness. Moreover, dense target scenarios and weak scattering features make it challenging to achieve sufficient feature discriminability, adding complexity to the detection task. In addition, the multiscale characteristics of SAR targets make it difficult to balance detection performance with computational efficiency in complex scenes. To tackle these difficulties, this article introduces a wavelet-driven transformer-based SAR target detection framework called TranSTD. Specifically, it incorporates Haar wavelet dynamic downsampling and semantic-preserving dynamic downsampling modules, which suppress noise and preserve semantic information through Haar wavelet denoising and input-driven dynamic pooling downsampling. Furthermore, a SAR adaptive convolution (SAC) bottleneck is proposed to enhance feature discrimination. To balance performance and efficiency across varying scene complexities, a multiscale SAR attention fusion encoder is developed. Extensive experiments on three datasets show that the proposed algorithm outperforms current state-of-the-art benchmarks in SAR target detection, offering a robust solution for detecting targets in complex SAR scenes.
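The Haar transform behind the downsampling module is compact enough to show directly: the low-low (LL) subband halves resolution while averaging each 2x2 block, which is what suppresses pixel-level speckle. A minimal numpy sketch of one plain (non-dynamic) Haar level; TranSTD's input-driven dynamic variant is not reproduced here:

```python
import numpy as np

def haar_downsample(x):
    """One level of the 2-D orthonormal Haar transform on an (H, W) array
    with even H and W.

    The LL subband acts as a denoising 2x downsampler: it averages each
    2x2 block, suppressing single-pixel speckle noise.
    """
    a = x[0::2, 0::2]  # top-left of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0   # approximation (low-low)
    lh = (a - b + c - d) / 2.0   # detail along width
    hl = (a + b - c - d) / 2.0   # detail along height
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh
```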
{"title":"TranSTD: A Wavelet-Driven Transformer-Based SAR Target Detection Framework With Adaptive Feature Enhancement and Fusion","authors":"Bobo Xi;Jiaqi Chen;Yan Huang;Jiaojiao Li;Yunsong Li;Zan Li;Xiang-Gen Xia","doi":"10.1109/JSTARS.2025.3639785","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3639785","url":null,"abstract":"Target detection in Synthetic Aperture Radar (SAR) images is of great importance in civilian monitoring and military reconnaissance. However, the unique speckle noise inherent in SAR images leads to semantic information loss, while traditional convolutional neural network downsampling methods exacerbate this issue, impacting detection accuracy and robustness. Moreover, some dense target scenarios and weak scattering features of targets make it challenging to achieve sufficient feature discriminability, adding complexity to the detection task. In addition, the multiscale characteristic of SAR targets presents difficulties in balancing detection performance with computational efficiency in complex scenes. To tackle these difficulties, this article introduces a wavelet-driven transformer-based SAR target detection framework called TranSTD. Specifically, it incorporates the Haar wavelet dynamic downsampling and semantic preserving dynamic downsampling modules, which effectively suppress noise and preserve semantic information using techniques such as Haar wavelet denoise and input-driven dynamic pooling downsampling. Furthermore, the SAR adaptive convolution (SAC) bottleneck is proposed for enhancing the discrimination of features. To optimize performance and efficiency across varying scene complexities, a multiscale SAR attention fusion encoder is developed. Extensive experiments are carried out on three datasets, showing that our proposed algorithm outperforms the current state-of-the-art benchmarks in SAR target detection, offering a robust solution for the detection of targets in complex SAR scenes.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1197-1211"},"PeriodicalIF":5.3,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11275702","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Dual-Branch EfficientNetV2-S-Based Method for Marine Oil Spill Detection Using Multisource Satellite Data Fusion
Pub Date: 2025-12-03 | DOI: 10.1109/JSTARS.2025.3639503
Yong Wan;Liyan Peng;Rui Zhang;Ruyue Zhang;Haowen Wang
As one of the most severe forms of pollution, oil spills pose significant threats to the marine environment. Synthetic aperture radar (SAR), an active microwave remote sensing technology, enables sea surface monitoring under all weather and lighting conditions with high spatial resolution, and it has been widely used for marine oil spill detection. However, other natural phenomena, such as low-wind regions and biogenic oil films, can also produce dark features in SAR imagery that resemble oil spills, leading to false alarms. Global navigation satellite system reflectometry (GNSS-R), an emerging remote sensing technique for ocean observation, offers distinct advantages, including high temporal resolution and multisource observation capabilities. By combining SAR backscattering coefficients with GNSS-R delay-Doppler maps, the impact of oil spills on sea surface roughness can be characterized from both backscattering and forward-scattering perspectives, enabling more accurate oil spill detection and potentially reducing false alarms. Nevertheless, the limited measured data available for multisource remote sensing oil spill detection hinders the development of robust multisensor fusion models. To address this, this study proposes a synchronized data generation method that creates a joint SAR and GNSS-R oil spill dataset. On this basis, a dual-branch EfficientNetV2-S architecture is adopted to build a multisource satellite oil spill data fusion model, which is applied to offshore oil spill detection. Experimental results show that the proposed model detects oil spills with an accuracy of 94.97%. Compared with SAR-only detection models, the false alarm rate is reduced by 3.6%, demonstrating that the dual-payload approach effectively lowers false detections in marine oil spill monitoring.
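The fusion architecture, one encoder per sensor with features concatenated before the classifier, can be sketched schematically. A hedged PyTorch version with toy CNN encoders standing in for the two EfficientNetV2-S branches; the layer sizes, input shapes, and class count are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class DualBranchDetector(nn.Module):
    """Toy two-branch classifier: one branch per sensor, fused by concat."""

    def __init__(self, n_classes=2):
        super().__init__()
        def encoder(in_ch):  # stand-in for an EfficientNetV2-S backbone
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.sar_branch = encoder(1)     # SAR backscatter patch
        self.gnssr_branch = encoder(1)   # GNSS-R delay-Doppler map
        self.head = nn.Linear(32 + 32, n_classes)

    def forward(self, sar, ddm):
        # sensor-specific features are concatenated, then classified jointly
        fused = torch.cat([self.sar_branch(sar), self.gnssr_branch(ddm)], dim=1)
        return self.head(fused)

logits = DualBranchDetector()(torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64))
```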
{"title":"A Dual-Branch EfficientNetV2-S-Based Method for Marine Oil Spill Detection Using Multisource Satellite Data Fusion","authors":"Yong Wan;Liyan Peng;Rui Zhang;Ruyue Zhang;Haowen Wang","doi":"10.1109/JSTARS.2025.3639503","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3639503","url":null,"abstract":"As one of the most severe forms of pollution, oil spills pose significant threats to the marine environment. Synthetic aperture radar (SAR), an active microwave remote sensing technology, enables sea surface monitoring under all weather and lighting conditions and provides high spatial resolution. It has been widely used in the field of marine oil spill detection. However, other natural phenomena, such as low wind regions and biogenic oil films, can also produce dark spot features in SAR imagery that resemble oil spills, leading to false alarms. Global navigation satellite system-reflectometry (GNSS-R), as an emerging remote sensing technique for ocean observation, offers distinct advantages, including high temporal resolution and multisource observation capabilities. By combining SAR backscattering coefficients with GNSS-R delay doppler map, it becomes possible to characterize the impact of oil spills on sea surface roughness from both backscattering and forward-scattering perspectives. This joint approach enables more accurate oil spill detection and has the potential to reduce the false alarms. Nevertheless, limited measured data for multisource remote sensing oil spill detection hinders robust multisensor fusion model development. To address this, this study proposes a synchronized data generation method, creating a joint SAR and GNSS-R oil spill dataset, and on this basis, a dual-branch EfficientNetV2-S architecture is adopted to build a multisource satellite oil spill data fusion model, which is applied to offshore oil spill detection. According to experimental results, the suggested model detects oil spills with an accuracy of 94.97%. Compared with SAR-only detection models, the false alarm rate is reduced by 3.6%, demonstrating that the dual-payload approach effectively lowers the rate of false detections in marine oil spill monitoring.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1549-1566"},"PeriodicalIF":5.3,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11275680","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AF2-MSA Net: Attention-Fusion Focused Multiscale Architecture Network for Remote Sensing Scene Classification
Pub Date: 2025-12-03 | DOI: 10.1109/JSTARS.2025.3639670
Cuiping Shi;Yimin Wang;Liguo Wang
With the rapid development of deep learning, significant progress has been made in remote sensing (RS) scene image classification. However, large intraclass distances and high interclass similarity still pose significant challenges, and RS images contain multiscale targets whose characteristics differ substantially. To overcome these limitations, this article proposes a novel attention-fusion focused multiscale architecture network (AF2-MSA Net). First, a multilevel feature extraction module (MFEM) is designed to extract semantic and detail information at different scales from RS images. Subsequently, a global context recalibration module (GCRM) is embedded into the MFEM; it enhances the features at each level through a global context recalibration mechanism, enabling the model to dynamically focus on key semantic regions and important contextual information. Next, an axis-aligned feature harmonization module (AAFHM) is constructed to fuse multiscale features from adjacent stages layer by layer. This module combines attention from both channel and spatial branches to adaptively coordinate and fuse multiscale contextual information, achieving deep collaborative optimization of features at different scales. Finally, the GCRM and AAFHM are integrated into a unified framework, AF2-MSA Net, to jointly optimize global semantics and multiscale discriminative features. Extensive experiments on three commonly used datasets show that the proposed AF2-MSA Net outperforms several state-of-the-art methods in RS scene classification.
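The recalibration idea can be illustrated with the classic squeeze-and-excitation pattern: pool global context, derive per-channel weights, rescale the feature map. A generic sketch of that mechanism; the paper's GCRM is more elaborate, so this is an assumption-laden stand-in rather than its implementation:

```python
import torch
import torch.nn as nn

class ChannelRecalibration(nn.Module):
    """SE-style global-context recalibration: reweight channels using
    globally pooled statistics, so semantically important channels are
    amplified and background channels suppressed."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                             # x: (B, C, H, W)
        ctx = x.mean(dim=(2, 3))                      # squeeze: global context (B, C)
        w = self.fc(ctx).unsqueeze(-1).unsqueeze(-1)  # excitation: channel weights
        return x * w                                  # recalibrated features
```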
{"title":"AF2-MSA Net: Attention-Fusion Focused Multiscale Architecture Network for Remote Sensing Scene Classification","authors":"Cuiping Shi;Yimin Wang;Liguo Wang","doi":"10.1109/JSTARS.2025.3639670","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3639670","url":null,"abstract":"With the rapid development of deep learning technology, significant progress has been made in the field of remote sensing (RS) scene image classification. However, the large intraclass distance and high interclass similarity still pose significant challenges for RS scene classification. In addition, there are multiscale targets in RS images, which make significant differences in target characteristics. To overcome the above limitations, this article proposes a novel attention-fusion focused multiscale architecture network (AF<sup>2</sup>-MSA Net). First, a multilevel feature extraction module (MFEM) was designed to extract semantic and detail information at different scales from RS images. Subsequently, an intricately designed global context recalibration module (GCRM) was embedded into MFEM, and the features at each level were enhanced through a global context recalibration mechanism, enabling the model to dynamically focus on key semantic regions and important contextual information. Next, an axis-aligned feature harmonization module (AAFHM) was constructed to fuse multiscale features from adjacent stages layer by layer. This module combines attention mechanisms from both channel and spatial branches to adaptively coordinate and fuse multiscale contextual information, achieving deep collaborative optimization of different scale features. Finally, the GCRM and AAFHM are integrated into a unified framework called AF<sup>2</sup>-MSA Net to achieve collaborative optimization of global semantics and multiscale discriminative features. Extensive experiments on three commonly used datasets have shown that the proposed AF<sup>2</sup>-MSA Net outperforms some state-of-the-art methods in RS image scene classification tasks.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1150-1164"},"PeriodicalIF":5.3,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11275695","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ROFANet: Residual Offset-Driven Feature Alignment Network for Unaligned Remote Sensing Image Change Detection
Pub Date: 2025-12-03 | DOI: 10.1109/JSTARS.2025.3639607
Guoqing Wang;He Chen;Wenchao Liu;Tianyu Wei;Panzhe Gu;Jue Wang
At present, most remote sensing change detection methods assume bitemporal image alignment, that is, that the pixel pairs of the bitemporal images are spatially registered, and detection accuracy is highly sensitive to the registration accuracy of the image pairs. In practical applications, obtaining well-registered image pairs is often challenging, and the align-first-then-detect approach is both inefficient and expensive. Explicitly integrating image alignment and change detection into one framework is an effective solution. However, offset information is poorly reflected in the high-level features of an image, making it hard to predict accurate image offsets and to correct the spatial relationships of land covers using the predicted offsets. To overcome these problems, we propose a residual offset-driven feature alignment network (ROFANet). ROFANet combines two innovative methods: residual offset prediction (ROP) and dual-branch feature correction (DFC). ROP utilizes multilevel features to achieve coarse-to-fine offset prediction, effectively enhancing the model's ability to predict image offsets. DFC establishes two branches, image correction and feature correction, which correct distorted images and distorted features, respectively. By optimizing the spatial-relationship representation of land covers, the model's change detection ability under unaligned image conditions is enhanced. Extensive experiments on three publicly available change detection datasets demonstrate that the proposed ROFANet achieves outstanding detection performance in unaligned image scenarios.
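Correcting features with a predicted offset field typically comes down to bilinear resampling. A minimal PyTorch sketch of that generic warp, assuming a dense per-pixel (dx, dy) offset map; this illustrates the operation behind such correction branches, not ROFANet's exact implementation:

```python
import torch
import torch.nn.functional as F

def warp_with_offsets(feat, offset):
    """Resample feat at positions shifted by a predicted offset field.

    feat   : (B, C, H, W) feature map to correct
    offset : (B, 2, H, W) per-pixel displacements in pixels (dx, dy)
    """
    b, _, h, w = feat.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=feat.device, dtype=feat.dtype),
        torch.arange(w, device=feat.device, dtype=feat.dtype),
        indexing="ij",
    )
    x_new = xs + offset[:, 0]   # (B, H, W) sampling x coordinates
    y_new = ys + offset[:, 1]   # (B, H, W) sampling y coordinates
    # normalize coordinates to [-1, 1] as grid_sample expects
    grid = torch.stack(
        (2 * x_new / (w - 1) - 1, 2 * y_new / (h - 1) - 1), dim=-1
    )
    return F.grid_sample(feat, grid, mode="bilinear", align_corners=True)
```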
{"title":"ROFANet: Residual Offset-Driven Feature Alignment Network for Unaligned Remote Sensing Image Change Detection","authors":"Guoqing Wang;He Chen;Wenchao Liu;Tianyu Wei;Panzhe Gu;Jue Wang","doi":"10.1109/JSTARS.2025.3639607","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3639607","url":null,"abstract":"At present, most remote sensing change detection methods are applicable to bitemporal image alignment scenarios, that is, assuming that the pixel pairs of bitemporal images are spatially registered. Detection accuracy is highly sensitive to the alignment accuracy of the image pairs. In practical applications, obtaining well-registered image pairs is often challenging. Currently, the approach of aligning first and then detecting is both inefficient and expensive. Explicitly integrating image alignment and change detection into a framework is an effective solution. However, offset information is difficult to be reflected in the high-level features of the image, it is hard to predict accurate image offsets, and it is also difficult to correct the spatial relationship of land covers in the image using the offset. To overcome the above problems, we propose a residual offset-driven feature alignment network (ROFANet). ROFANet combines two innovative methods: residual offset prediction (ROP) and dual-branch feature correction (DFC). ROP utilizes multilevel features to achieve offset prediction from coarse to fine granularity, effectively enhancing the model's predictive ability for image offsets. DFC has established two branches: image correction and feature correction, which respectively correct distorted images and distorted features. By optimizing the spatial relationship representation of land covers, the model's change detection ability under unaligned image conditions has been enhanced. Extensive experiments conducted on three publicly available change detection datasets demonstrate that the proposed ROFANet achieves outstanding detection performance in unaligned image scenarios.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1305-1320"},"PeriodicalIF":5.3,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11275653","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SwinCTC: Efficient Network for Superresolution Reconstruction of Remote Sensing Images Based on Nonlocal Feature Enhancement by Sliding Window Mechanism
Pub Date: 2025-12-02 | DOI: 10.1109/JSTARS.2025.3639298
Zhikai Wang;Yuehao Xiao;Zhumu Fu;Mengyang Li;Na Li
In transformer-based superresolution reconstruction for remote sensing images, the window attention mechanism has become a key method for reducing the quadratic complexity of traditional self-attention. However, window self-attention still requires significant computational resources, especially when processing large remote sensing images. To address this issue, we propose a convolutional block structure based on a sliding window mechanism, which replaces traditional window/sliding self-attention and drastically reduces computational complexity. It comprises a residual channel-enhanced attention (RCEA) module and group convolution, where RCEA dynamically refines the channel weights to improve the efficiency of the group convolution. In addition, the CTC-Block further refines the window feature representation by introducing a spatial attention enhancement module that focuses on key spatial details and selectively emphasizes informative regions within each window. Finally, a convolution-based feedforward network is introduced to strengthen the network's capacity to model high-frequency image information. Experimental results demonstrate that the proposed method outperforms other classical remote sensing image superresolution models in peak signal-to-noise ratio and structural similarity on the NWPU-RESISC45 and NWPU-VHR datasets. Compared with the baseline model SwinIR, the number of parameters is reduced by 39.4% and the number of floating-point operations by 42.4%, while the average inference time reaches 22.86 ms with performance maintained.
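The complexity gap being targeted can be made concrete with back-of-envelope multiply counts: global self-attention is quadratic in the number of tokens, windowed attention is linear in tokens but pays a w^2 factor per window, and (group) convolution pays only a fixed kernel factor. A rough Python comparison under illustrative sizes; the constants and layer shapes are simplified assumptions, not measured FLOPs for SwinCTC:

```python
# Approximate per-layer multiply counts for an H x W feature map with C channels.
H, W, C = 128, 128, 64
N = H * W                              # number of tokens

global_attn = 2 * N * N * C            # QK^T and attn@V: quadratic in N
win = 8
window_attn = 2 * N * (win * win) * C  # linear in N, w^2 cost per window
k = 3
conv = N * (k * k) * C * C             # dense conv: k^2 kernel, C^2 channel mixing
g = 8
group_conv = conv // g                 # group conv divides channel mixing by g

for name, muls in [("global attention", global_attn),
                   ("8x8 window attention", window_attn),
                   ("3x3 convolution", conv),
                   ("3x3 group conv (g=8)", group_conv)]:
    print(f"{name:>22}: {muls / 1e9:.2f} G multiplies")
```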
{"title":"SwinCTC: Efficient Network for Superresolution Reconstruction of Remote Sensing Images Based on Nonlocal Feature Enhancement by Sliding Window Mechanism","authors":"Zhikai Wang;Yuehao Xiao;Zhumu Fu;Mengyang Li;Na Li","doi":"10.1109/JSTARS.2025.3639298","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3639298","url":null,"abstract":"In transformer-based superresolution reconstruction tasks for remote sensing images, the window attention mechanism has become a key method for reducing the secondary complexity of traditional self-attention. However, the window self-attention mechanism still requires a significant amount of computational resources, especially when processing large remote sensing images. To address this issue, we propose a convolutional block structure based on the sliding window mechanism, which replaces the traditional window/sliding self-attention and drastically reduces computational complexity. It comprises a residual channel enhanced attention (RCEA) module and group convolution, which enhances the efficiency of group convolution by dynamically refining the channel weights through RCEA. In addition, the CTC-Block further refines the window feature representation by introducing a spatial attention enhancement module that focuses on key spatial details and selectively emphasizes the information regions within each window. Finally, a convolution-based feedforward network is introduced to bolster the network’s capacity to model high-frequency information in images. The experimental results demonstrate that the proposed method outperforms other classical remote sensing image superresolution reconstruction models in terms of peak signal-to-noise ratio and structural similarity evaluation metrics on the NWPU-RESISC45 and NWPU-VHR datasets. Compared with the baseline model SwinIR, the number of parameters is reduced by 39.4%, the number of floating-point operations is reduced by 42.4%, and the average inference speed reaches 22.86 ms while maintaining performance.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1240-1258"},"PeriodicalIF":5.3,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11271778","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhancing UAV Search Under Occlusion Using Next Best View Planning
Pub Date: 2025-12-01 | DOI: 10.1109/JSTARS.2025.3638881
Sigrid Helene Strand;Thomas Wiedemann;Bram Burczek;Dmitriy Shutin
Search and rescue missions are often critical following sudden natural disasters or in high-risk environmental situations. The most challenging missions involve difficult-to-access terrain, such as dense forests with heavy occlusion. Deploying uncrewed aerial vehicles for exploration can significantly enhance search effectiveness, facilitate access to challenging environments, and reduce search time. However, in dense forests the effectiveness of uncrewed aerial vehicles depends on their ability to capture clear views of the ground, necessitating a robust search strategy that optimizes camera positioning and perspective. This work presents an optimized planning strategy and an efficient algorithm for the next-best-view problem in occluded environments. Two novel optimization heuristics, a geometry heuristic and a visibility heuristic, are proposed to enhance search performance by selecting optimal camera viewpoints. Comparative evaluations in both simulated and real-world settings reveal that the visibility heuristic performs better, identifying over 90% of hidden objects in simulated forests and offering 10% higher detection rates than the geometry heuristic. In addition, real-world experiments demonstrate that the visibility heuristic provides better coverage under the canopy, highlighting its potential for improving search and rescue missions in occluded environments.
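At its core, next-best-view selection greedily maximizes newly visible area. A minimal sketch assuming each candidate viewpoint comes with a precomputed set of ground cells it can see through gaps in the canopy; the paper's geometry and visibility heuristics differ in how they score these candidates, which this toy version does not reproduce:

```python
def greedy_next_best_view(candidates, seen):
    """Pick the viewpoint that adds the most unseen ground cells.

    candidates : dict mapping viewpoint id -> set of visible ground cells
    seen       : set of ground cells already observed
    Returns (best viewpoint id, its newly visible cells).
    """
    best_vp, best_gain = None, set()
    for vp, visible in candidates.items():
        gain = visible - seen          # cells this pose would newly reveal
        if len(gain) > len(best_gain):
            best_vp, best_gain = vp, gain
    return best_vp, best_gain

# toy usage: two candidate poses over a gridded forest floor
candidates = {"vp1": {(0, 0), (0, 1)}, "vp2": {(0, 1), (1, 1), (2, 2)}}
vp, newly = greedy_next_best_view(candidates, seen={(0, 1)})
print(vp, newly)   # vp2 {(1, 1), (2, 2)}
```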
{"title":"Enhancing UAV Search Under Occlusion Using Next Best View Planning","authors":"Sigrid Helene Strand;Thomas Wiedemann;Bram Burczek;Dmitriy Shutin","doi":"10.1109/JSTARS.2025.3638881","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3638881","url":null,"abstract":"Search and rescue missions are often critical following sudden natural disasters or in high-risk environmental situations. The most challenging search and rescue missions involve difficult-to-access terrains, such as dense forests with high occlusion. Deploying uncrewed aerial vehicles for exploration can significantly enhance search effectiveness, facilitate access to challenging environments, and reduce search time. However, in dense forests, the effectiveness of uncrewed aerial vehicles depends on their ability to capture clear views of the ground, necessitating a robust search strategy to optimize camera positioning and perspective. This work presents an optimized planning strategy and an efficient algorithm for the next best view problem in occluded environments. Two novel optimization heuristics, a geometry heuristic, and a visibility heuristic, are proposed to enhance search performance by selecting optimal camera viewpoints. Comparative evaluations in both simulated and real-world settings reveal that the visibility heuristic achieves greater performance, identifying over 90% of hidden objects in simulated forests and offering 10% better detection rates than the geometry heuristic. In addition, real-world experiments demonstrate that the visibility heuristic provides better coverage under the canopy, highlighting its potential for improving search and rescue missions in occluded environments.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1085-1096"},"PeriodicalIF":5.3,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11271526","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LSDFormer: Lightweight SAR Ship Detection Enhanced With Efficient Multiattention and Structural Reparameterization
Pub Date: 2025-12-01 | DOI: 10.1109/JSTARS.2025.3639164
Rui Jiang;Hang Shi;Jiahong Ni;Jiatao Li;Yi Feng;Xinqiang Chen;Yinlin Li
Ship detection in synthetic aperture radar (SAR) images faces challenges such as strong background interference, variation in ship appearance and distribution, and high real-time requirements. Although attention-based deep learning methods dominate this field, the design of lightweight models with efficient attention mechanisms capable of addressing these challenges remains underexplored. To address this, we propose a lightweight SAR ship detection model named LSDFormer, built upon the MetaFormer architecture and consisting of an efficient multiattention-enhanced backbone and neck and a structural reparameterization (SR)-enhanced head. We employ two lightweight modules for the backbone and neck: a PoolFormer-based feature extraction module with efficient channel modulation attention, which enhances ship features and suppresses background interference, and a downsampling module using efficient channel aggregation attention and group convolutions, which enriches ship features. The position-sensitive attention from YOLOv11 is also introduced to handle variations in ship appearance and distribution. These three attentions are integrated into an efficient multiattention mechanism. Furthermore, an SR-based detection branch is proposed for the head of LSDFormer, which enhances ship features while reducing model complexity. Extensive experiments on the SSDD and HRSID datasets demonstrate the superiority and effectiveness of LSDFormer, achieving AP50 of 98.5 ± 0.4% and 92.8 ± 0.2%, respectively, with only 1.5 M parameters and 4.1 GFLOPs. The average processing time per image is 4.9 ms on SSDD and 4.2 ms on HRSID, confirming its real-time performance.
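The MetaFormer/PoolFormer backbone keeps the parameter count low by replacing self-attention with a parameter-free pooling token mixer. A sketch of that published mixer design (pooling minus identity); LSDFormer's attention modules sit on top of this skeleton and are not shown:

```python
import torch
import torch.nn as nn

class PoolingTokenMixer(nn.Module):
    """PoolFormer-style token mixer: local average pooling minus identity.

    Subtracting the input makes the residual branch carry only the
    neighborhood aggregation, mirroring what attention would mix, with
    zero learnable parameters.
    """

    def __init__(self, pool_size=3):
        super().__init__()
        self.pool = nn.AvgPool2d(pool_size, stride=1,
                                 padding=pool_size // 2,
                                 count_include_pad=False)

    def forward(self, x):          # x: (B, C, H, W)
        return self.pool(x) - x

y = PoolingTokenMixer()(torch.randn(2, 16, 32, 32))  # shape preserved
```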
{"title":"LSDFormer: Lightweight SAR Ship Detection Enhanced With Efficient Multiattention and Structural Reparameterization","authors":"Rui Jiang;Hang Shi;Jiahong Ni;Jiatao Li;Yi Feng;Xinqiang Chen;Yinlin Li","doi":"10.1109/JSTARS.2025.3639164","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3639164","url":null,"abstract":"Ship detection in synthetic aperture radar (SAR) images faces challenges such as strong background interference, varying ship appearance, and distribution and high real-time requirements. Although attention-based deep learning methods dominate this field, the design of lightweight models with efficient attention mechanisms capable of addressing the aforementioned challenges remains underexplored. To address this issue, we propose a lightweight SAR ship detection model named LSDFormer, which is built upon the MetaFormer architecture and consists of an efficient multiattention-enhanced backbone and neck and a structural reparameterization (SR)-enhanced head. We employ two lightweight modules for the backbone and neck: a PoolFormer-based feature extraction module with efficient channel modulation attention is proposed to enhance ship features and suppress background interference, and a downsampling module using efficient channel aggregation attention and group convolutions is introduced to enrich ship features. The position-sensitive attention from YOLOv11 is also introduced to handle variations in ship appearance and distribution. These three attentions are integrated into an efficient multiattention mechanism. Furthermore, an SR-based detection branch is proposed for the head of LSDFormer, which enhances ship features while reducing model complexity. Extensive experiments on SSDD and HRSID datasets demonstrate the superiority and effectiveness of LSDFormer, achieving AP50 of <inline-formula><tex-math>$mathbf {98.5pm 0.4%}$</tex-math></inline-formula> and <inline-formula><tex-math>$mathbf {92.8pm 0.2%}$</tex-math></inline-formula>, respectively, with only <inline-formula><tex-math>$mathbf {1.5}$</tex-math></inline-formula> M parameters and <inline-formula><tex-math>$mathbf {4.1}$</tex-math></inline-formula> GFLOPs. The average processing time per image is <inline-formula><tex-math>$mathbf {4.9}$</tex-math></inline-formula> ms on SSDD and <inline-formula><tex-math>$mathbf {4.2}$</tex-math></inline-formula> ms on HRSID, confirming its real-time performance.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1359-1377"},"PeriodicalIF":5.3,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11271640","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anomaly Detection of InSAR Time-Series Displacements Based on Generative Adversarial Network
Pub Date: 2025-12-01 | DOI: 10.1109/JSTARS.2025.3639018
Siting Xiong;Zhichao Deng;Bochen Zhang;Jiayuan Zhang;Chisheng Wang
Interferometric synthetic aperture radar (InSAR) is a widely applied and highly efficient tool for monitoring large-scale, long-term ground displacements. In most applications, evaluating InSAR land displacement results relies primarily on the displacement rate/velocity. However, fully exploiting time-series information is becoming increasingly important, as the growing temporal coverage of SAR datasets yields ever-longer displacement sequences for each target. Effectively and efficiently detecting abnormal timestamps from a full time series of InSAR displacements is therefore critical in InSAR postanalysis. To this end, we propose a novel approach that automatically detects anomalous timestamps in InSAR-derived time-series displacements based on improved time-series anomaly detection using generative adversarial networks (TadGAN). The improved TadGAN generates a reference (normal) displacement time series, which is compared with the InSAR-derived series to obtain an anomaly score for each timestamp. Based on these anomaly scores, the ratio of anomalous timestamps and the maximum anomaly score are calculated to assess risk levels, and anomalous sequences are classified into abrupt and trend types using a simple convolutional neural network integrated with an attention mechanism. The proposed method was applied to InSAR-derived ground displacements of the Hong Kong–Zhuhai–Macao Bridge. The results show that the proposed method successfully detects the start and end of anomalous sequences and produces anomaly maps that are more accurate than displacement rate maps. The trained model can also be applied directly to other regions, as validated by InSAR results for the Kowloon Peninsula in Hong Kong.
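The postprocessing described here, turning per-timestamp anomaly scores into a risk level, reduces to two statistics over the score series. A numpy sketch in which the flagging threshold and the risk cut-offs are illustrative assumptions, not values from the paper:

```python
import numpy as np

def risk_from_scores(scores, threshold=2.0):
    """Summarize per-timestamp anomaly scores into a risk level.

    scores    : 1-D array, one anomaly score per acquisition date
                (e.g., reconstruction error combined with a critic score)
    threshold : score above which a timestamp is flagged anomalous
    """
    scores = np.asarray(scores, dtype=float)
    flagged = scores > threshold
    ratio = flagged.mean()          # fraction of anomalous timestamps
    peak = scores.max()             # maximum anomaly score
    if ratio > 0.2 or peak > 2 * threshold:
        level = "high"
    elif flagged.any():
        level = "medium"
    else:
        level = "low"
    return ratio, peak, level

print(risk_from_scores([0.3, 0.5, 2.8, 4.6, 0.4]))  # (0.4, 4.6, 'high')
```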
{"title":"Anomaly Detection of InSAR Time-Series Displacements Based on Generative Adversarial Network","authors":"Siting Xiong;Zhichao Deng;Bochen Zhang;Jiayuan Zhang;Chisheng Wang","doi":"10.1109/JSTARS.2025.3639018","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3639018","url":null,"abstract":"Interferometric synthetic aperture radar (InSAR) is a widely applied and highly efficient tool for monitoring large-scale and long-term ground displacements. In most applications, evaluating InSAR results on land displacements primarily depends on the displacement rate/velocity. However, the full exploitation of time-series information is becoming increasingly important as the increasing temporal coverage of the SAR dataset can lead to the composition of different sequences of one target. Effectively and efficiently detecting abnormal timestamps from a full-time series of InSAR displacement results is critical in InSAR postanalysis. To this end, we propose a novel approach to automatically detect anomalous timestamps in InSAR-derived time-series displacements based on improved time-series anomaly detection using generative adversarial networks (TadGAN). The improved TadGAN generates a normal time-series displacement compared with the InSAR-derived time-series displacement to obtain anomaly scores for each timestamp. Based on these anomaly scores, the ratio of anomalous timestamps and maximum anomaly scores was calculated to assess the risk levels, and the anomalous sequences were classified into abrupt and trend types using a simple convolutional neural network integrated with the attention mechanism. The proposed method was applied to the InSAR-derived ground displacements of the Hong Kong–Zhuhai–Macao Bridge. The results show that the proposed method successfully detects the start and end of anomalous sequences and produces anomaly maps that are more accurate than displacement rate maps. The trained model can also be applied directly to other regions, as validated by the InSAR results of the Kowloon Peninsula in Hong Kong.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1227-1239"},"PeriodicalIF":5.3,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11271588","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Evaluation of Soil Temperature Predictions Based on the Long Short-Term Memory Model and Remote Sensing Data
Pub Date: 2025-12-01 | DOI: 10.1109/JSTARS.2025.3638765
Zihan Yuan;Jilin Gu;Yaoqi Lu;Yuwei Li
Soil temperature is a key variable in several fields of Earth science, and accurate predictions of soil temperature at different depths are of great significance for scientific research and agricultural production. Soil temperature observations from meteorological stations are discrete and discontinuous, whereas moderate resolution imaging spectroradiometer (MODIS) remote sensing data can support routine, large-scale soil temperature prediction. In this study, three MODIS products, the normalized difference vegetation index (NDVI), atmospheric precipitable water vapor (PWV), and land surface temperature (LST), together with daily average soil temperature measurements at depths of 40, 100, and 200 cm below the ground surface in Liaoning Province from 2017 to 2021, were used to establish a soil temperature prediction model based on the long short-term memory (LSTM) model. To improve prediction accuracy and stability, an optimized LSTM model that considers the hysteresis (lag) of soil temperature relative to the surface temperature was established and compared against the baseline LSTM. The LSTM models based on the fusion of remote sensing data (NDVI, PWV, and LST) and soil temperature data at 40, 100, and 200 cm, and the optimized hysteresis-aware LSTM models, obtained R2 values of 0.86, 0.81, 0.69 and 0.90, 0.91, 0.88, respectively, with RMSE values of 0.30, 3.41, 3.74 °C and 0.30, 3.41, 3.74 °C. Moreover, the SDRMSE (standard deviation of the RMSE) of the hysteresis-aware model decreased compared with that of the baseline LSTM. Both models can achieve long-term daily prediction of inter-annual soil temperature, while the model that considers hysteresis shows a better fit and stability. Thus, the hysteresis-aware model is more advantageous for obtaining accurate predictions of spatially continuous multidepth soil temperatures.
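How a hysteresis (lag) term enters such a model can be shown concretely: the input window simply carries surface drivers from earlier days alongside the current ones. A minimal PyTorch sketch with an illustrative 7-day lag and 30-day window; the paper's actual lag length, window, and layer sizes are not specified here, so all sizes are assumptions:

```python
import torch
import torch.nn as nn

def build_lagged_windows(features, lag, window):
    """Stack current drivers (NDVI, PWV, LST) with LST lagged by `lag` days.

    features : (T, 3) daily tensor ordered [NDVI, PWV, LST]
    Returns inputs of shape (N, window, 4); the 4th channel is lagged LST,
    the hysteresis term linking today's soil temperature to past LST.
    """
    lst_lagged = features[:-lag, 2:3]                         # LST at day t - lag
    aligned = torch.cat([features[lag:], lst_lagged], dim=1)  # (T - lag, 4)
    return torch.stack([aligned[i:i + window]
                        for i in range(aligned.shape[0] - window + 1)])

class SoilTempLSTM(nn.Module):
    def __init__(self, n_inputs=4, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_inputs, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)   # daily soil temperature at one depth

    def forward(self, x):                 # x: (N, window, 4)
        h, _ = self.lstm(x)
        return self.out(h[:, -1])         # predict from the last time step

x = build_lagged_windows(torch.randn(400, 3), lag=7, window=30)
pred = SoilTempLSTM()(x)                  # (364, 1) daily predictions
```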
{"title":"An Evaluation of Soil Temperature Predictions Based on the Long Short-Term Memory Model and Remote Sensing Data","authors":"Zihan Yuan;Jilin Gu;Yaoqi Lu;Yuwei Li","doi":"10.1109/JSTARS.2025.3638765","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3638765","url":null,"abstract":"Soil temperature is a key variable in several fields of Earth science, and accurate predictions of soil temperature at different depths are of great significance for scientific research and agricultural production. Soil temperature data observations by meteorological stations are categorized as discrete and discontinuous, and moderate resolution imaging spectroradiometer (MODIS) remotely sensed data are used to perform routine soil temperature predictions on a large scale. In this study, data from three MODIS products, namely, normalized vegetation index, atmospheric precipitable water, and surface temperature, and daily average soil temperature measurements at depths of 40, 100, and 200 cm from the ground surface in Liaoning Province from 2017 to 2021 were used to establish a soil temperature prediction model based on the long short-term memory (LSTM) model. To improve the prediction accuracy and stability, an optimized LSTM model was established to perform comparative predictions of soil temperature concentrations based on the LSTM model, and it considered the hysteresis factor of soil temperature relative to the surface temperature. The LSTM soil temperature prediction models established based on the fusion of remote sensing data (NDVI, PWV, and LST) and soil temperature data at 40, 100, and 200 cm from the surface and the optimized LSTM models that considered hysteresis obtained <italic>R</i><sup>2</sup> values of 0.86, 0.81, 0.69, and 0.90, 0.91, 0.88, respectively, with RMSE values of 0.30, 3.41, 3.74 °C, and 0.30, 3.41, 3.74 °C. Moreover, the SD<sub>RMSE</sub> of the optimized model considering hysteresis decreased compared to that of the LSTM. The LSTM models before and after optimization can achieve long-term daily temperature prediction of inter-annual soil temperature, although the prediction model that considers the hysteresis factor had a better fit and stability. Thus, the model considering hysteresis is more advantageous for obtaining accurate predictions of spatially continuous multidepth soil temperatures.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1212-1226"},"PeriodicalIF":5.3,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11271761","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}