Pub Date: 2025-12-23 DOI: 10.1109/JSTARS.2025.3647100
Yongkang Hu;Yupei Wang;Liang Chen
In unsupervised domain adaptation for semantic segmentation of remote sensing imagery, identical land-cover classes across different domains often exhibit substantial variations in appearance, scale, and class distribution, which seriously hinder cross-domain generalization. Moreover, even within a single domain, land-cover classes present highly complex and diverse intraclass distributions that cannot be effectively captured by a single class representation, further increasing the challenge of generalization. To this end, we propose a collaborative framework integrating local pixel-level contrast and global Gaussian multiprototype bidirectional alignment. At the local level, we introduce probability-masked contrastive learning, which adaptively increases the sampling probability of minority classes to mitigate the class imbalance issue. Meanwhile, pixel contrastive learning is incorporated to enhance robustness to cross-domain variations in appearance and texture. At the global level, we employ a Gaussian mixture model to represent each source-domain class with multiple Gaussian prototypes rather than a single one, thereby yielding richer and more fine-grained class representations. Building on this, a bidirectional alignment strategy is proposed. Concretely, the forward alignment uses the multiple prototypes as semantic anchors that progressively guide target-domain features to align with the source-domain class distributions, reducing the intraclass variance. Meanwhile, the reverse alignment dynamically refines the prototypes to increase anchor accuracy, further enhancing the stability and discriminability of cross-domain alignment. Experimental results on the widely used ISPRS and LoveDA datasets demonstrate the superiority of our proposed method over state-of-the-art approaches.
Title: Local Pixel-Contrast and Global Gaussian Multiprototype Bidirectional Alignment for Unsupervised Domain Adaptation of Semantic Segmentation in Remote Sensing Imagery
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 3573-3588
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11313335
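The abstract above does not say how the sampling probability of minority classes is adapted. As a minimal sketch only, assuming a simple inverse-frequency weighting (the function name `class_sampling_probs`, the `temperature` parameter, and the toy labels are all hypothetical and not taken from the paper), the idea of sampling rare classes more often can be illustrated as:

```python
from collections import Counter


def class_sampling_probs(labels, temperature=1.0):
    """Return per-class sampling probabilities that favour rare classes.

    Assumed scheme (not the paper's): weight_c = (total / count_c) ** temperature,
    normalised so that the probabilities sum to 1.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    weights = {c: (total / n) ** temperature for c, n in counts.items()}
    norm = sum(weights.values())
    return {c: w / norm for c, w in weights.items()}


# Toy pixel labels: "car" is the minority class.
labels = ["building"] * 6 + ["road"] * 3 + ["car"] * 1
probs = class_sampling_probs(labels)
```

With `temperature=0` every class is sampled uniformly, and larger values push more of the sampling budget toward the rarest classes.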
Pub Date: 2025-12-23 DOI: 10.1109/JSTARS.2025.3647047
Nan Mo;Gege Ma;Dengfeng Xie;Guangyun Zhang
Given the limitations of traditional feature coding in capturing multiscale information and precise segmentation, existing deep learning-based change detection (CD) methods often suffer from problems such as a high missed detection ratio and poor boundary segmentation accuracy when dealing with changed regions with considerable scale differences and irregular geometric shapes. Recently, the emergence of vision foundation models has presented new opportunities for enhancing the feature representation and position encoding capabilities for remote sensing images. To address these challenges, this study proposes a novel segment anything model 2 (SAM2) assisted multilevel dual encoder–single decoder (SAM2-MDESD) CD method. First, an innovative dual-stream Hiera encoder is designed. This encoder leverages SAM2’s powerful feature encoding to extract multilevel features and positional information from bitemporal images effectively, thereby markedly improving performance on multiscale and irregular changes. Second, a grouped-channel self-attention enhancement module is designed to increase the saliency of changed regions in multilevel difference feature maps. This module contributes to improved integrity of the detected changed areas. Finally, a multilevel difference feature map joint prediction loss function is proposed, which leverages hierarchical constraints across features to suppress false detection caused by shallow features, thereby further improving the precision of changed regions. Experimental results on the CLCD and WHU-CD datasets demonstrate that the proposed method achieves F1-scores of 80.02% and 93.76%, respectively, outperforming existing state-of-the-art CD methods. Notably, SAM2-MDESD demonstrates remarkable advantages in reducing the missed detection ratio and improving segmentation accuracy for multiscale and irregularly shaped changes.
Title: SAM2-MDESD: An SAM2-Assisted Multilevel Dual Encoder–Single Decoder Method for Optical Remote Sensing Image Change Detection
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 2588-2604
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11313499
Pub Date: 2025-12-22 DOI: 10.1109/JSTARS.2025.3646776
Gan Liu;Linlin Shi;Jialong Lai;Feifei Cui;Xiaoping Zhang;Yi Xu
Detecting Martian dust devils remains a challenging task due to the scarcity of high-quality annotated data, significant variations in scale, blurred boundaries, and complex surface textures. To address these difficulties, we construct a cross-regional, manually annotated benchmark dataset named MDD-Human and propose a novel transformer-based detection network, Mars dust devil detection transformer (MDT). The model adopts FasterNet as its backbone to ensure a balance between computational efficiency and feature extraction capability. A key innovation lies in the multiscale attention fusion module, which incorporates hierarchical fusion strategies and hybrid attention mechanisms to effectively enhance the representation of dust devil features under diverse Martian terrains. In addition, we introduce a shape-aware localization loss function, shape-augmented minimum point distance IoU, which improves geometric sensitivity by integrating corner distance constraints and structural shape priors. Experimental results on the MDD-Human dataset demonstrate that MDT achieves 92.7% Precision, 90.8% Recall, 92.4% mAP@50, and 91.8% F1-score, outperforming several classical and state-of-the-art detectors. Further tests on unseen THEMIS and CRISM datasets confirm the model’s strong cross-source generalization, highlighting its robustness and applicability in diverse Martian imaging scenarios.
Title: A Multiscale Attention Transformer for Martian Dust Devil Detection in Remote Sensing Imagery
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 2696-2712
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11309725
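The dust devil abstract names a "shape-augmented minimum point distance IoU" loss but gives no formula. The sketch below shows only the plain minimum point distance IoU (MPDIoU) as it is commonly formulated, i.e. IoU penalised by the squared distances between matching top-left and bottom-right corners, normalised by the squared image diagonal. The shape-augmentation term of the paper is not reproduced, and the function names and the `(x1, y1, x2, y2)` box convention are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mpdiou(pred, target, img_w, img_h):
    """Plain MPDIoU: IoU minus normalised squared corner distances."""
    d_top_left = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d_bottom_right = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    norm = img_w ** 2 + img_h ** 2  # squared diagonal of the input image
    return iou(pred, target) - d_top_left / norm - d_bottom_right / norm


def mpdiou_loss(pred, target, img_w, img_h):
    """Loss form: zero when the two boxes coincide exactly."""
    return 1.0 - mpdiou(pred, target, img_w, img_h)


same = mpdiou_loss((0, 0, 10, 10), (0, 0, 10, 10), 100, 100)
shifted = mpdiou((0, 0, 10, 10), (5, 5, 15, 15), 100, 100)
```

Unlike plain IoU, the corner-distance terms keep the value sensitive to misalignment even when two boxes do not overlap at all.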
Accurate estimation of surface net longwave radiation (LW_net) and turbulent fluxes (Shle) is crucial for understanding the mechanisms of surface energy balance (SEB) and for developing SEB-based approaches to estimating land surface temperature (LST) under cloudy skies. Parameterized or empirical regression methods based on remotely sensed data are effective ways to acquire LW_net and Shle. Current methods are limited in global application due to the scarcity of observation sites and the absence of remote-sensing observations under cloudy conditions. To overcome these issues, this study developed multiple linear regression (MLR) models based on data from 62 global sites covering 12 International Geosphere–Biosphere Program (IGBP) land cover types to estimate LW_net and Shle, in which net shortwave radiation (SW_net), normalized difference vegetation index (NDVI), normalized difference moisture index (NDMI), and digital elevation model (DEM) were used as variables. Model performance was evaluated with an independent dataset at both overall and seasonal scales. Then, the models were applied to remote-sensing products to estimate all-weather LW_net and Shle at a spatial resolution of 500 m, and the estimates were assessed against in situ data from five sites located in semiarid and arid regions in Northwest China. The results showed that the introduction of NDMI significantly improved prediction accuracy for most land cover types. The root-mean-square error (RMSE) of predicted LW_net ranged from 18.57 to 29.13 W/m² with a mean RMSE of 25.65 W/m², and the RMSE of Shle ranged from 51.45 to 103.79 W/m² with a mean RMSE of 65.16 W/m². Application of the models to remote-sensing products, in which SW_net was provided by the FY-4B surface shortwave radiation product, showed that the RMSE of estimated LW_net was 27.13 W/m² in summer, 25.46 W/m² in autumn, and 17.14 W/m² in winter. For Shle, the average RMSE was 65.07 W/m² in summer, 39.25 W/m² in autumn, and 19.38 W/m² in winter. Although the accuracy declined slightly over complex vegetation types and in summer, the models exhibited robust applicability across different land cover types and seasons. This study provides an efficient, generalized method for estimating LW_net and Shle, which shows promise for studies of energy balance at regional and global scales and for retrieval of LST under cloudy skies using SEB-based methods.
Title: Prediction of Net Longwave Radiation and Turbulent Fluxes Using Remote-Sensing-Derived Net Shortwave Radiation for Different Land Cover Types
Authors: Jingwen Wang;Lei Lu;Xiaoming Zhou;Guanghui Huang;Zihan Chen
Pub Date: 2025-12-22 DOI: 10.1109/JSTARS.2025.3646720
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 2864-2878
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11309741
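The regression setup described in the radiation abstract (a multiple linear regression on SW_net, NDVI, NDMI, and DEM, scored by RMSE) can be sketched in plain Python. This is an illustrative sketch only: the coefficients and the eight sample rows are synthetic values invented for the example, the helper names are hypothetical, and the paper's actual fitting procedure and per-land-cover models are not reproduced.

```python
import math


def fit_ols(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for k in range(p):
            if k != col:
                f = A[k][col]
                A[k] = [vk - f * vc for vk, vc in zip(A[k], A[col])]
    return [A[i][p] for i in range(p)]  # [intercept, b_sw, b_ndvi, b_ndmi, b_dem]


def predict(coef, X):
    return [coef[0] + sum(c * x for c, x in zip(coef[1:], r)) for r in X]


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Synthetic predictors: SW_net (W/m^2), NDVI, NDMI, elevation (km).
X = [
    [100, 0.2, 0.10, 1.2], [250, 0.5, 0.30, 1.5], [400, 0.3, -0.10, 2.0],
    [320, 0.7, 0.40, 0.8], [150, 0.1, 0.00, 3.1], [500, 0.6, 0.20, 1.0],
    [280, 0.4, 0.25, 2.4], [90, 0.8, 0.50, 0.3],
]
true_coef = [-20.0, -0.1, 15.0, 10.0, -2.0]  # made-up values, noise-free target
y = predict(true_coef, X)

coef = fit_ols(X, y)
error = rmse(y, predict(coef, X))
```

Because the synthetic target is noise-free, the fit recovers the made-up coefficients and the RMSE is close to zero; with real site data the RMSE would be computed on a held-out set, as the abstract describes.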
Trajectory error modeling plays a significant role in the effective application of trajectory data in various fields. Classical trajectory error modeling methods construct the trajectory error band based on the velocity of a moving object or the trajectory's geometric features. Since both the velocity and trajectory geometric features influence the trajectory uncertainty, a modified method is proposed that integrates the velocity of the moving object and trajectory geometric features to construct the uncertain region. This method is developed based on the previous geometry-based model, the broad adaptive error ellipse model. First, two groups of Minkowski coefficients were derived based on the global geometric feature and the local velocities of moving objects, respectively. Then, the optimal set of Minkowski coefficients was obtained by integrating the above two sets of Minkowski coefficients. Last, the trajectory uncertainty was modeled on the basis of the optimal Minkowski coefficients and the measurement error of the sampled points. The proposed modified method was verified by using three trajectory datasets. The results prove that the modified method can provide an error band that can enclose the real trajectory with a relatively small area in most cases.
Title: A Modified Trajectory Error Modeling Method Integrating Moving Object Velocity and Trajectory Geometry
Authors: Yanmin Jin;Zhixian Luo;Xinyi Zheng;Xiaohua Tong;Yongjiu Feng;Huan Xie;Xiong Xu
Pub Date: 2025-12-19 DOI: 10.1109/JSTARS.2025.3646151
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 2621-2640
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11304715
Pub Date: 2025-12-19 DOI: 10.1109/JSTARS.2025.3646137
Jilun Peng;Estel Cardellach;Weiqiang Li
The availability of complex signals in global navigation satellite system-reflectometry (GNSS-R) has gained growing attention across a range of applications, due to its capacity to preserve phase information and provide high along-track resolution. HydroGNSS will employ a high-rate complex signal mode, known as the “coherent channel,” to capture both the in-phase and quadrature components of signals reflected at the specular point. This work presents a simulation framework for analyzing the coherent channel in GNSS-R at high temporal resolution. Given the computational resources and limitations of fully detailed simulations, a simplified model is proposed, which incorporates a well-established reflectivity model and new empirical and theoretical functions. The surface reflectivity model estimates reflectivity from parameters such as soil roughness, moisture, vegetation cover, and water fraction, while the innovative modeling block, here called the complex field model, derives the coherent and diffuse field amplitudes from reflectivity and assigns electromagnetic phases that behave, statistically, as in actual data. The validation is conducted at different levels, first using actual amplitude and reflectivity measurements as input, then starting from auxiliary surface information, yielding correlations with the coherence coefficient of 0.93, 0.74, and 0.61, respectively. This validation approach facilitates the differentiation of the errors introduced by each of the modules. The results support the feasibility of the proposed framework as a practical and quick tool to investigate complex signals under varying reflected surface conditions. Higher accuracy will require a tighter integration of the surface reflectivity model and the complex field model.
Title: A Simplified Model for Simulating Complex Signals in GNSS-Reflectometry over Land
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 2474-2484
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11304548
High spatiotemporal resolution soil moisture (SM) products can provide high-frequency SM information at the farmland scale for agricultural management processes. Existing SM retrieval methods for large-scale agricultural regions struggle to achieve both high temporal and spatial resolution simultaneously. In order to generate high spatiotemporal resolution SM in large-scale agricultural areas, this article has developed an SM retrieval model by taking advantage of multisource heterogeneous remote sensing data including active and passive microwaves and advanced data regression methods. First, multisource heterogeneous data with different spatiotemporal resolutions are fused to construct driving variables. Then, the multisource heterogeneous driving variables are coupled with the proposed transformer regression network to drive the SM retrieval task and generate high spatiotemporal resolution SM. We analyzed the advantages of the combination of active and passive microwaves. Compared with the SM derived from single microwave data, the SM derived from the combination of active and passive microwaves can reflect more detailed spatial and temporal information. The analysis of the retrieval performance of different retrieval methods shows that SM retrieved using multisource heterogeneous data and transformer regression has better accuracy, with a coefficient of determination of 0.9220. Moreover, the retrieved SM is not only highly consistent with the situation reflected by the U.S. Drought Monitor data, but also has a higher spatial accuracy when compared with the existing typical data products with a time resolution of one day. This article provides a feasible option for generating large-scale SM data with high spatiotemporal resolution for better agricultural water resource management.
Title: A High Spatiotemporal Resolution Soil Moisture Retrieval Approach Leveraging Deep Regression Networks and Multisource Remote Sensing Data
Authors: Xiaofei Kuang;Liping Wan;Shiyu Xiang;Pengliang Wei;Jiao Guo;Hanwen Yu
Pub Date: 2025-12-18 DOI: 10.1109/JSTARS.2025.3646044
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 2557-2574
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11303755
Pub Date : 2025-12-18 DOI: 10.1109/JSTARS.2025.3645917
Dongdong Feng;Tao Che;Liyun Dai;Liang Gao;Fei Wu;Yanxing Hu;Yang Zhang;Guigang Wang;Yueling Shi
Significant uncertainties persist in current snow depth (SD) datasets due to variations in sensor characteristics and retrieval algorithms. This study systematically evaluated five SD datasets derived from the Ice, Cloud, and land Elevation Satellite-2 (ICESat-2), passive microwave products (AMSR2 and GlobSnow), and reanalysis products (ERA5 and the Modern-Era Retrospective analysis for Research and Applications, Version 2) across 13 representative snow-covered regions in the Northern Hemisphere. A novel dynamic upscaling approach was developed by integrating simple averaging with regression kriging for ICESat-2 SD data. Spatial matching of multisource datasets was then conducted using 3342 SD observation sites to comprehensively evaluate the uncertainties among the five datasets. The results indicate that the retrieval errors of each dataset are positively correlated with the mean SD across different regions. During snow accumulation and melt periods, ICESat-2 demonstrates significant advantages in nonforested areas with SDs ranging from 5 to 45 cm. Both passive microwave and reanalysis SD products demonstrate reliable performance during stable snow periods. However, both product types often miss snow during melt seasons even when ground observations confirm its presence. The causes of this phenomenon differ between the two product types: passive microwave retrievals are primarily dominated by the physical properties of liquid water, whereas reanalysis products face limitations due to model structure and insufficient input data. In conclusion, the integration of station and ICESat-2 SD in nonforested regions may provide new possibilities for validating products.
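The site-based evaluation described in this abstract can be illustrated with a minimal sketch. The abstract does not name the error metrics used, so bias and root-mean-square error are assumed here as typical choices; the function name `bias_and_rmse` and the sample values are hypothetical and are not taken from the study.

```python
import math


def bias_and_rmse(station_cm, product_cm):
    """Return (bias, RMSE) of a gridded SD product against matched station SD.

    Both inputs are equal-length sequences of snow depth in centimetres,
    already matched in space and time (the matching step is not shown).
    Bias and RMSE are assumed metrics; the abstract does not specify them.
    """
    if len(station_cm) != len(product_cm) or len(station_cm) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Error is defined as product minus station observation.
    errors = [p - s for s, p in zip(station_cm, product_cm)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse


# Hypothetical matched values, for illustration only.
station = [10.0, 20.0, 30.0]
product = [12.0, 18.0, 33.0]
bias, rmse = bias_and_rmse(station, product)
print(f"bias = {bias:.2f} cm, RMSE = {rmse:.2f} cm")
```

Computing such statistics per region and comparing them with the regional mean SD is one way the reported positive correlation between retrieval error and mean SD could be examined.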
{"title":"Uncertainty Characterization of ICESat-2, Passive Microwave, and Reanalysis Snow Depth Datasets Using Site Data in the Northern Hemisphere","authors":"Dongdong Feng;Tao Che;Liyun Dai;Liang Gao;Fei Wu;Yanxing Hu;Yang Zhang;Guigang Wang;Yueling Shi","doi":"10.1109/JSTARS.2025.3645917","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3645917","url":null,"abstract":"Significant uncertainties persist in current snow depth (SD) datasets due to variations in sensor characteristics and retrieval algorithms. This study systematically evaluated five SD datasets derived from ice, cloud, and land elevation satellite-2 (ICESat-2), passive microwave products (AMSR2 and GlobSnow), and reanalysis products (ERA5 and modern-era retrospective analysis for research and applications, version 2) across 13 representative snow-covered regions in the Northern Hemisphere. A novel dynamic upscaling approach was developed by integrating simple averaging with regression kriging for ICESat-2 SD data. Spatial matching of multisource datasets was then conducted using 3342 SD observation sites to comprehensively evaluate the uncertainties among the five datasets. The results indicate that the retrieval errors of each dataset are positively correlated with the mean SD across different regions. During snow accumulation and melt periods, ICESat-2 demonstrates significant advantages in nonforested areas with SDs ranging from 5 to 45 cm. Both passive microwave and reanalysis SD products demonstrate reliable performance during stable snow periods. However, products often miss snow during melt seasons despite ground confirmation. The causes of this phenomenon differ between the two datasets: passive microwave retrievals are primarily dominated by the physical properties of liquid water, whereas reanalysis products face limitations due to model structure and insufficient input data. 
In conclusion, the integration of station and ICESat-2 SD in nonforested regions may provide new possibilities for validating products.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"2525-2542"},"PeriodicalIF":5.3,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11303762","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145929391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DEtection TRansformer (DETR) has received significant recognition for its ability to streamline the design process of object detectors through the concept of set prediction. However, its exceptional performance comes at the cost of a high parameter count and significant computational requirements. Moreover, its ability to detect small objects is compromised, making it less suitable for analyzing high-altitude Uncrewed Aerial Vehicle (UAV) images. This article proposes UAV-DETR, a DETR architecture specifically designed for object detection in UAV images captured at high altitudes, balancing parameter count and precision. UAV-DETR is built in two steps: First, inverted residual structures are used to preserve low-dimensional image features, followed by a carefully designed cascaded linear attention mechanism to mitigate parameter redundancy. Through observation and analysis of the attention diffusion issue in the encoder, a cross-channel dynamic sampling mechanism is proposed, which effectively expands the model’s receptive field while maintaining accuracy. In addition, the loss function is redesigned by incorporating the Wasserstein distance, which is insensitive to bounding boxes, to accelerate model convergence. Extensive experimental results on two major benchmarks, i.e., VisDrone and UAVDT, validate the simplicity and efficiency of our model. Specifically, on the VisDrone2021 public test set, UAV-DETR exhibits superior performance with only 14 million parameters compared to YOLOv8m, reducing the model’s parameter count and complexity by 44% and 10%, respectively, while achieving a 16.6% improvement in accuracy, without any data augmentation or postprocessing procedures.
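The abstract does not give the exact form of the Wasserstein term in the loss. A minimal sketch of one common formulation follows, assuming each box (cx, cy, w, h) is modeled as a 2-D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4); under that assumption the squared 2-Wasserstein distance has the closed form used below. The function names and the `scale` constant are hypothetical, and this is not necessarily the loss used in UAV-DETR.

```python
import math


def wasserstein_sq(box_a, box_b):
    """Squared 2-Wasserstein distance between two boxes modeled as Gaussians.

    Each box is (cx, cy, w, h), treated as N((cx, cy), diag(w^2/4, h^2/4)).
    For such axis-aligned Gaussians the distance reduces to a squared
    Euclidean distance between the vectors (cx, cy, w/2, h/2).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    return ((cxa - cxb) ** 2 + (cya - cyb) ** 2
            + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)


def wasserstein_similarity(box_a, box_b, scale=12.8):
    """Map the distance into (0, 1]; 1 means identical boxes.

    `scale` is a hypothetical normalizing constant chosen for illustration.
    """
    return math.exp(-math.sqrt(wasserstein_sq(box_a, box_b)) / scale)


def iou(box_a, box_b):
    """Plain intersection-over-union for (cx, cy, w, h) boxes, for comparison."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    inter_w = max(0.0, min(cxa + wa / 2, cxb + wb / 2) - max(cxa - wa / 2, cxb - wb / 2))
    inter_h = max(0.0, min(cya + ha / 2, cyb + hb / 2) - max(cya - ha / 2, cyb - hb / 2))
    inter = inter_w * inter_h
    union = wa * ha + wb * hb - inter
    return inter / union if union > 0 else 0.0


# Two tiny boxes that do not overlap: IoU is zero, yet the
# Wasserstein-based similarity still varies smoothly with their offset.
pred, target = (0, 0, 2, 2), (3, 4, 2, 2)
print(iou(pred, target), wasserstein_similarity(pred, target))
```

A frequently cited motivation for this kind of term in small-object detection is that it still provides a graded signal when boxes do not overlap, whereas IoU drops to zero.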
{"title":"UAV-DETR: Few-Parameter DETR for Small Object Detection in High-Altitude UAV Images","authors":"Ningsheng Liao;Yuning Zhang;Zhongliang Yu;Jiangshuai Huang;Mi Zhu;Bo Peng","doi":"10.1109/JSTARS.2025.3645731","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3645731","url":null,"abstract":"<bold>DE</b>tection <bold>TR</b>ansformer (DETR) has received significant recognition for its ability to streamline the design process of object detectors through the concept of set prediction. However, its exceptional performance comes at the cost of a high parameter count and significant computational requirements. Moreover, its ability to detect small objects is compromised, making it less suitable for analyzing high-altitude <bold>U</b>ncrewed<bold>A</b>erial <bold>V</b>ehicle (UAV) images. This article proposes UAV-DETR, a DETR architecture specifically designed for detecting UAV images captured at high altitudes, balancing parameter count and precision. UAV-DETR is built in two steps: First, inverted residual structures are used to preserve low-dimensional image features, followed by a carefully designed cascaded linear attention mechanism to mitigate parameter redundancy. Through observation and analysis of the attention diffusion issue in the encoder, a cross-channel dynamic sampling mechanism is proposed, which effectively expands the model’s receptive field while maintaining accuracy. In addition, the loss function is redesigned by incorporating the Wasserstein distance, which is insensitive to bounding boxes, to accelerate model convergence. Extensive experimental results on two major benchmarks, i.e., VisDrone and UAVDT, validate the simplicity and efficiency of our model. 
Specifically, on the VisDrone2021 public test set, UAV-DETR exhibits superior performance with only 14 million parameters compared to YOLOv8<inline-formula><tex-math>$_{m}$</tex-math></inline-formula>, reducing the model’s parameter count and complexity by 44<inline-formula><tex-math>$%$</tex-math></inline-formula> and 10<inline-formula><tex-math>$%$</tex-math></inline-formula>, respectively, while achieving a 16.6<inline-formula><tex-math>$%$</tex-math></inline-formula> improvement in accuracy, without any data augmentation or postprocessing procedures.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"2575-2587"},"PeriodicalIF":5.3,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11303730","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145929340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-18 DOI: 10.1109/JSTARS.2025.3646025
Qinggang Wu;Chao Ma;Mengkun He;Zedong Wu;Qinge Wu
The combination of convolutional neural networks and attention mechanisms effectively enhances feature representation capabilities in hyperspectral image (HSI) classification. However, most existing methods have large parameter counts and high computational overhead, which hinders their application when computing and storage resources are limited. To address these issues, we propose an ultralightweight multidomain feature extraction network with cross spatial–spectral attention (ULMN-CS2A) for HSI classification, which primarily consists of three modules, i.e., the collaborative frequency-spatial–spectral (CFSS) feature extraction module, Gaussian neighboring pixel ReLU (GNReLU) activation, and cross spatial–spectral attention (CSSA). First, the ultralightweight CFSS module is designed to replace traditional lightweight convolutional layers by independently extracting features from the frequency, spatial, and spectral domains. Second, the GNReLU module enhances the network's nonlinear fitting ability and improves interlayer information transmission by aggregating neighboring pixels with Gaussian weights. Third, the lightweight CSSA module captures the paired pixel-level spatial–spectral relationships and enhances the global context representation ability by simultaneously learning their interactions. Extensive experiments demonstrate that the proposed ULMN-CS2A method is strongly competitive with state-of-the-art lightweight methods in terms of model parameters, FLOPs, and classification performance under small sampling rates. Meanwhile, ULMN-CS2A-MSP achieves an excellent classification result of 82.31% in terms of open-overall accuracy on the Salinas Valley dataset for the open-set HSI classification task.
{"title":"An Ultralightweight Multidomain Feature Extraction Network With Cross Spatial–Spectral Attention for Hyperspectral Image Classification","authors":"Qinggang Wu;Chao Ma;Mengkun He;Zedong Wu;Qinge Wu","doi":"10.1109/JSTARS.2025.3646025","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3646025","url":null,"abstract":"The combination of convolutional neural networks and attention mechanisms effectively enhances feature representation capabilities in hyperspectral image (HSI) classification. However, most existing methods face great challenges in terms of parameter numbers and computational overhead, which hinders their applications when computing and storage resources are limited. To address these issues, we propose an ultralightweight multidomain feature extraction network with cross spatial–spectral attention (ULMN-CS2A) for HSI classification, which primarily consists of three modules, i.e., the collaborative frequency-spatial–spectral (CFSS) feature extraction module, Gaussian neighboring pixel ReLU (GNReLU) activation, and cross spatial–spectral attention (CSSA). First, the ultralightweight CFSS module is designed to replace traditional lightweight convolutional layers by independently extracting features from the frequency, spatial, and spectral domains. Second, the GNReLU module enhances the network's nonlinear fitting ability and improves interlayer information transmission by aggregating neighboring pixels with Gaussian weights. Third, the lightweight CSSA module captures the paired pixel-level spatial–spectral relationships and enhances the global context representation ability by simultaneously learning their interactions. Extensive experiments demonstrate that the proposed ULMN-CS2A method shows strong competitiveness compared to state-of-the-art lightweight methods in terms of model parameters, FLOPs, and classification performance under small sampling rates. 
Meanwhile, ULMN-CS2A-MSP achieves an excellent classification result of 82.31% in terms of open-overall accuracy on Salinas Valley dataset for open-set HSI classification task.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"2829-2849"},"PeriodicalIF":5.3,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11303540","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145982197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}