Pub Date : 2026-01-14 | DOI: 10.1109/JSTARS.2026.3654017
Ali Al Bataineh;Bandi Vamsi;Scott Alan Smith
Accurate prediction of the water quality index is essential for protecting public health and managing freshwater resources. Existing models often rely on arbitrary weight initialization and make limited use of ensemble learning, which results in unstable performance and reduced interpretability. This study introduces a hybrid machine learning framework that combines feature-informed neural network initialization with gradient boosting (XGBoost) to address these limitations. Neural network weights are initialized using feature significance scores derived from SHapley Additive exPlanations (SHAP), and predictions are iteratively refined using XGBoost. The model was trained and evaluated on a public freshwater quality dataset and compared against several baselines, including random forest, support vector regression, a conventional artificial neural network with Xavier initialization, and an XGBoost-only model. Our framework achieved an accuracy of 86.9%, an F1-score of 0.849, and an area under the receiver operating characteristic curve of 0.894, outperforming all comparative methods. Ablation experiments showed that the SHAP-based initialization and the boosting component each improved performance over simpler baselines.
{"title":"A Hybrid Machine Learning Framework for Water Quality Index Prediction Using Feature-Based Neural Network Initialization","authors":"Ali Al Bataineh;Bandi Vamsi;Scott Alan Smith","doi":"10.1109/JSTARS.2026.3654017","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3654017","url":null,"abstract":"Accurate prediction of the water quality index is essential for protecting public health and managing freshwater resources. Existing models often rely on arbitrary weight initialization and make limited use of ensemble learning, which results in unstable performance and reduced interpretability. This study introduces a hybrid machine learning framework that combines feature-informed neural network initialization with gradient boosting (XGBoost) to address these limitations. Neural network weights are initialized using feature significance scores derived from SHapley Additive exPlanations (SHAP) and predictions are iteratively refined using XGBoost. The model was trained and evaluated using the public quality of freshwater dataset and compared against several baselines, including random forest, support vector regression, a conventional artificial neural network with Xavier initialization, and an XGBoost-only model. Our framework achieved an accuracy of 86.9%, an <italic>F</i>1-score of 0.849, and a receiver operating characteristic–area under the curve of 0.894, outperforming all comparative methods. Ablation experiments showed that both the SHAP-based initialization and the boosting component each improved performance over simpler baselines.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"4887-4905"},"PeriodicalIF":5.3,"publicationDate":"2026-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11353250","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146082021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-13 | DOI: 10.1109/JSTARS.2026.3653626
Xiaole Lin;Guangping Li;Jiahua Xie;Zhuokun Zhi
While convolutional neural network (CNN)-based methods for small object detection in remote sensing imagery have advanced considerably, substantial challenges remain unresolved, primarily stemming from complex backgrounds and insufficient feature representation. To address these issues, we propose a novel architecture specifically designed to accommodate the unique demands of small objects, termed AMFC-DEIM. This framework introduces three key innovations: first, the adaptive one-to-one (O2O) matching mechanism, which enhances dense O2O matching by adaptively adjusting the matching grid configuration to the object distribution, thereby preserving the resolution of small objects throughout training; second, the focal convolution module, engineered to explicitly align with the spatial characteristics of small objects for extracting fine-grained features; and third, the enhanced normalized Wasserstein distance, which stabilizes the training process and bolsters performance on small targets. Comprehensive experiments conducted on three benchmark remote sensing small object detection datasets (RSOD, LEVIR-SHIP, and NWPU VHR-10) demonstrate that AMFC-DEIM achieves remarkable performance, attaining AP$_{50}$ scores of 96.2%, 86.2%, and 95.1%, respectively, while maintaining only 5.27 M parameters. These results substantially outperform several established benchmark models and state-of-the-art methods.
{"title":"AMFC-DEIM: Improved DEIM With Adaptive Matching and Focal Convolution for Remote Sensing Small Object Detection","authors":"Xiaole Lin;Guangping Li;Jiahua Xie;Zhuokun Zhi","doi":"10.1109/JSTARS.2026.3653626","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3653626","url":null,"abstract":"While convolutional neural network (CNN)-based methods for small object detection in remote sensing imagery have advanced considerably, substantial challenges remain unresolved, primarily stemming from complex backgrounds and insufficient feature representation. To address these issues, we propose a novel architecture specifically designed to accommodate the unique demands of small objects, termed AMFC-DEIM. This framework introduces three key innovations: first, the adaptive one-to-one (O2O) matching mechanism, which enhances dense O2O matching by adaptively adjusting the matching grid configuration to the object distribution, thereby preserving the resolution of small objects throughout training; second, the focal convolution module, engineered to explicitly align with the spatial characteristics of small objects for extracting fine-grained features; and third, the enhanced normalized Wasserstein distance, which stabilizes the training process and bolsters performance on small targets. Comprehensive experiments conducted on three benchmark remote sensing small object detection datasets: RSOD, LEVIR-SHIP and NWPU VHR-10, demonstrate that AMFC-DEIM achieves remarkable performance, attaining AP<inline-formula><tex-math>$_{50}$</tex-math></inline-formula> scores of 96.2%, 86.2%, and 95.1%, respectively, while maintaining only 5.27 M parameters. These results substantially outperform several established benchmark models and state-of-the-art methods.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"5021-5034"},"PeriodicalIF":5.3,"publicationDate":"2026-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11347584","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146082020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-13 | DOI: 10.1109/JSTARS.2026.3653676
Jiapeng Huang;Yue Zhang;Xiaozhu Yang;Fan Mo
Forest canopy height is a critical structural parameter for accurately assessing forest carbon storage. This study integrates Global Ecosystem Dynamics Investigation (GEDI) LiDAR data with multisource remote sensing features to construct a multidimensional feature space comprising 13 parameters. By employing high-dimensional feature vectors of “spatial coordinates + environmental features,” the proposed deep learning-based neural network-guided interpolation (NNGI) model effectively harnesses the capacity of deep learning to model complex nonlinear relationships and adaptively extract local features. This method adopts a dual-network collaborative architecture to dynamically learn interpolation weights based on environmental similarity in the feature space, rather than relying on fixed parameters or merely considering spatial distance, thereby effectively fusing the complex nonlinear relationship modeling capability of deep learning with the concept of spatial interpolation. Experiments conducted across five representative regions in the United States demonstrate that the NNGI model significantly outperforms traditional machine learning methods (Pearson correlation coefficient r = 0.79, root-mean-square error (RMSE) = 5.38 m, mean absolute error = 4.04 m, bias = –0.15 m). In areas with low (0%–20%) and high (61%–80%) vegetation cover fractions, the RMSE decreased by 37.52% and 5.37%, respectively, while the r-value increased by 15.87% and 35.90%, respectively. Regarding different slope aspects, the RMSE for southeastern and western slopes decreased by 30.38% and 18.70%, respectively. This study provides a more reliable solution for the accurate estimation of forest structural parameters in complex environments.
{"title":"A Deep Learning-Based Model for Forest Canopy Height Mapping Using Multisource Remote Sensing Data","authors":"Jiapeng Huang;Yue Zhang;Xiaozhu Yang;Fan Mo","doi":"10.1109/JSTARS.2026.3653676","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3653676","url":null,"abstract":"Forest canopy height is a critical structural parameter for accurately assessing forest carbon storage. This study integrates Global Ecosystem Dynamics Investigation (GEDI) LiDAR data with multisource remote sensing features to construct a multidimensional feature space comprising 13 parameters. By employing high-dimensional feature vectors of “spatial coordinates + environmental features,” the proposed deep learning-based neural network-guided interpolation (NNGI) model effectively harnesses the capacity of deep learning to model complex nonlinear relationships and adaptively extract local features. This method adopts a dual-network collaborative architecture to dynamically learn interpolation weights based on environmental similarity in the feature space, rather than relying on fixed parameters or merely considering spatial distance, thereby effectively fusing the complex nonlinear relationship modeling capability of deep learning with the concept of spatial interpolation. Experiments conducted across five representative regions in the United States demonstrate that the overall accuracy of the NNGI model significantly outperforms traditional machine learning methods, Pearson correlation coefffcient (<italic>r</i>) = 0.79, root-mean-square error (RMSE) = 5.38 m, mean absolute error = 4.04 m, bias = –0.15 m. In areas with low (0% –20% ) and high (61% –80% ) vegetation cover fractions, the RMSE decreased by 37.52% and 5.37%, respectively, while the <italic>r</i>-value increased by 15.87% and 35.90%, respectively. Regarding different slope aspects, the RMSE for southeastern and western slopes decreased by 30.38% and 18.70%, respectively. This study provides a more reliable solution for the accurate estimation of forest structural parameters in complex environments.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"4842-4857"},"PeriodicalIF":5.3,"publicationDate":"2026-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11348094","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146082031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-12 | DOI: 10.1109/JSTARS.2026.3653452
Yao Xiao;Dianwei Shao;Suhui Wu;Yu Cai;Haili Li;Lichao Zhuang;Yuyue Xu;Yubin Fan;Chang-Qing Ke
Heavy rainfall in June 2024 caused a dramatic expansion of East Dongting Lake, located in northeastern Hunan Province, central China, and a breach occurred at Tuanzhouyuan within the lake region on 5th July. Optical remote sensing, synthetic aperture radar (SAR), and satellite altimetry provided essential data on inundation and water level changes. Using bitemporal Sentinel-1 SAR data, this study constructed a water body change detection dataset and applied the MambaBCD change detection model. The results showed that MambaBCD, which is built on state space models, achieved superior performance, with an F1 score of 91.9% and a notably strong ability to identify boundaries and small change areas. The inundation extent of East Dongting Lake from April to August 2024 was mapped using the MambaBCD model and bitemporal Sentinel-1 imagery. A sharp increase in inundation was observed in late June, with the water body expanding to 1142.4 ± 98 km² by 4th July. In late July, the water body area began to decrease rapidly. In addition, the latest radar altimetry mission, Surface Water and Ocean Topography (SWOT), surpassed Sentinel-3 in monitoring water levels, capturing a peak of 34 m in early July during this flood event, with levels returning to normal by late August. This flooding event, triggered by the heavy rainfall, inundated over 600 km² of cropland and 95% of the buildings in Tuanzhouyuan, resulting in significant economic losses.
{"title":"Monitoring the 2024 Abrupt Flood Event in East Dongting Lake via Deep Learning and Multisource Remote Sensing Data","authors":"Yao Xiao;Dianwei Shao;Suhui Wu;Yu Cai;Haili Li;Lichao Zhuang;Yuyue Xu;Yubin Fan;Chang-Qing Ke","doi":"10.1109/JSTARS.2026.3653452","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3653452","url":null,"abstract":"Heavy rainfall in June 2024 caused a dramatic expansion of East Dongting Lake, located in northeastern Hunan Province, central China, and a breach occurred at Tuanzhouyuan within the lake region on 5th July. Optical remote sensing, synthetic aperture radar (SAR), and satellite altimetry provided essential data on inundation and water level changes. Using bitemporal Sentinel-1 SAR data, this study constructed a water body change detection dataset and applied the MambaBCD change detection models. The results showed that MambaBCD, based on state space models, showed superior performance, achieving an F1 score of 91.9% and demonstrates superior ability in identifying boundaries and small change areas. The inundation extent of East Dongting Lake from April to August 2024 was mapped using the MambaBCD model and bitemporal Sentinel-1 imagery. A sharp increase in inundation was observed in late June, with the water body expanding to 1142.4 ± 98 km<sup>2</sup> by 4th July. In late July, the water body area began to decrease rapidly. In addition, the latest radar altimeter, surface water and ocean topography surpassed Sentinel-3 in monitoring water levels, capturing a peak of 34 m in early July during this flood event, with levels returning to normal by late August. This flooding event was caused by heavy rainfall over 600 km<sup>2</sup> of cropland, with 95% of the buildings in Tuanzhouyuan being inundated, resulting in significant economic losses.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"5602-5617"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11347475","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146175787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-12 | DOI: 10.1109/JSTARS.2026.3651900
Enyu Zhao;Yu Shi;Nianxin Qu;Yulei Wang;Hang Zhao
Infrared small target detection focuses on accurately identifying tiny targets with low signal-to-noise ratio against complex backgrounds, a critical challenge in the field of infrared image processing. Existing approaches frequently fail to retain small target information during global semantic extraction and struggle to preserve detailed features and achieve effective feature fusion. To address these limitations, this article proposes a morphology-edge enhanced triple-cascaded network (MEETNet) for infrared small target detection. The network employs a triple-cascaded architecture that maintains high resolution and enhances information interaction between different stages, facilitating effective multilevel feature fusion while safeguarding deep small-target characteristics. MEETNet integrates an edge-detail enhanced module (EDEM) and a detail-aware multiscale fusion module (DMSFM). These modules introduce edge-detail enhanced features that amalgamate contrast and edge information, thereby amplifying target saliency and improving edge representation. Specifically, EDEM augments target contrast and edge structures by integrating edge-detail-enhanced features with shallow details. This integration improves the discriminative capacity of shallow features for detecting small targets. Moreover, DMSFM implements a multireceptive field mechanism to merge target details with deep semantic insights, enabling the capture of more distinctive global contextual features. Experimental evaluations conducted using two public datasets—NUAA-SIRST and NUDT-SIRST—demonstrate that the proposed MEETNet surpasses existing state-of-the-art methods for infrared small target detection in terms of detection accuracy.
{"title":"MEETNet: Morphology-Edge Enhanced Triple-Cascaded Network for Infrared Small Target Detection","authors":"Enyu Zhao;Yu Shi;Nianxin Qu;Yulei Wang;Hang Zhao","doi":"10.1109/JSTARS.2026.3651900","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3651900","url":null,"abstract":"Infrared small target detection is focused on accurately identifying tiny targets with low signal-to-noise ratio against complex backgrounds, representing a critical challenge in the field of infrared image processing. Existing approaches frequently fail to retain small target information during global semantic extraction and struggle with preserving detailed features and achieving effective feature fusion. To address these limitations, this article proposes a morphology-edge enhanced triple-cascaded network (MEETNet) for infrared small target detection. The network employs a triple-cascaded architecture that maintains high resolution and enhances information interaction between different stages, facilitating effective multilevel feature fusion while safeguarding deep small-target characteristics. MEETNet integrates an edge-detail enhanced module (EDEM) and a detail-aware multi-scale fusion module (DMSFM). These modules introduce edge-detail enhanced features that amalgamate contrast and edge information, thereby amplifying target saliency and improving edge representation. Specifically, EDEM augments target contrast and edge structures by integrating edge-detail-enhanced features with shallow details. This integration improves the discriminability capacity of shallow features for detecting small targets. Moreover, DMSFM implements a multireceptive field mechanism to merge target details with deep semantic insights, enabling the capture of more distinctive global contextual features. Experimental evaluations conducted using two public datasets—NUAA-SIRST and NUDT-SIRST—demonstrate that the proposed MEETNet surpasses existing state-of-the-art methods for infrared small target detection in terms of detection accuracy.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"4748-4765"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11340625","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146082064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-12 | DOI: 10.1109/JSTARS.2026.3651534
Wanying Song;Ruijing Zhu;Jie Wang;Yinyin Jiang;Yan Wu
Deep forest-based models for synthetic aperture radar (SAR) image change detection are generally challenged by noise sensitivity and high feature redundancy, which significantly degrade prediction performance. To address these issues, this article proposes a structure-constrained and feature-screened deep forest, abbreviated as SC-FS-DF, for SAR image change detection. In preclassification, a fuzzy multineighborhood information C-means clustering is proposed to generate high-quality pseudo-labels. It introduces edge information and nonlocal and intrasuperpixel neighborhoods into the objective function of fuzzy local information C-means, thus suppressing speckle noise and constraining the structures of targets. In the sample learning and label prediction module, a feature-screened deep forest (FS-DF) framework is constructed by combining feature importance and redundancy analysis with a dropout strategy, thus screening out noninformative features while retaining informative ones for learning at each cascade layer. Finally, a novel energy function fusing the nonlocal and superpixel information is derived for refining the detection map generated by FS-DF, further preserving fine details and edge locations. Extensive comparison and ablation experiments on five real SAR datasets verify the effectiveness and robustness of the proposed SC-FS-DF, and demonstrate that it can effectively screen high-dimensional features in change detection and constrain the structures of targets.
{"title":"Feature-Screened and Structure-Constrained Deep Forest for Unsupervised SAR Image Change Detection","authors":"Wanying Song;Ruijing Zhu;Jie Wang;Yinyin Jiang;Yan Wu","doi":"10.1109/JSTARS.2026.3651534","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3651534","url":null,"abstract":"Deep forest-based models for synthetic aperture radar (SAR) image change detection are generally challenged by noise sensitivity and high feature redundancy, which significantly degrade the prediction performance. To address these issues, this article proposes a structure-constrained and feature-screened deep forest, abbreviated as SC-FS-DF, for SAR image change detection. In preclassification, a fuzzy multineighborhood information C-means clustering is proposed to generate high-quality pseudo-labels. It introduces the edge information, the nonlocal and intrasuperpixel neighborhoods into the objective function of fuzzy local information C-means, thus suppressing the speckle noise and constraining structures of targets. In the sample learning and label prediction module, a feature-screened deep forest (FS-DF) framework is constructed by combining feature importance and redundancy analysis with a dropout strategy, thus screening out the noninformative features and meanwhile retaining the informative ones for learning at each cascade layer. Finally, a novel energy function fusing the nonlocal and superpixel information is derived for refining the detection map generated by FS-DF, further preserving fine details and edge locations. Extensive comparison and ablation experiments on five real SAR datasets verify the effectiveness and robustness of the proposed SC-FS-DF, and demonstrate that the SC-FS-DF can well screen the high-dimensional features in change detection and constrain the structures of targets.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"4056-4068"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11339914","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-12 | DOI: 10.1109/JSTARS.2026.3651075
Yu Yao;Hengbin Wang;Xiang Gao;Ziyao Xing;Xiaodong Zhang;Yuanyuan Zhao;Shaoming Li;Zhe Liu
High-resolution remote sensing images provide crucial data support for applications such as precision agriculture and water resource management. However, super-resolution reconstructions often suffer from over-smoothed textures and structural distortions, failing to accurately recover the intricate details of ground objects. To address this issue, this article proposes a remote sensing image super-resolution network (DTWSTSR) that combines the Dual-Tree Complex Wavelet Transform and the Swin Transformer, enhancing texture detail reconstruction by fusing frequency-domain and spatial-domain features. The model includes a Dual-Tree Complex Wavelet Texture Feature Sensing Module (DWTFSM) for integrating frequency and spatial features, and a Multiscale Efficient Channel Attention mechanism to enhance attention to multiscale and global details. In addition, we design a Kolmogorov–Arnold Network based on a branch attention mechanism, which improves the model’s ability to represent complex nonlinear features. During training, we investigate the impact of hyperparameters and propose a two-stage SSIM&SL1 loss function to reduce structural differences between images. Experimental results show that DTWSTSR outperforms existing mainstream methods under different magnification factors (×2, ×3, ×4), ranking among the top two in multiple metrics. For example, at ×2 magnification, its PSNR is 0.64–2.68 dB higher than that of other models. Visual comparisons demonstrate that the proposed model achieves clearer and more accurate detail reconstruction of target ground objects. Furthermore, the model exhibits excellent generalization ability in cross-sensor image reconstruction (OLI2MSI dataset).
{"title":"DTWSTSR: Dual-Tree Complex Wavelet and Swin Transformer Based Remote Sensing Images Super-Resolution Network","authors":"Yu Yao;Hengbin Wang;Xiang Gao;Ziyao Xing;Xiaodong Zhang;Yuanyuan Zhao;Shaoming Li;Zhe Liu","doi":"10.1109/JSTARS.2026.3651075","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3651075","url":null,"abstract":"High-resolution remote sensing images provide crucial data support for applications such as precision agriculture and water resource management. However, super-resolution reconstructions often suffer from over-smoothed textures and structural distortions, failing to accurately recover the intricate details of ground objects. To address this issue, this article proposes a remote sensing image super-resolution network (DTWSTSR) that combines the Dual-Tree Complex Wavelet Transform and Swin Transformer, which enhances the ability of texture detail reconstruction by fusing frequency-domain and spatial-domain features. This model includes a Dual-Tree Complex Wavelet Texture Feature Sensing Module (DWTFSM) for integrating frequency and spatial features, and a Multiscale Efficient Channel Attention mechanism to enhance attention to multiscale and global details. In addition, we design a Kolmogorov–Arnold Network based on a branch attention mechanism, which improves the model’s ability to represent complex nonlinear features. During the training process, we investigate the impact of hyperparameters and propose the two-stage SSIM&SL1 loss function to reduce structural differences between images. Experimental results show that DTWSTSR outperforms existing mainstream methods under different magnification factors (×2, ×3, ×4), ranking among the top two in multiple metrics. For example, at ×2 magnification, its PSNR value is 0.64–2.68 dB higher than that of other models. Visual comparisons demonstrate that the proposed model achieves clearer and more accurate detail reconstruction of target ground objects. Furthermore, the model exhibits excellent generalization ability in cross-sensor image (OLI2MSI dataset) reconstruction.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"4730-4747"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11329193","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146082033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-12 | DOI: 10.1109/JSTARS.2026.3651639
Yang Liu;Kun Zhang;Chun-Yi Song;Zhi-Wei Xu
In high-resolution maritime radar working in scanning mode, the classification and identification of ships require recovering the ship’s high-resolution range profiles (HRRPs) from radar returns. The return signal from the ship is a complex sparse signal corrupted by non-Gaussian sea clutter. In this article, three sparse optimization methods matching the non-Gaussian characteristics of sea clutter, i.e., the sparse optimization matching K-distribution method, the sparse optimization matching generalized Pareto distribution method, and the sparse optimization matching CGIG distribution method, are proposed to estimate complex HRRPs of ships. The compound Gaussian model is used to describe the non-Gaussianity of sea clutter, and the sparsity of ships’ complex HRRPs is constrained by a one-parameter random distribution. In all three methods, the Anderson–Darling test is used to search for the parameters of the sparse constraint model. Moreover, the non-Gaussian characteristics of sea clutter depend on the marine environment parameters and radar operating parameters. For different scenarios, the minimal Kolmogorov–Smirnov distance criterion is used to select the best of the three compound Gaussian models, and the corresponding proposed method is then applied. Simulated and measured radar data are used to evaluate the performance of the proposed methods, and the results show that they obtain better estimates of ship HRRPs than the recent SRIM method and the classical SLIM method.
{"title":"Estimation of Ships’ Complex High-Resolution Range Profiles Based on Sparse Optimization Method in Non-Gaussian Sea Clutter","authors":"Yang Liu;Kun Zhang;Chun-Yi Song;Zhi-Wei Xu","doi":"10.1109/JSTARS.2026.3651639","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3651639","url":null,"abstract":"In high-resolution maritime radar working in scanning mode, the classification and identification of ships require the recovery of the ship’s high-resolution range profiles (HRRPs) from radar returns. The return signal from the ship is a complex sparse signal interfered by non-Gaussian sea clutter. In this article, three sparse optimization methods matching the non-Gaussian characteristics of sea clutter, i.e., the sparse optimization matching K-distribution method, the sparse optimization matching generalized Pareto distribution method, the sparse optimization matching CGIG distribution method, are proposed to estimate complex HRRPs of ships. The compound Gaussian model is used to describe the non-Gaussianity of sea clutter, and the sparsity of ships’ complex HRRPs is constrained by the random distribution with one parameter. In the three methods, the Anderson–Darling test is used to search the parameters of the sparse constraint model. Besides, the non-Gaussian characteristics of sea clutter depend on the marine environment parameters and radar operating parameters. For different scenarios, the minimal criterion of the Kolmogorov–Smirnov distance is used to select the best model from the three compound Gaussian models, and then select the corresponding proposed methods. Simulated and measured radar data are used to evaluate the performance of the proposed methods and the results show that the proposed methods obtain better estimates of ship HRRPs compared to the recent SRIM method and the classical SLIM method.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"3998-4013"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11339885","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-12 | DOI: 10.1109/JSTARS.2026.3651577
Wei Huang;JiaLu Li;Qiqiang Chen;Junru Yin;Jiqiang Niu;Le Sun
In recent years, the integration of convolutional neural networks and Transformers has significantly advanced hyperspectral image (HSI) classification by jointly capturing local and global features. However, most existing methods focus primarily on the fusion of spectral–spatial features while neglecting the complementary information contained in frequency-domain features. To address this issue, we propose a spatial–frequency cross-attention fusion network (SFCFNet) that jointly models spectral, spatial, and frequency-domain features for HSI classification. The framework consists of three core modules. First, the multiscale spectral–spatial feature learning module extracts joint spectral–spatial features using multiscale 3-D and 2-D convolutions. Next, the triple-branch representation module employs three branches to capture global spatial features of large-scale structures, local spatial features of fine-grained textures, and multiscale frequency features based on Haar wavelet decomposition, providing complementary multidomain representations for subsequent deep fusion. Finally, the dual-domain feature cross-attention fusion module achieves effective fusion of spatial structures and frequency-domain textures, enhancing the model’s ability to separate complex backgrounds from fine-grained targets and thereby improving classification performance. Compared with other methods, SFCFNet achieves higher overall accuracy on the Salinas, Houston2013, WHU-Hi-LongKou, and Xuzhou datasets, reaching 99.05%, 98.07%, 98.76%, and 98.18%, respectively.
{"title":"SFCFNet: A Spatial–Frequency Cross-Attention Fusion Network for Hyperspectral Image Classification","authors":"Wei Huang;JiaLu Li;Qiqiang Chen;Junru Yin;Jiqiang Niu;Le Sun","doi":"10.1109/JSTARS.2026.3651577","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3651577","url":null,"abstract":"In recent years, the integration of convolutional neural networks and Transformers has significantly advanced hyperspectral image (HSI) classification by jointly capturing local and global features. However, most existing methods primarily focus on the fusion of spectral–spatial features while neglecting the complementary information contained in frequency-domain features. To address this issue, we propose a spatial–frequency cross-attention fusion network (SFCFNet) that jointly models spectral, spatial, and frequency-domain features for HSI classification. The framework consists of three core modules: first, the multiscale spectral–spatial feature learning module extracts joint spectral spatial features using multiscale 3-D and 2-D convolutions. Next, the triple-branch representation module employs three branches to capture global spatial features of large-scale structures, local spatial features of fine-grained textures, and multiscale frequency features based on Haar wavelet decomposition, providing complementary multidomain representations for subsequent deep fusion. Finally, the dual-domain feature cross-attention fusion module achieves effective fusion of spatial structures and frequency-domain textures, enhancing the model’s ability to separate complex backgrounds from fine-grained targets and thereby improving classification performance. Compared with other methods, SFCFNet achieves higher overall accuracy on the Salinas, Houston2013, WHU-Hi-LongKou, and Xuzhou datasets, reaching 99.05%, 98.07%, 98.76%, and 98.18%, respectively.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"4994-5008"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11340627","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146082022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-12 | DOI: 10.1109/JSTARS.2026.3652404
Ali Caglayan;Nevrez Imamoglu;Toru Kouyama
Self-supervised pretraining has emerged as a powerful approach for learning transferable representations from large-scale unlabeled data, significantly reducing reliance on task-specific labeled datasets. Although masked autoencoders (MAEs) have shown considerable success in optical remote sensing, such as RGB and multispectral imagery, their application to synthetic aperture radar (SAR) data remains underexplored due to SAR’s unique imaging characteristics, including speckle content and intensity variability. In this work, we investigate the effectiveness of MAEs for SAR pretraining, specifically applying MixMAE [Liu et al., 2023] to Sentinel-1 SAR imagery. We introduce SAR-W-MixMAE, a domain-aware self-supervised learning approach that incorporates an SAR-specific pixelwise weighting strategy into the reconstruction loss, mitigating the effects of speckle content and high-intensity backscatter variations. Experimental results demonstrate that SAR-W-MixMAE consistently improves on baseline models in multilabel SAR image classification and flood detection tasks, extending state-of-the-art performance on the popular BigEarthNet dataset. Extensive ablation studies reveal that pretraining duration and fine-tuning dataset size significantly impact downstream performance. In particular, early stopping during pretraining can yield optimal downstream task accuracy, challenging the assumption that prolonged pretraining enhances results. These insights contribute to the development of foundation models tailored for SAR imagery and provide practical guidelines for optimizing pretraining strategies in remote sensing applications.
{"title":"SAR-W-MixMAE: Polarization-Aware Self-Supervised Pretraining for Masked Autoencoders on SAR Data","authors":"Ali Caglayan;Nevrez Imamoglu;Toru Kouyama","doi":"10.1109/JSTARS.2026.3652404","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3652404","url":null,"abstract":"Self-supervised pretraining has emerged as a powerful approach for learning transferable representations from large-scale unlabeled data, significantly reducing reliance on task-specific labeled datasets. Although masked autoencoders (MAEs) have shown considerable success in optical remote sensing, such as RGB and multispectral imagery, their application to synthetic aperture radar (SAR) data remains underexplored due to their unique imaging characteristics, including speckle content and intensity variability. In this work, we investigate the effectiveness of MAEs for SAR pretraining, specifically applying MixMAE [Liu, et al.,(2023)] to Sentinel-1 SAR imagery. We introduce SAR-W-MixMAE, a domain-aware self-supervised learning approach that incorporates an SAR-specific pixelwise weighting strategy into the reconstruction loss, mitigating the effects of speckle content and high-intensity backscatter variations. Experimental results demonstrate that SAR-W-MixMAE consistently improves baseline models in multilabel SAR image classification and flood detection tasks, extending the state-of-the-art performance on the popular BigEarthNet dataset. Extensive ablation studies reveal that pretraining duration and fine-tuning dataset size significantly impact downstream performance. In particular, early stopping during pretraining can yield optimal downstream task accuracy, challenging the assumption that prolonged pretraining enhances results. These insights contribute to the development of foundation models tailored for SAR imagery and provide practical guidelines for optimizing pretraining strategies in remote sensing applications.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"5590-5601"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11344788","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146175697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}