Pub Date: 2026-01-15 | DOI: 10.1109/JSTARS.2026.3654346
Jianshang Liao;Liguo Wang
Hyperspectral image (HSI) classification faces critical challenges in effectively modeling long-range dependencies while maintaining computational efficiency and synergistically exploiting spatial-spectral information. Convolutional neural networks (CNNs) are constrained by local receptive fields, transformers suffer from quadratic computational complexity, and existing state space model (SSM)-based methods lack sophisticated cross-domain interaction mechanisms. This article proposes Spatial-Spectral Attentive Mamba (SSA-Mamba), a novel classification approach addressing these limitations through three synergistic innovations. First, a dual-branch independent modeling strategy allocates separate parameter spaces for spatial and spectral feature extraction via parallel SSMs, preventing feature coupling while enabling domain-specific learning. Second, an asymmetric cross-domain attention mechanism allows spatial features to actively query spectral information through multihead attention, establishing adaptive fusion via gating mechanisms and channel attention. Third, a multiscale residual architecture operating at module-internal, block-internal, and global pathway levels achieves hierarchical feature fusion while maintaining numerical stability through exponential parameterization. The recursive computation mechanism of SSMs enables each position to aggregate global historical information through compact hidden states, achieving O(L) linear complexity compared to transformers’ O(L²) quadratic complexity. Extensive experiments on three benchmark datasets—Houston2013, WHU-Hi-HongHu, and XiongAn—validate the effectiveness of these innovations. SSA-Mamba achieves overall accuracies of 93.98%, 93.58%, and 96.06%, surpassing state-of-the-art approaches by 1.27%, 0.25%, and 1.27%, respectively.
The dual-branch design enables effective discrimination of spectrally similar categories, improving Brassica variety classification by 19.21–23.33 percentage points over coupled-feature approaches. The cross-domain attention mechanism enhances urban land cover classification, with Commercial and Highway categories improving by 1.74% and 15.66%. On the large-scale XiongAn dataset (5.92 million pixels), SSA-Mamba demonstrates exceptional scalability with peak GPU memory of only 317.89 MB and per-sample inference time of 0.646 ms, providing an efficient solution for real-time HSI processing. The source code for SSA-Mamba will be made publicly available online.
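The O(L) claim above comes from the SSM recurrence: each position updates a fixed-size hidden state instead of attending to every previous position. A minimal sketch of such a linear-time scan (illustrative only, not the authors' implementation; all matrices here are arbitrary):

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Minimal state space scan: h_t = A h_{t-1} + B x_t, y_t = C h_t.
    One pass over the sequence, so cost is O(L) in sequence length L."""
    L = x.shape[0]
    h = np.zeros(A.shape[0])
    y = np.empty((L, C.shape[0]))
    for t in range(L):
        h = A @ h + B @ x[t]      # compact hidden state carries all history
        y[t] = C @ h              # each position reads the aggregated state
    return y

L, d_in, d_state, d_out = 16, 4, 8, 4
rng = np.random.default_rng(0)
x = rng.normal(size=(L, d_in))
A = 0.9 * np.eye(d_state)         # stable recurrence (spectral radius < 1)
B = rng.normal(size=(d_state, d_in))
C = rng.normal(size=(d_out, d_state))
y = ssm_scan(x, A, B, C)
print(y.shape)  # (16, 4)
```

A transformer would instead form an L-by-L attention matrix at each layer, which is where the O(L²) cost arises.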
Title: SSA-Mamba: Spatial-Spectral Attentive State Space Model for Hyperspectral Image Classification
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 6403–6424
Pub Date: 2026-01-14 | DOI: 10.1109/JSTARS.2026.3654241
Yuan Li;Tianzhu Zhang;Ziyi Xiong;Junying Lv;Yinning Pang
Detecting three-dimensional (3-D) windows is vital for creating semantic building models with a high level of detail, supporting smart city and digital twin programs. Existing studies on window extraction using street imagery or laser scanning data often rely on limited types of features, resulting in compromised accuracy and completeness due to shadows and geometric decorations caused by curtains, balconies, plants, and other objects. To enhance the effectiveness and robustness of building window extraction in 3-D, this article proposes an automatic method that leverages synergistic information from multiview-stereo (MVS) point clouds through an adaptive divide-and-combine pipeline. Color information inherited from the imagery serves as a main clue to acquire the point clouds of individual building façades that may be coplanar and connected. The geometric information associated with normal vectors is then combined with color to adaptively divide each building façade into an irregular grid that conforms to the window edges. Subsequently, HSV color and depth distances within each grid cell are computed, and the grid cells are encoded to quantify the global arrangement features of windows. Finally, the multitype features are fused in an integer programming model, the solution of which yields the optimal combination of grid cells corresponding to windows. Benefiting from the informative MVS point clouds and the fusion of multitype features, our method is able to directly produce 3-D models with high regularity for buildings with different appearances. Experimental results demonstrate that the proposed method is effective in 3-D window extraction while overcoming variations in façade appearances caused by foreign objects and missing data, with a high point-wise precision of 92.7%, recall of 77.09%, IoU of 71.95%, and F1-score of 83.42%. The results also exhibit a high level of integrity, with the accuracy of correctly extracted windows reaching 89.81%.
In the future, we will focus on the development of a more universal façade dividing method to deal with even more complicated windows.
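The per-cell fusion step can be caricatured with a toy sketch. The feature values and weights below are invented for illustration, and a simple threshold stands in for the paper's integer programming model:

```python
import numpy as np

# Hypothetical per-cell features: depth offset from the facade plane and
# HSV color distance to the dominant wall color (values are illustrative).
depth_dist = np.array([0.02, 0.35, 0.40, 0.03, 0.31, 0.05])
color_dist = np.array([0.10, 0.60, 0.55, 0.08, 0.70, 0.12])

# Fuse the two cues into one window score per grid cell (weights assumed).
w_depth, w_color = 0.5, 0.5
score = w_depth * depth_dist / depth_dist.max() \
      + w_color * color_dist / color_dist.max()

# Thresholded selection standing in for the optimal integer-program solution.
window_cells = np.flatnonzero(score > 0.5)
print(window_cells)  # [1 2 4]
```

The actual method additionally encodes the global arrangement of cells, so that isolated high-scoring cells inconsistent with the window grid pattern can be rejected.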
Title: Automated Extraction of 3-D Windows From MVS Point Clouds by Comprehensive Fusion of Multitype Features
IEEE JSTARS, vol. 19, pp. 4918–4934
Pub Date: 2026-01-14 | DOI: 10.1109/JSTARS.2026.3654195
Daniel Carcereri;Luca Dell’Amore;Stefano Tebaldini;Paola Rizzoli
The increasing use of artificial intelligence (AI) models in Earth Observation (EO) applications, such as forest height estimation, has led to a growing need for explainable AI (XAI) methods. Despite their high accuracy, AI models are often criticized for their “black-box” nature, making it difficult to understand the inner decision-making process. In this study, we propose a multifaceted approach to XAI for a convolutional neural network (CNN)-based model that estimates forest height from TanDEM-X single-pass InSAR data. By combining domain knowledge, saliency maps, and feature importance analysis through exhaustive model permutations, we provide a comprehensive investigation of the network working principles. Our results suggest that the proposed model is implicitly capable of recognizing and compensating for SAR acquisition geometry-related distortions. We find that the mean phase center height and its local variability represent the most informative predictors. We also find evidence that the interferometric coherence and the backscatter maps capture complementary but equally relevant views of the vegetation. This work contributes to advancing the understanding of the model’s inner workings, and targets the development of more transparent and trustworthy AI for EO applications, ultimately leading to improved accuracy and reliability in the estimation of forest parameters.
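Permutation-based feature importance, one of the XAI ingredients the study combines, can be sketched generically: shuffle one input feature at a time and measure how much the model's score drops. This is a toy illustration, not the authors' exact permutation protocol:

```python
import numpy as np

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Score drop when one feature column is shuffled; a large drop
    marks an informative predictor. Score here is negative MSE."""
    rng = np.random.default_rng(seed)
    base = -np.mean((model(X) - y) ** 2)
    imps = []
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])        # destroy feature j only
            drops.append(base + np.mean((model(Xp) - y) ** 2))
        imps.append(np.mean(drops))
    return np.array(imps)

# Toy regression where only feature 0 carries signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0]
model = lambda X: 2.0 * X[:, 0]          # stand-in for a trained model
imp = permutation_importance(model, X, y)
print(imp.argmax())  # 0
```

Shuffling the uninformative columns leaves the predictions unchanged, so their importance is exactly zero in this toy case.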
Title: Insights on the Working Principles of a CNN for Forest Height Regression From Single-Pass InSAR Data
IEEE JSTARS, vol. 19, pp. 4809–4824
Pub Date: 2026-01-14 | DOI: 10.1109/JSTARS.2026.3654017
Ali Al Bataineh;Bandi Vamsi;Scott Alan Smith
Accurate prediction of the water quality index is essential for protecting public health and managing freshwater resources. Existing models often rely on arbitrary weight initialization and make limited use of ensemble learning, which results in unstable performance and reduced interpretability. This study introduces a hybrid machine learning framework that combines feature-informed neural network initialization with gradient boosting (XGBoost) to address these limitations. Neural network weights are initialized using feature significance scores derived from SHapley Additive exPlanations (SHAP) and predictions are iteratively refined using XGBoost. The model was trained and evaluated using the public quality of freshwater dataset and compared against several baselines, including random forest, support vector regression, a conventional artificial neural network with Xavier initialization, and an XGBoost-only model. Our framework achieved an accuracy of 86.9%, an F1-score of 0.849, and a receiver operating characteristic–area under the curve of 0.894, outperforming all comparative methods. Ablation experiments showed that both the SHAP-based initialization and the boosting component each improved performance over simpler baselines.
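One plausible reading of feature-informed initialization is to modulate a standard Xavier-style first-layer init by normalized importance scores. The scheme and the SHAP values below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def importance_scaled_init(n_in, n_hidden, importance, seed=0):
    """Xavier-style uniform init whose per-feature scale is modulated by
    externally computed importance scores (e.g., mean |SHAP| per feature)."""
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (n_in + n_hidden))
    W = rng.uniform(-limit, limit, size=(n_hidden, n_in))
    scale = importance / importance.sum()   # normalize to a distribution
    return W * (n_in * scale)               # up-weight informative inputs

shap_scores = np.array([0.50, 0.30, 0.15, 0.05])  # assumed mean |SHAP| values
W0 = importance_scaled_init(4, 8, shap_scores)
print(W0.shape)  # (8, 4)
```

The multiplication by `n_in * scale` keeps the average magnitude near the Xavier baseline while giving high-importance inputs proportionally larger initial weights.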
Title: A Hybrid Machine Learning Framework for Water Quality Index Prediction Using Feature-Based Neural Network Initialization
IEEE JSTARS, vol. 19, pp. 4887–4905
Pub Date: 2026-01-13 | DOI: 10.1109/JSTARS.2026.3653626
Xiaole Lin;Guangping Li;Jiahua Xie;Zhuokun Zhi
While convolutional neural network (CNN)-based methods for small object detection in remote sensing imagery have advanced considerably, substantial challenges remain unresolved, primarily stemming from complex backgrounds and insufficient feature representation. To address these issues, we propose a novel architecture specifically designed to accommodate the unique demands of small objects, termed AMFC-DEIM. This framework introduces three key innovations: first, the adaptive one-to-one (O2O) matching mechanism, which enhances dense O2O matching by adaptively adjusting the matching grid configuration to the object distribution, thereby preserving the resolution of small objects throughout training; second, the focal convolution module, engineered to explicitly align with the spatial characteristics of small objects for extracting fine-grained features; and third, the enhanced normalized Wasserstein distance, which stabilizes the training process and bolsters performance on small targets. Comprehensive experiments conducted on three benchmark remote sensing small object detection datasets (RSOD, LEVIR-SHIP, and NWPU VHR-10) demonstrate that AMFC-DEIM achieves remarkable performance, attaining AP$_{50}$ scores of 96.2%, 86.2%, and 95.1%, respectively, while maintaining only 5.27 M parameters. These results substantially outperform several established benchmark models and state-of-the-art methods.
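The baseline normalized Wasserstein distance that such methods build on models each box as a 2-D Gaussian, making the metric smooth for tiny boxes where IoU collapses to zero. A sketch of the standard form (the paper's enhanced variant may differ; the constant `c` is dataset-dependent):

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between boxes given as (cx, cy, w, h),
    each modeled as a 2-D Gaussian N([cx, cy], diag(w/2, h/2)**2)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    w2 = (ax - bx) ** 2 + (ay - by) ** 2 \
       + ((aw - bw) / 2) ** 2 + ((ah - bh) / 2) ** 2
    return math.exp(-math.sqrt(w2) / c)

# A 4-pixel offset between two 8x8 boxes leaves NWD high,
# even though their IoU is already severely degraded.
print(nwd((10, 10, 8, 8), (14, 10, 8, 8)))
```

Identical boxes give NWD = 1, and the value decays smoothly with center offset and size mismatch, which is what stabilizes matching and loss gradients for small targets.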
Title: AMFC-DEIM: Improved DEIM With Adaptive Matching and Focal Convolution for Remote Sensing Small Object Detection
IEEE JSTARS, vol. 19, pp. 5021–5034
Pub Date: 2026-01-13 | DOI: 10.1109/JSTARS.2026.3653676
Jiapeng Huang;Yue Zhang;Xiaozhu Yang;Fan Mo
Forest canopy height is a critical structural parameter for accurately assessing forest carbon storage. This study integrates Global Ecosystem Dynamics Investigation (GEDI) LiDAR data with multisource remote sensing features to construct a multidimensional feature space comprising 13 parameters. By employing high-dimensional feature vectors of “spatial coordinates + environmental features,” the proposed deep learning-based neural network-guided interpolation (NNGI) model effectively harnesses the capacity of deep learning to model complex nonlinear relationships and adaptively extract local features. This method adopts a dual-network collaborative architecture to dynamically learn interpolation weights based on environmental similarity in the feature space, rather than relying on fixed parameters or merely considering spatial distance, thereby effectively fusing the complex nonlinear relationship modeling capability of deep learning with the concept of spatial interpolation. Experiments conducted across five representative regions in the United States demonstrate that the overall accuracy of the NNGI model significantly outperforms traditional machine learning methods: Pearson correlation coefficient (r) = 0.79, root-mean-square error (RMSE) = 5.38 m, mean absolute error = 4.04 m, and bias = –0.15 m. In areas with low (0%–20%) and high (61%–80%) vegetation cover fractions, the RMSE decreased by 37.52% and 5.37%, respectively, while the r-value increased by 15.87% and 35.90%, respectively. Regarding different slope aspects, the RMSE for southeastern and western slopes decreased by 30.38% and 18.70%, respectively. This study provides a more reliable solution for the accurate estimation of forest structural parameters in complex environments.
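The core idea, weighting interpolation by environmental similarity rather than geographic distance alone, can be sketched with a fixed softmax kernel over feature-space distances. NNGI learns these weights with a dual-network architecture; the features, heights, and temperature below are invented for illustration:

```python
import numpy as np

def similarity_weights(query_feat, ref_feats, tau=1.0):
    """Interpolation weights from feature-space similarity:
    softmax over negative distances to the reference footprints."""
    d = np.linalg.norm(ref_feats - query_feat, axis=1)
    w = np.exp(-d / tau)
    return w / w.sum()

# Three hypothetical GEDI footprints with environmental feature vectors
# and measured canopy heights.
ref_feats = np.array([[0.20, 0.10],
                      [0.80, 0.90],
                      [0.25, 0.15]])
ref_heights = np.array([12.0, 30.0, 13.0])

# A query pixel environmentally similar to footprints 0 and 2.
w = similarity_weights(np.array([0.22, 0.12]), ref_feats)
height = float(w @ ref_heights)
print(round(height, 2))
```

The environmentally dissimilar footprint receives the smallest weight, so the interpolated height stays close to the two similar references even if the dissimilar one were geographically nearest.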
Title: A Deep Learning-Based Model for Forest Canopy Height Mapping Using Multisource Remote Sensing Data
IEEE JSTARS, vol. 19, pp. 4842–4857
Pub Date: 2026-01-12 | DOI: 10.1109/JSTARS.2026.3653452
Yao Xiao;Dianwei Shao;Suhui Wu;Yu Cai;Haili Li;Lichao Zhuang;Yuyue Xu;Yubin Fan;Chang-Qing Ke
Heavy rainfall in June 2024 caused a dramatic expansion of East Dongting Lake, located in northeastern Hunan Province, central China, and a breach occurred at Tuanzhouyuan within the lake region on 5th July. Optical remote sensing, synthetic aperture radar (SAR), and satellite altimetry provided essential data on inundation and water level changes. Using bitemporal Sentinel-1 SAR data, this study constructed a water body change detection dataset and applied the MambaBCD change detection model. The results showed that MambaBCD, based on state space models, achieved superior performance, with an F1 score of 91.9% and a superior ability to identify boundaries and small change areas. The inundation extent of East Dongting Lake from April to August 2024 was mapped using the MambaBCD model and bitemporal Sentinel-1 imagery. A sharp increase in inundation was observed in late June, with the water body expanding to 1142.4 ± 98 km² by 4th July. In late July, the water body area began to decrease rapidly. In addition, the latest radar altimeter, Surface Water and Ocean Topography (SWOT), surpassed Sentinel-3 in monitoring water levels, capturing a peak of 34 m in early July during this flood event, with levels returning to normal by late August. This flooding event, driven by the heavy rainfall, affected over 600 km² of cropland, with 95% of the buildings in Tuanzhouyuan being inundated, resulting in significant economic losses.
Title: Monitoring the 2024 Abrupt Flood Event in East Dongting Lake via Deep Learning and Multisource Remote Sensing Data
IEEE JSTARS, vol. 19, pp. 5602–5617
Pub Date : 2026-01-12DOI: 10.1109/JSTARS.2026.3651900
Enyu Zhao;Yu Shi;Nianxin Qu;Yulei Wang;Hang Zhao
Infrared small target detection focuses on accurately identifying tiny targets with low signal-to-noise ratios against complex backgrounds, a critical challenge in infrared image processing. Existing approaches frequently fail to retain small-target information during global semantic extraction and struggle to preserve detailed features and achieve effective feature fusion. To address these limitations, this article proposes a morphology-edge enhanced triple-cascaded network (MEETNet) for infrared small target detection. The network employs a triple-cascaded architecture that maintains high resolution and enhances information interaction between stages, facilitating effective multilevel feature fusion while safeguarding deep small-target characteristics. MEETNet integrates an edge-detail enhanced module (EDEM) and a detail-aware multiscale fusion module (DMSFM). These modules introduce edge-detail-enhanced features that combine contrast and edge information, thereby amplifying target saliency and improving edge representation. Specifically, EDEM augments target contrast and edge structures by integrating edge-detail-enhanced features with shallow details, improving the discriminative capacity of shallow features for detecting small targets. Moreover, DMSFM implements a multireceptive-field mechanism to merge target details with deep semantic information, enabling the capture of more distinctive global contextual features. Experimental evaluations on two public datasets—NUAA-SIRST and NUDT-SIRST—demonstrate that the proposed MEETNet surpasses existing state-of-the-art methods for infrared small target detection in terms of detection accuracy.
{"title":"MEETNet: Morphology-Edge Enhanced Triple-Cascaded Network for Infrared Small Target Detection","authors":"Enyu Zhao;Yu Shi;Nianxin Qu;Yulei Wang;Hang Zhao","doi":"10.1109/JSTARS.2026.3651900","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3651900","url":null,"abstract":"Infrared small target detection is focused on accurately identifying tiny targets with low signal-to-noise ratio against complex backgrounds, representing a critical challenge in the field of infrared image processing. Existing approaches frequently fail to retain small target information during global semantic extraction and struggle with preserving detailed features and achieving effective feature fusion. To address these limitations, this article proposes a morphology-edge enhanced triple-cascaded network (MEETNet) for infrared small target detection. The network employs a triple-cascaded architecture that maintains high resolution and enhances information interaction between different stages, facilitating effective multilevel feature fusion while safeguarding deep small-target characteristics. MEETNet integrates an edge-detail enhanced module (EDEM) and a detail-aware multi-scale fusion module (DMSFM). These modules introduce edge-detail enhanced features that amalgamate contrast and edge information, thereby amplifying target saliency and improving edge representation. Specifically, EDEM augments target contrast and edge structures by integrating edge-detail-enhanced features with shallow details. This integration improves the discriminability capacity of shallow features for detecting small targets. Moreover, DMSFM implements a multireceptive field mechanism to merge target details with deep semantic insights, enabling the capture of more distinctive global contextual features. 
Experimental evaluations conducted using two public datasets—NUAA-SIRST and NUDT-SIRST—demonstrate that the proposed MEETNet surpasses existing state-of-the-art methods for infrared small target detection in terms of detection accuracy.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"4748-4765"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11340625","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146082064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
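The edge-detail enhancement in MEETNet is described only at a high level above. A hedged sketch of the general idea, boosting a feature map with its own gradient magnitude so that small-target edges gain saliency, might look as follows (the Sobel filtering and the `alpha` weight are assumptions for illustration, not the paper's EDEM):

```python
import numpy as np

def sobel_edges(img: np.ndarray) -> np.ndarray:
    """Gradient magnitude via 3x3 Sobel kernels, edge-padded borders."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)  # horizontal gradient
            gy[i, j] = np.sum(patch * ky)  # vertical gradient
    return np.hypot(gx, gy)

def edge_detail_enhance(feat: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Add scaled edge magnitude back onto the feature map."""
    return feat + alpha * sobel_edges(feat)
```

On a flat region the gradient is zero and the features pass through unchanged; around an isolated bright pixel the edge term raises the local response.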
Deep forest-based models for synthetic aperture radar (SAR) image change detection are generally challenged by noise sensitivity and high feature redundancy, which significantly degrade prediction performance. To address these issues, this article proposes a structure-constrained and feature-screened deep forest, abbreviated as SC-FS-DF, for SAR image change detection. In preclassification, a fuzzy multineighborhood information C-means clustering is proposed to generate high-quality pseudo-labels. It introduces edge information and nonlocal and intrasuperpixel neighborhoods into the objective function of fuzzy local information C-means, thereby suppressing speckle noise and constraining target structures. In the sample learning and label prediction module, a feature-screened deep forest (FS-DF) framework is constructed by combining feature importance and redundancy analysis with a dropout strategy, screening out noninformative features while retaining informative ones for learning at each cascade layer. Finally, a novel energy function fusing nonlocal and superpixel information is derived to refine the detection map generated by FS-DF, further preserving fine details and edge locations. Extensive comparison and ablation experiments on five real SAR datasets verify the effectiveness and robustness of the proposed SC-FS-DF and demonstrate that it effectively screens high-dimensional features and constrains target structures in change detection.
{"title":"Feature-Screened and Structure-Constrained Deep Forest for Unsupervised SAR Image Change Detection","authors":"Wanying Song;Ruijing Zhu;Jie Wang;Yinyin Jiang;Yan Wu","doi":"10.1109/JSTARS.2026.3651534","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3651534","url":null,"abstract":"Deep forest-based models for synthetic aperture radar (SAR) image change detection are generally challenged by noise sensitivity and high feature redundancy, which significantly degrade the prediction performance. To address these issues, this article proposes a structure-constrained and feature-screened deep forest, abbreviated as SC-FS-DF, for SAR image change detection. In preclassification, a fuzzy multineighborhood information C-means clustering is proposed to generate high-quality pseudo-labels. It introduces the edge information, the nonlocal and intrasuperpixel neighborhoods into the objective function of fuzzy local information C-means, thus suppressing the speckle noise and constraining structures of targets. In the sample learning and label prediction module, a feature-screened deep forest (FS-DF) framework is constructed by combining feature importance and redundancy analysis with a dropout strategy, thus screening out the noninformative features and meanwhile retaining the informative ones for learning at each cascade layer. Finally, a novel energy function fusing the nonlocal and superpixel information is derived for refining the detection map generated by FS-DF, further preserving fine details and edge locations. 
Extensive comparison and ablation experiments on five real SAR datasets verify the effectiveness and robustness of the proposed SC-FS-DF, and demonstrate that the SC-FS-DF can well screen the high-dimensional features in change detection and constrain the structures of targets.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"4056-4068"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11339914","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
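The feature screening step (importance combined with redundancy analysis) can be illustrated with a simple greedy scheme: keep features in descending importance and drop any feature highly correlated with one already kept. This is a sketch under assumed details; the correlation threshold and the name `screen_features` are not from the article.

```python
import numpy as np

def screen_features(X: np.ndarray, importance: np.ndarray,
                    corr_thresh: float = 0.95) -> list:
    """Greedy screen: visit features in descending importance and skip any
    feature whose |correlation| with an already-kept feature is too high."""
    order = np.argsort(importance)[::-1]
    kept = []
    for idx in order:
        redundant = any(
            abs(np.corrcoef(X[:, idx], X[:, k])[0, 1]) > corr_thresh
            for k in kept
        )
        if not redundant:
            kept.append(int(idx))
    return kept
```

With a duplicated column, only the more important copy survives, while an uncorrelated feature is retained regardless of its rank.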
Pub Date : 2026-01-12DOI: 10.1109/JSTARS.2026.3651075
Yu Yao;Hengbin Wang;Xiang Gao;Ziyao Xing;Xiaodong Zhang;Yuanyuan Zhao;Shaoming Li;Zhe Liu
High-resolution remote sensing images provide crucial data support for applications such as precision agriculture and water resource management. However, super-resolution reconstructions often suffer from over-smoothed textures and structural distortions, failing to accurately recover the intricate details of ground objects. To address this issue, this article proposes a remote sensing image super-resolution network (DTWSTSR) that combines the Dual-Tree Complex Wavelet Transform and the Swin Transformer, enhancing texture detail reconstruction by fusing frequency-domain and spatial-domain features. The model includes a Dual-Tree Complex Wavelet Texture Feature Sensing Module (DWTFSM) for integrating frequency and spatial features, and a Multiscale Efficient Channel Attention mechanism that strengthens attention to multiscale and global details. In addition, we design a Kolmogorov–Arnold Network based on a branch attention mechanism, which improves the model’s ability to represent complex nonlinear features. During training, we investigate the impact of hyperparameters and propose a two-stage SSIM&SL1 loss function to reduce structural differences between images. Experimental results show that DTWSTSR outperforms existing mainstream methods at different magnification factors (×2, ×3, ×4), ranking among the top two in multiple metrics; for example, at ×2 magnification, its PSNR is 0.64–2.68 dB higher than that of the other models. Visual comparisons demonstrate that the proposed model achieves clearer and more accurate detail reconstruction of target ground objects. Furthermore, the model exhibits excellent generalization ability in cross-sensor image reconstruction (OLI2MSI dataset).
{"title":"DTWSTSR: Dual-Tree Complex Wavelet and Swin Transformer Based Remote Sensing Images Super-Resolution Network","authors":"Yu Yao;Hengbin Wang;Xiang Gao;Ziyao Xing;Xiaodong Zhang;Yuanyuan Zhao;Shaoming Li;Zhe Liu","doi":"10.1109/JSTARS.2026.3651075","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3651075","url":null,"abstract":"High-resolution remote sensing images provide crucial data support for applications such as precision agriculture and water resource management. However, super-resolution reconstructions often suffer from over-smoothed textures and structural distortions, failing to accurately recover the intricate details of ground objects. To address this issue, this article proposes a remote sensing image super-resolution network (DTWSTSR) that combines the Dual-Tree Complex Wavelet Transform and Swin Transformer, which enhances the ability of texture detail reconstruction by fusing frequency-domain and spatial-domain features. This model includes a Dual-Tree Complex Wavelet Texture Feature Sensing Module (DWTFSM) for integrating frequency and spatial features, and a Multiscale Efficient Channel Attention mechanism to enhance attention to multiscale and global details. In addition, we design a Kolmogorov–Arnold Network based on a branch attention mechanism, which improves the model’s ability to represent complex nonlinear features. During the training process, we investigate the impact of hyperparameters and propose the two-stage SSIM&SL1 loss function to reduce structural differences between images. Experimental results show that DTWSTSR outperforms existing mainstream methods under different magnification factors (×2, ×3, ×4), ranking among the top two in multiple metrics. For example, at ×2 magnification, its PSNR value is 0.64–2.68 dB higher than that of other models. Visual comparisons demonstrate that the proposed model achieves clearer and more accurate detail reconstruction of target ground objects. 
Furthermore, the model exhibits excellent generalization ability in cross-sensor image (OLI2MSI dataset) reconstruction.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"4730-4747"},"PeriodicalIF":5.3,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11329193","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146082033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
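The two-stage SSIM&SL1 loss is not fully specified above. One plausible minimal form, a Smooth L1 stage followed by a stage that adds an SSIM term, is sketched below; the single-window SSIM (instead of the usual sliding-window version), the weight `w`, and all function names are assumptions.

```python
import numpy as np

def ssim_global(x: np.ndarray, y: np.ndarray,
                c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> float:
    """Simplified SSIM with a single window over the whole image."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def smooth_l1(x: np.ndarray, y: np.ndarray, beta: float = 1.0) -> float:
    """Smooth L1 (Huber-style) loss: quadratic near zero, linear beyond beta."""
    d = np.abs(x - y)
    return np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta).mean()

def two_stage_loss(pred: np.ndarray, target: np.ndarray,
                   stage: int, w: float = 0.2) -> float:
    """Stage 1: Smooth L1 only; stage 2: add a structural (1 - SSIM) term."""
    loss = smooth_l1(pred, target)
    if stage == 2:
        loss = loss + w * (1.0 - ssim_global(pred, target))
    return loss
```

For identical images both stages give zero loss, and any pixel-wise offset raises the stage-1 term; the stage-2 SSIM term additionally penalizes structural mismatch rather than raw intensity differences alone.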