
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing: Latest Publications

SSA-Mamba: Spatial-Spectral Attentive State Space Model for Hyperspectral Image Classification
IF 5.3 CAS Zone 2 (Earth Science) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-15 DOI: 10.1109/JSTARS.2026.3654346 Vol. 19, pp. 6403-6424
Jianshang Liao;Liguo Wang
Hyperspectral image (HSI) classification faces critical challenges in effectively modeling long-range dependencies while maintaining computational efficiency and synergistically exploiting spatial-spectral information. Convolutional neural networks (CNNs) are constrained by local receptive fields, transformers suffer from quadratic computational complexity, and existing state space model (SSM)-based methods lack sophisticated cross-domain interaction mechanisms. This article proposes Spatial-Spectral Attentive Mamba (SSA-Mamba), a novel classification approach addressing these limitations through three synergistic innovations. First, a dual-branch independent modeling strategy allocates separate parameter spaces for spatial and spectral feature extraction via parallel SSMs, preventing feature coupling while enabling domain-specific learning. Second, an asymmetric cross-domain attention mechanism allows spatial features to actively query spectral information through multihead attention, establishing adaptive fusion via gating mechanisms and channel attention. Third, a multiscale residual architecture operating at module-internal, block-internal, and global pathway levels achieves hierarchical feature fusion while maintaining numerical stability through exponential parameterization. The recursive computation mechanism of SSMs enables each position to aggregate global historical information through compact hidden states, achieving O(L) linear complexity compared to transformers’ O(L²) quadratic complexity. Extensive experiments on three benchmark datasets—Houston2013, WHU-Hi-HongHu, and XiongAn—validate the effectiveness of these innovations. SSA-Mamba achieves overall accuracies of 93.98%, 93.58%, and 96.06%, surpassing state-of-the-art approaches by 1.27%, 0.25%, and 1.27%, respectively. The dual-branch design enables effective discrimination of spectrally similar categories, improving Brassica variety classification by 19.21–23.33 percentage points over coupled-feature approaches. The cross-domain attention mechanism enhances urban land cover classification, with Commercial and Highway categories improving by 1.74% and 15.66%. On the large-scale XiongAn dataset (5.92 million pixels), SSA-Mamba demonstrates exceptional scalability with peak GPU memory of only 317.89 MB and per-sample inference time of 0.646 ms, providing an efficient solution for real-time HSI processing. The source code for SSA-Mamba will be made publicly available online.
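The efficiency claim above rests on the recurrent form of state space models: a compact hidden state is updated once per position, so global context accumulates in a single O(L) pass instead of through O(L²) pairwise attention. The sketch below is a minimal, generic diagonal SSM scan, not the authors' SSA-Mamba code; all dimensions, parameter names, and values are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): a diagonal linear state-space scan
# showing how one recurrent hidden state aggregates global context in O(L).
import numpy as np

def ssm_scan(x, A, B, C):
    """x: (L, d_in) token sequence; A: (d_state,) per-dimension decay;
    B: (d_state, d_in); C: (d_out, d_state). Returns outputs of shape (L, d_out)."""
    L, _ = x.shape
    h = np.zeros(A.shape[0])          # compact hidden state
    y = np.zeros((L, C.shape[0]))
    for t in range(L):                # single pass over the sequence: O(L)
        h = A * h + B @ x[t]          # fold the current token into the state
        y[t] = C @ h                  # each position reads the accumulated history
    return y

rng = np.random.default_rng(0)
L, d_in, d_state, d_out = 64, 8, 16, 8
y = ssm_scan(rng.normal(size=(L, d_in)),
             A=np.full(d_state, 0.9),             # stable decay < 1 (assumed)
             B=rng.normal(size=(d_state, d_in)) * 0.1,
             C=rng.normal(size=(d_out, d_state)) * 0.1)
print(y.shape)  # (64, 8)
```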
Citations: 0
Automated Extraction of 3-D Windows From MVS Point Clouds by Comprehensive Fusion of Multitype Features
IF 5.3 CAS Zone 2 (Earth Science) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-14 DOI: 10.1109/JSTARS.2026.3654241 Vol. 19, pp. 4918-4934
Yuan Li;Tianzhu Zhang;Ziyi Xiong;Junying Lv;Yinning Pang
Detecting three-dimensional (3-D) windows is vital for creating semantic building models with a high level of detail, supporting smart city and digital twin programs. Existing studies on window extraction using street imagery or laser scanning data often rely on limited types of features, resulting in compromised accuracy and completeness due to shadows and geometric decorations caused by curtains, balconies, plants, and other objects. To enhance the effectiveness and robustness of building window extraction in 3-D, this article proposes an automatic method that leverages synergistic information from multiview-stereo (MVS) point clouds, through an adaptive divide-and-combine pipeline. Color information inherited from the imagery serves as a main clue to acquire the point clouds of individual building façades that may be coplanar and connected. The geometric information associated with normal vectors is then combined with color, to adaptively divide individual building façade into an irregular grid that conforms to the window edges. Subsequently, HSV color and depth distances within each grid cell are computed, and the grid cells are encoded to quantify the global arrangement features of windows. Finally, the multitype features are fused in an integer programming model, by solving which the optimal combination of grid cells corresponding to windows is obtained. Benefitting from the informative MVS point clouds and the fusion of multitype features, our method is able to directly produce 3-D models with high regularity for buildings with different appearances. Experimental results demonstrate that the proposed method is effective in 3-D window extraction while overcoming variations in façade appearances caused by foreign objects and missing data, with a high point-wise precision of 92.7%, recall of 77.09%, IoU of 71.95%, and F1-score of 83.42%. The results also exhibit a high level of integrity, with the accuracy of correctly extracted windows reaching 89.81%. In the future, we will focus on the development of a more universal façade dividing method to deal with even more complicated windows.
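As a rough illustration of the kind of per-cell cues the pipeline fuses, the sketch below computes the mean HSV color and mean depth for each cell of a rasterized façade grid and measures how far each cell sits from the façade-wide median (recessed, darker cells are window candidates). It is not the paper's implementation; the grid layout, feature choice, and raster inputs are assumptions, and the integer-programming selection step is omitted.

```python
# Illustrative sketch only: per-grid-cell HSV color and depth statistics on a
# rasterized facade, the kind of cues fused before selecting window cells.
import numpy as np
from matplotlib.colors import rgb_to_hsv

def cell_features(rgb, depth, rows, cols):
    """rgb: (H, W, 3) in [0, 1]; depth: (H, W) offset from the facade plane.
    Returns (rows*cols, 4): mean HSV plus mean depth per cell."""
    H, W, _ = rgb.shape
    hsv = rgb_to_hsv(rgb)
    feats = []
    for r in np.array_split(np.arange(H), rows):
        for c in np.array_split(np.arange(W), cols):
            cell_hsv = hsv[np.ix_(r, c)].reshape(-1, 3).mean(axis=0)
            cell_depth = depth[np.ix_(r, c)].mean()
            feats.append(np.concatenate([cell_hsv, [cell_depth]]))
    return np.asarray(feats)

def distance_to_facade(feats):
    """Distance of each cell from the facade-wide median feature vector;
    large distances flag window candidates (darker, recessed cells)."""
    med = np.median(feats, axis=0)
    return np.linalg.norm(feats - med, axis=1)

rgb = np.random.rand(120, 160, 3)            # toy facade raster (assumed inputs)
depth = np.random.randn(120, 160) * 0.05
d = distance_to_facade(cell_features(rgb, depth, rows=6, cols=8))
print(d.shape)  # (48,)
```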
Citations: 0
Insights on the Working Principles of a CNN for Forest Height Regression From Single-Pass InSAR Data
IF 5.3 CAS Zone 2 (Earth Science) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-14 DOI: 10.1109/JSTARS.2026.3654195 Vol. 19, pp. 4809-4824
Daniel Carcereri;Luca Dell’Amore;Stefano Tebaldini;Paola Rizzoli
The increasing use of artificial intelligence (AI) models in Earth Observation (EO) applications, such as forest height estimation, has led to a growing need for explainable AI (XAI) methods. Despite their high accuracy, AI models are often criticized for their “black-box” nature, making it difficult to understand the inner decision-making process. In this study, we propose a multifaceted approach to XAI for a convolutional neural network (CNN)-based model that estimates forest height from TanDEM-X single-pass InSAR data. By combining domain knowledge, saliency maps, and feature importance analysis through exhaustive model permutations, we provide a comprehensive investigation of the network working principles. Our results suggest that the proposed model is implicitly capable of recognizing and compensating for the SAR acquisition geometry-related distortions. We find that the mean phase center height and its local variability represent the most informative predictors. We also find evidence that the interferometric coherence and the backscatter maps capture complementary but equally relevant views of the vegetation. This work contributes to advance the understanding of the model’s inner workings, and targets the development of more transparent and trustworthy AI for EO applications, ultimately leading to improved accuracy and reliability in the estimation of forest parameters.
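The feature-importance analysis described above can be illustrated with a plain permutation test: shuffle one input feature at a time and record how much the regression error grows. The snippet below is a generic sketch with a toy model, not the authors' exhaustive model-permutation protocol; the model, data, and repeat count are assumptions.

```python
# Generic permutation-importance sketch: the error increase after destroying one
# input feature serves as that feature's importance for the regression model.
import numpy as np

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """model: callable mapping (N, C) -> (N,); X: (N, C); y: (N,)."""
    rng = np.random.default_rng(seed)
    base_rmse = np.sqrt(np.mean((model(X) - y) ** 2))
    importance = np.zeros(X.shape[1])
    for c in range(X.shape[1]):                   # one feature/channel at a time
        deltas = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, c] = rng.permutation(Xp[:, c])  # destroy only this feature
            rmse = np.sqrt(np.mean((model(Xp) - y) ** 2))
            deltas.append(rmse - base_rmse)       # error increase = importance
        importance[c] = np.mean(deltas)
    return importance

# toy usage: feature 0 carries the signal, feature 1 is pure noise
X = np.random.default_rng(1).normal(size=(500, 2))
y = 3.0 * X[:, 0]
print(permutation_importance(lambda Z: 3.0 * Z[:, 0], X, y))
```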
Citations: 0
A Hybrid Machine Learning Framework for Water Quality Index Prediction Using Feature-Based Neural Network Initialization
IF 5.3 CAS Zone 2 (Earth Science) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-14 DOI: 10.1109/JSTARS.2026.3654017 Vol. 19, pp. 4887-4905
Ali Al Bataineh;Bandi Vamsi;Scott Alan Smith
Accurate prediction of the water quality index is essential for protecting public health and managing freshwater resources. Existing models often rely on arbitrary weight initialization and make limited use of ensemble learning, which results in unstable performance and reduced interpretability. This study introduces a hybrid machine learning framework that combines feature-informed neural network initialization with gradient boosting (XGBoost) to address these limitations. Neural network weights are initialized using feature significance scores derived from SHapley Additive exPlanations (SHAP) and predictions are iteratively refined using XGBoost. The model was trained and evaluated using the public quality of freshwater dataset and compared against several baselines, including random forest, support vector regression, a conventional artificial neural network with Xavier initialization, and an XGBoost-only model. Our framework achieved an accuracy of 86.9%, an F1-score of 0.849, and a receiver operating characteristic–area under the curve of 0.894, outperforming all comparative methods. Ablation experiments showed that both the SHAP-based initialization and the boosting component each improved performance over simpler baselines.
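A minimal sketch of the feature-informed initialization idea follows: first-layer weights are drawn with a variance scaled by normalized per-feature importance scores, for instance mean |SHAP| values computed beforehand. The scaling rule, layer size, and example scores are assumptions rather than the authors' exact scheme, and the XGBoost refinement stage is omitted.

```python
# Hedged sketch: scale the variance of first-layer weights by normalized
# per-feature importance so informative inputs start with larger weights.
import numpy as np

def importance_scaled_init(importance, n_hidden, seed=0):
    """importance: (n_features,) nonnegative scores. Returns W of shape (n_features, n_hidden)."""
    rng = np.random.default_rng(seed)
    imp = np.asarray(importance, dtype=float)
    imp = imp / imp.sum()                                # normalize to a distribution
    fan_in = len(imp)
    base_std = np.sqrt(2.0 / fan_in)                     # He-style baseline std
    per_feature_std = base_std * np.sqrt(imp * fan_in)   # up-weight informative inputs
    return rng.normal(size=(fan_in, n_hidden)) * per_feature_std[:, None]

shap_scores = np.array([0.50, 0.25, 0.15, 0.10])   # assumed mean |SHAP| per feature
W1 = importance_scaled_init(shap_scores, n_hidden=32)
print(W1.std(axis=1))   # rows for high-importance features start with larger spread
```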
Citations: 0
AMFC-DEIM: Improved DEIM With Adaptive Matching and Focal Convolution for Remote Sensing Small Object Detection
IF 5.3 CAS Zone 2 (Earth Science) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-13 DOI: 10.1109/JSTARS.2026.3653626 Vol. 19, pp. 5021-5034
Xiaole Lin;Guangping Li;Jiahua Xie;Zhuokun Zhi
While convolutional neural network (CNN)-based methods for small object detection in remote sensing imagery have advanced considerably, substantial challenges remain unresolved, primarily stemming from complex backgrounds and insufficient feature representation. To address these issues, we propose a novel architecture specifically designed to accommodate the unique demands of small objects, termed AMFC-DEIM. This framework introduces three key innovations: first, the adaptive one-to-one (O2O) matching mechanism, which enhances dense O2O matching by adaptively adjusting the matching grid configuration to the object distribution, thereby preserving the resolution of small objects throughout training; second, the focal convolution module, engineered to explicitly align with the spatial characteristics of small objects for extracting fine-grained features; and third, the enhanced normalized Wasserstein distance, which stabilizes the training process and bolsters performance on small targets. Comprehensive experiments conducted on three benchmark remote sensing small object detection datasets: RSOD, LEVIR-SHIP and NWPU VHR-10, demonstrate that AMFC-DEIM achieves remarkable performance, attaining AP50 scores of 96.2%, 86.2%, and 95.1%, respectively, while maintaining only 5.27 M parameters. These results substantially outperform several established benchmark models and state-of-the-art methods.
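For context, the sketch below implements the commonly used normalized Wasserstein distance between axis-aligned boxes, in which each box (cx, cy, w, h) is modeled as a 2-D Gaussian and the Gaussian 2-Wasserstein distance is squashed with an exponential. Whether the paper's "enhanced" variant matches this formulation exactly is an assumption, and the constant C is dataset dependent.

```python
# Sketch of a normalized Wasserstein similarity between boxes; unlike IoU, it stays
# smooth and informative for tiny boxes with small positional offsets.
import numpy as np

def nwd(box_a, box_b, C=12.8):
    """box = (cx, cy, w, h). Returns a similarity in (0, 1]; 1 means identical boxes."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # squared 2-Wasserstein distance between N(center, diag(w^2/4, h^2/4)) Gaussians
    w2_sq = (cxa - cxb) ** 2 + (cya - cyb) ** 2 \
            + ((wa - wb) / 2.0) ** 2 + ((ha - hb) / 2.0) ** 2
    return np.exp(-np.sqrt(w2_sq) / C)

# two 4x4-pixel boxes offset by 2 pixels still receive a graded similarity
print(nwd((10, 10, 4, 4), (12, 10, 4, 4)))
```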
Citations: 0
A Deep Learning-Based Model for Forest Canopy Height Mapping Using Multisource Remote Sensing Data
IF 5.3 CAS Zone 2 (Earth Science) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-13 DOI: 10.1109/JSTARS.2026.3653676 Vol. 19, pp. 4842-4857
Jiapeng Huang;Yue Zhang;Xiaozhu Yang;Fan Mo
Forest canopy height is a critical structural parameter for accurately assessing forest carbon storage. This study integrates Global Ecosystem Dynamics Investigation (GEDI) LiDAR data with multisource remote sensing features to construct a multidimensional feature space comprising 13 parameters. By employing high-dimensional feature vectors of “spatial coordinates + environmental features,” the proposed deep learning-based neural network-guided interpolation (NNGI) model effectively harnesses the capacity of deep learning to model complex nonlinear relationships and adaptively extract local features. This method adopts a dual-network collaborative architecture to dynamically learn interpolation weights based on environmental similarity in the feature space, rather than relying on fixed parameters or merely considering spatial distance, thereby effectively fusing the complex nonlinear relationship modeling capability of deep learning with the concept of spatial interpolation. Experiments conducted across five representative regions in the United States demonstrate that the overall accuracy of the NNGI model significantly outperforms traditional machine learning methods, Pearson correlation coefficient (r) = 0.79, root-mean-square error (RMSE) = 5.38 m, mean absolute error = 4.04 m, bias = –0.15 m. In areas with low (0%–20%) and high (61%–80%) vegetation cover fractions, the RMSE decreased by 37.52% and 5.37%, respectively, while the r-value increased by 15.87% and 35.90%, respectively. Regarding different slope aspects, the RMSE for southeastern and western slopes decreased by 30.38% and 18.70%, respectively. This study provides a more reliable solution for the accurate estimation of forest structural parameters in complex environments.
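The interpolation idea of weighting reference observations by learned environmental similarity can be sketched as follows: a small network scores each (target, reference) feature pair and the reference canopy heights are blended with softmax weights. This is an illustrative single-network simplification of the dual-network design, not the authors' NNGI code; the feature dimension (13) matches the abstract, but the layer sizes and input shapes are assumptions.

```python
# Illustrative neural-network-guided interpolation: interpolation weights are
# learned from feature similarity instead of fixed spatial-distance kernels.
import torch
import torch.nn as nn

class NeuralInterpolator(nn.Module):
    def __init__(self, n_feat=13, hidden=64):
        super().__init__()
        # scores one (target, reference) pair from the concatenated feature vectors
        self.scorer = nn.Sequential(
            nn.Linear(2 * n_feat, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, target_feat, ref_feat, ref_height):
        """target_feat: (B, F); ref_feat: (B, K, F); ref_height: (B, K) -> (B,)."""
        B, K, F = ref_feat.shape
        pairs = torch.cat([target_feat.unsqueeze(1).expand(B, K, F), ref_feat], dim=-1)
        weights = torch.softmax(self.scorer(pairs).squeeze(-1), dim=1)  # (B, K)
        return (weights * ref_height).sum(dim=1)   # environment-aware weighted mean

model = NeuralInterpolator()
h = model(torch.randn(4, 13), torch.randn(4, 8, 13), torch.rand(4, 8) * 30)
print(h.shape)  # torch.Size([4])
```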
Citations: 0
Monitoring the 2024 Abrupt Flood Event in East Dongting Lake via Deep Learning and Multisource Remote Sensing Data
IF 5.3 CAS Zone 2 (Earth Science) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/JSTARS.2026.3653452 Vol. 19, pp. 5602-5617
Yao Xiao;Dianwei Shao;Suhui Wu;Yu Cai;Haili Li;Lichao Zhuang;Yuyue Xu;Yubin Fan;Chang-Qing Ke
Heavy rainfall in June 2024 caused a dramatic expansion of East Dongting Lake, located in northeastern Hunan Province, central China, and a breach occurred at Tuanzhouyuan within the lake region on 5th July. Optical remote sensing, synthetic aperture radar (SAR), and satellite altimetry provided essential data on inundation and water level changes. Using bitemporal Sentinel-1 SAR data, this study constructed a water body change detection dataset and applied the MambaBCD change detection models. The results showed that MambaBCD, based on state space models, showed superior performance, achieving an F1 score of 91.9% and demonstrates superior ability in identifying boundaries and small change areas. The inundation extent of East Dongting Lake from April to August 2024 was mapped using the MambaBCD model and bitemporal Sentinel-1 imagery. A sharp increase in inundation was observed in late June, with the water body expanding to 1142.4 ± 98 km² by 4th July. In late July, the water body area began to decrease rapidly. In addition, the latest radar altimetry mission, Surface Water and Ocean Topography (SWOT), surpassed Sentinel-3 in monitoring water levels, capturing a peak of 34 m in early July during this flood event, with levels returning to normal by late August. This flooding event was caused by heavy rainfall over 600 km² of cropland, with 95% of the buildings in Tuanzhouyuan being inundated, resulting in significant economic losses.
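The reported F1 score and related change-detection metrics follow the standard definitions for binary change masks; a small reference implementation is shown below (standard formulas, not tied to the MambaBCD codebase; the toy masks are illustrative).

```python
# Standard pixel-wise metrics for a predicted vs. reference binary change mask.
import numpy as np

def change_scores(pred, ref):
    """pred, ref: boolean arrays of the same shape (True = changed/flooded pixel)."""
    tp = np.logical_and(pred, ref).sum()
    fp = np.logical_and(pred, ~ref).sum()
    fn = np.logical_and(~pred, ref).sum()
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}

rng = np.random.default_rng(0)
ref = rng.random((256, 256)) > 0.8
pred = np.logical_xor(ref, rng.random((256, 256)) > 0.97)   # imperfect prediction
print(change_scores(pred, ref))
```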
Citations: 0
MEETNet: Morphology-Edge Enhanced Triple-Cascaded Network for Infrared Small Target Detection
IF 5.3 CAS Zone 2 (Earth Science) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/JSTARS.2026.3651900 Vol. 19, pp. 4748-4765
Enyu Zhao;Yu Shi;Nianxin Qu;Yulei Wang;Hang Zhao
Infrared small target detection is focused on accurately identifying tiny targets with low signal-to-noise ratio against complex backgrounds, representing a critical challenge in the field of infrared image processing. Existing approaches frequently fail to retain small target information during global semantic extraction and struggle with preserving detailed features and achieving effective feature fusion. To address these limitations, this article proposes a morphology-edge enhanced triple-cascaded network (MEETNet) for infrared small target detection. The network employs a triple-cascaded architecture that maintains high resolution and enhances information interaction between different stages, facilitating effective multilevel feature fusion while safeguarding deep small-target characteristics. MEETNet integrates an edge-detail enhanced module (EDEM) and a detail-aware multi-scale fusion module (DMSFM). These modules introduce edge-detail enhanced features that amalgamate contrast and edge information, thereby amplifying target saliency and improving edge representation. Specifically, EDEM augments target contrast and edge structures by integrating edge-detail-enhanced features with shallow details. This integration improves the discriminability capacity of shallow features for detecting small targets. Moreover, DMSFM implements a multireceptive field mechanism to merge target details with deep semantic insights, enabling the capture of more distinctive global contextual features. Experimental evaluations conducted using two public datasets—NUAA-SIRST and NUDT-SIRST—demonstrate that the proposed MEETNet surpasses existing state-of-the-art methods for infrared small target detection in terms of detection accuracy.
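In the spirit of the edge-detail enhancement described above, the sketch below uses fixed Sobel filters to build an edge-magnitude map and re-weights a shallow feature map with it, amplifying faint small-target boundaries. It is an illustrative stand-in, not the MEETNet module; the channel averaging and the re-weighting rule are assumptions.

```python
# Illustrative edge-enhanced feature: Sobel edge magnitude re-weights shallow features.
import torch
import torch.nn.functional as F

def edge_enhance(feat):
    """feat: (B, C, H, W) shallow feature map -> edge-reweighted feature map."""
    sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    sobel_y = sobel_x.t()
    kernel = torch.stack([sobel_x, sobel_y]).unsqueeze(1)          # (2, 1, 3, 3)
    gray = feat.mean(dim=1, keepdim=True)                          # collapse channels
    grad = F.conv2d(gray, kernel, padding=1)                       # (B, 2, H, W)
    edge = torch.sqrt((grad ** 2).sum(dim=1, keepdim=True) + 1e-6) # edge magnitude
    edge = edge / (edge.amax(dim=(2, 3), keepdim=True) + 1e-6)     # normalize to [0, 1]
    return feat * (1.0 + edge)                                     # amplify edge regions

x = torch.randn(2, 16, 64, 64)
print(edge_enhance(x).shape)   # torch.Size([2, 16, 64, 64])
```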
Citations: 0
Feature-Screened and Structure-Constrained Deep Forest for Unsupervised SAR Image Change Detection
IF 5.3 CAS Zone 2 (Earth Science) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/JSTARS.2026.3651534 Vol. 19, pp. 4056-4068
Wanying Song;Ruijing Zhu;Jie Wang;Yinyin Jiang;Yan Wu
Deep forest-based models for synthetic aperture radar (SAR) image change detection are generally challenged by noise sensitivity and high feature redundancy, which significantly degrade the prediction performance. To address these issues, this article proposes a structure-constrained and feature-screened deep forest, abbreviated as SC-FS-DF, for SAR image change detection. In preclassification, a fuzzy multineighborhood information C-means clustering is proposed to generate high-quality pseudo-labels. It introduces the edge information, the nonlocal and intrasuperpixel neighborhoods into the objective function of fuzzy local information C-means, thus suppressing the speckle noise and constraining structures of targets. In the sample learning and label prediction module, a feature-screened deep forest (FS-DF) framework is constructed by combining feature importance and redundancy analysis with a dropout strategy, thus screening out the noninformative features and meanwhile retaining the informative ones for learning at each cascade layer. Finally, a novel energy function fusing the nonlocal and superpixel information is derived for refining the detection map generated by FS-DF, further preserving fine details and edge locations. Extensive comparison and ablation experiments on five real SAR datasets verify the effectiveness and robustness of the proposed SC-FS-DF, and demonstrate that the SC-FS-DF can well screen the high-dimensional features in change detection and constrain the structures of targets.
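For reference, a plain fuzzy C-means iteration is sketched below; the preclassification method described above builds on this standard update by adding edge, nonlocal, and intrasuperpixel neighborhood terms to the objective, which are omitted here. Cluster count, fuzzifier, and toy data are assumptions.

```python
# Standard fuzzy C-means: alternate weighted-centroid and membership updates.
import numpy as np

def fcm(x, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """x: (N, D) pixel features. Returns memberships (N, n_clusters) and centers."""
    rng = np.random.default_rng(seed)
    u = rng.dirichlet(np.ones(n_clusters), size=len(x))        # fuzzy memberships
    for _ in range(n_iter):
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0)[:, None]         # weighted centroids
        d = np.linalg.norm(x[:, None, :] - centers[None], axis=2) + 1e-9
        u = 1.0 / (d ** (2.0 / (m - 1)))                       # inverse-distance update
        u /= u.sum(axis=1, keepdims=True)                      # rows sum to 1
    return u, centers

x = np.concatenate([np.random.normal(0, 1, (200, 1)), np.random.normal(5, 1, (200, 1))])
u, c = fcm(x)
print(c.ravel())   # two centers, near 0 and 5
```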
Citations: 0
DTWSTSR: Dual-Tree Complex Wavelet and Swin Transformer Based Remote Sensing Images Super-Resolution Network
IF 5.3 CAS Zone 2 (Earth Science) Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2026-01-12 DOI: 10.1109/JSTARS.2026.3651075 Vol. 19, pp. 4730-4747
Yu Yao;Hengbin Wang;Xiang Gao;Ziyao Xing;Xiaodong Zhang;Yuanyuan Zhao;Shaoming Li;Zhe Liu
High-resolution remote sensing images provide crucial data support for applications such as precision agriculture and water resource management. However, super-resolution reconstructions often suffer from over-smoothed textures and structural distortions, failing to accurately recover the intricate details of ground objects. To address this issue, this article proposes a remote sensing image super-resolution network (DTWSTSR) that combines the Dual-Tree Complex Wavelet Transform and Swin Transformer, which enhances the ability of texture detail reconstruction by fusing frequency-domain and spatial-domain features. This model includes a Dual-Tree Complex Wavelet Texture Feature Sensing Module (DWTFSM) for integrating frequency and spatial features, and a Multiscale Efficient Channel Attention mechanism to enhance attention to multiscale and global details. In addition, we design a Kolmogorov–Arnold Network based on a branch attention mechanism, which improves the model’s ability to represent complex nonlinear features. During the training process, we investigate the impact of hyperparameters and propose the two-stage SSIM&SL1 loss function to reduce structural differences between images. Experimental results show that DTWSTSR outperforms existing mainstream methods under different magnification factors (×2, ×3, ×4), ranking among the top two in multiple metrics. For example, at ×2 magnification, its PSNR value is 0.64–2.68 dB higher than that of other models. Visual comparisons demonstrate that the proposed model achieves clearer and more accurate detail reconstruction of target ground objects. Furthermore, the model exhibits excellent generalization ability in cross-sensor image (OLI2MSI dataset) reconstruction.
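The two-stage SSIM&SL1 objective mentioned above can be approximated, for illustration, by a weighted sum of a structural-similarity term and a pixel-wise Smooth-L1 term. The sketch below uses a simplified global SSIM (no sliding window) and an assumed weight alpha; the paper's exact two-stage schedule and weighting are not reproduced.

```python
# Hedged sketch of a combined SSIM + Smooth-L1 super-resolution loss.
import torch
import torch.nn.functional as F

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified whole-image SSIM for tensors in [0, 1]; x, y: (B, C, H, W)."""
    mu_x, mu_y = x.mean(dim=(1, 2, 3)), y.mean(dim=(1, 2, 3))
    var_x = x.var(dim=(1, 2, 3), unbiased=False)
    var_y = y.var(dim=(1, 2, 3), unbiased=False)
    cov = ((x - mu_x[:, None, None, None]) * (y - mu_y[:, None, None, None])).mean(dim=(1, 2, 3))
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

def sr_loss(pred, target, alpha=0.5):
    """alpha trades structural similarity against pixel-wise Smooth-L1 (assumed value)."""
    return alpha * (1.0 - global_ssim(pred, target).mean()) + \
           (1.0 - alpha) * F.smooth_l1_loss(pred, target)

pred = torch.rand(2, 3, 64, 64, requires_grad=True)
target = torch.rand(2, 3, 64, 64)
loss = sr_loss(pred, target)
loss.backward()
print(float(loss))
```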
Citations: 0