Accurate monitoring of wetland vegetation inundation is crucial for maintaining regional ecological balance and conserving biodiversity, and is a fundamental prerequisite for wetland environmental monitoring and protection. The complex scattering characteristics of vegetation under different inundation conditions, combined with spatial and seasonal heterogeneity, pose significant challenges to precise identification of vegetation inundation states. This study therefore proposes a novel approach, linear-exponential model, shapelets, and multirocket integration (LESMI), for monitoring the inundation state and temporal changes of wetland vegetation from radar backscatter variation patterns. First, a new linear-exponential model is developed to characterize the backscatter-water depth relationship and represent the inundation-state characteristics of wetland vegetation. Second, based on the typical inundation states of historical stages determined by the linear-exponential model, LESMI combines shapelets with multirocket classification to efficiently extract multivariate key-time-period features for inundation state identification and achieve large-scale, near-real-time inundation state classification. Experimental results in the Dongting Lake wetland show that the proposed method achieves inundation recognition accuracies of 96.84% for reeds and 92.59% for grassland, outperforming traditional methods and LSTM deep learning by average margins of 12.95% and 1.87%, respectively. The linear-exponential model significantly enhances identification performance, improving accuracy by 5.64% and 3.83% over linear and normal-distribution models, respectively.
Monitoring from 2019 to 2021 demonstrates that LESMI effectively captures flood peak impacts on vegetation inundation and provides detailed classification of noninundated, shallow inundated, and deep inundated states, offering reliable technical support for dynamic wetland ecosystem monitoring and refined management.
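The abstract does not give the linear-exponential model's closed form. As a toy illustration only, one can fit a generic linear-plus-exponential backscatter-depth curve with SciPy; the function `linexp`, its parameter values, and the synthetic data below are all hypothetical, not the published model:

```python
import numpy as np
from scipy.optimize import curve_fit

def linexp(d, a, b, c, k):
    """Hypothetical linear-exponential backscatter-depth curve: a linear trend
    plus an exponentially decaying term. The paper's actual parameterization
    is not given in the abstract."""
    return a + b * d + c * np.exp(-k * d)

rng = np.random.default_rng(0)
depth = np.linspace(0.0, 3.0, 60)                 # water depth (m), synthetic
sigma0 = linexp(depth, -18.0, 1.5, 6.0, 2.0)      # "true" backscatter (dB)
obs = sigma0 + rng.normal(0.0, 0.3, depth.size)   # noisy observations

popt, _ = curve_fit(linexp, depth, obs, p0=[-15.0, 1.0, 5.0, 1.0])
rmse = float(np.sqrt(np.mean((obs - linexp(depth, *popt)) ** 2)))
print(popt.round(2), rmse)                        # fitted params, RMSE near noise level
```

Once fitted per vegetation type, such a curve could map an observed backscatter time series to an inundation-depth regime before the shapelet/multirocket classification step.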
"LESMI: Integrating Linear-Exponential Model, Shapelets, and Multirocket for Wetland Vegetation Inundation Monitoring With Time Series SAR," Yuanye Cao; Xiuguo Liu; Yuannan Long; Hui Yang; Shixiong Yan; Qihao Chen. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 4242-4256. Pub Date: 2025-12-30. DOI: 10.1109/JSTARS.2025.3649200
Pub Date: 2025-12-30. DOI: 10.1109/JSTARS.2025.3649548
Xuan Liu;Lina Cai;Jiahua Li;Tianle Mao
This study presents a novel harmful algal bloom (HAB) inversion model (HABI) using domestic Chinese ocean color and temperature scanner (COCTS) multispectral data from the HY-1C/D satellites. The model achieves the dual capabilities of HAB presence detection and density quantification, a key advancement over conventional binary classification models that cannot delineate HAB density gradients. Key findings of this article include the following. 1) The HABI model uses spectral bands at 443, 490, and 565 nm, demonstrating superior performance in quantifying HAB density gradients compared to existing methods, and its design is adaptable to sensors with similar spectral configurations. 2) HABI achieved high inversion accuracy (R² = 0.8682, RMSE = 0.09195, Recall = 0.9300, Precision = 0.949, F1-score = 0.939), showing strong consistency with the Bulletin of China Marine Disaster and in situ HAB measurements in the waters near the Yangtze River Estuary. 3) The distribution of HABs exhibits clear temporal and spatial variability, with high-density clusters localized in coastal zones and peaking in spring/summer. The main seasonal drivers of HAB change are Yangtze River freshwater discharge and coastal upwelling, modulated by physical (e.g., sea surface temperature), anthropogenic (e.g., industrial wastewater), and biogeochemical (e.g., dissolved inorganic nitrogen) factors as well as biodiversity. These findings are conceptually integrated in Fig. 14, which synthesizes the model mechanics and spatio-temporal dynamics. The HABI algorithm proposed in this article can be effectively applied to HAB monitoring and quantification, providing technical support for near-shore ecological assessment and management.
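The published HABI formula is not given in the abstract; only the bands (443, 490, 565 nm) are named. The sketch below shows a generic green-to-blue band ratio as one common pattern for bloom indices built from such bands; the function `hab_ratio` and all reflectance values are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

def hab_ratio(r443, r490, r565):
    """Illustrative green-to-blue ratio from the three COCTS bands the
    abstract names. Dense algal blooms absorb strongly in the blue and
    reflect relatively more in the green, so the ratio tends to rise with
    bloom density. This is NOT the published HABI formula."""
    return np.asarray(r565) / (0.5 * (np.asarray(r443) + np.asarray(r490)))

# synthetic remote sensing reflectances (sr^-1): clear water vs. dense bloom
clear = hab_ratio(0.008, 0.007, 0.004)
bloom = hab_ratio(0.002, 0.003, 0.006)
print(round(float(clear), 3), round(float(bloom), 3))  # bloom pixel scores higher
```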
"Study on Harmful Algal Blooms in the Waters Near the Yangtze River Estuary Based on Twin Satellites HY-1C/D COCTS Data," Xuan Liu; Lina Cai; Jiahua Li; Tianle Mao. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 4872-4886. DOI: 10.1109/JSTARS.2025.3649548
Pub Date: 2025-12-26. DOI: 10.1109/JSTARS.2025.3648330
Mengmeng Wang;Xu Lin;Yuanxin Ye;Wenhui Wu;Bai Zhu;Yanshuai Dai
Change detection (CD) is a fundamental task that is pivotal in understanding surface changes. Recently, CD methods have advanced rapidly and attained impressive results, driven by deep learning technology. However, existing methods generally employ fusion modules with the same design for multilevel features, overlooking the inherent distinctions between low-level spatial features and deep-level semantic features generated by deep networks. To overcome this limitation, this article proposes a novel CD network, referred to as DACNet. This method introduces a divide-and-conquer fusion strategy designed to fuse multilevel features using different fusion strategies. Specifically, the widely used MobileNetV2 is employed within a dual-branch architecture to extract multilevel features from bitemporal images. Subsequently, the proposed divide-and-conquer fusion strategy comprises two specialized modules: the change region localization module and the edge complementarity module, which are tailored to fuse deep-level semantic features and low-level spatial features, respectively. In addition, to mitigate the unnecessary noise introduced by conventional UNet architectures, attention gates are introduced into the UNet decoder to enhance the changed information and suppress background noise. Extensive experiments are conducted on three publicly available CD datasets: LEVIR-CD, Google-CD, and MSRS-CD. The proposed network achieves favorable results compared to nine state-of-the-art methods across all experiments, improving the F1 score by 0.93%, 1.10%, and 0.81% on the LEVIR-CD, Google-CD, and MSRS-CD datasets, respectively.
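The abstract mentions attention gates in the UNet decoder but not their exact form. The standard additive attention gate from Attention U-Net is one common choice, sketched here with NumPy; the weight shapes and the 1x1-conv-as-matrix simplification are illustrative, not DACNet's actual design:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_gate(skip, gate, Wx, Wg, psi):
    """Additive attention gate (Attention U-Net style): the decoder's gating
    signal re-weights the encoder skip connection so that changed regions pass
    through and background activations are suppressed. Wx, Wg, psi stand in
    for 1x1-conv weights; features have shape (C, H, W)."""
    q = np.tanh(np.einsum('oc,chw->ohw', Wx, skip) +
                np.einsum('oc,chw->ohw', Wg, gate))    # joint projection
    alpha = sigmoid(np.einsum('oc,chw->ohw', psi, q))  # per-pixel weight in (0, 1)
    return skip * alpha                                # reweighted skip features

rng = np.random.default_rng(1)
C, H, W = 4, 8, 8
skip, gate = rng.normal(size=(C, H, W)), rng.normal(size=(C, H, W))
Wx, Wg = rng.normal(size=(C, C)), rng.normal(size=(C, C))
psi = rng.normal(size=(1, C))
out = attention_gate(skip, gate, Wx, Wg, psi)
print(out.shape)  # (4, 8, 8)
```

Because alpha lies in (0, 1), the gate can only attenuate skip features, which is how background noise gets suppressed rather than amplified.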
"A Novel Network for Change Detection Based on a Divide-and-Conquer Fusion Strategy," Mengmeng Wang; Xu Lin; Yuanxin Ye; Wenhui Wu; Bai Zhu; Yanshuai Dai. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 2891-2904. DOI: 10.1109/JSTARS.2025.3648330
Transformer models have been widely adopted for hyperspectral image (HSI) classification due to their exceptional long-sequence modeling capabilities. However, the self-attention mechanism in Transformers incurs quadratic computational complexity, posing challenges in both speed and memory consumption. Recently, a novel state-space model, the Mamba model, has emerged, overcoming the quadratic complexity of self-attention by achieving linear computational complexity while retaining powerful long-sequence modeling. Yet, the original Mamba design does not account for the unique spectral–spatial characteristics of HSI data, making it difficult to capture multiscale features. This limitation can lead to the loss of critical spectral–spatial cues at fine targets and complex boundaries, resulting in increased classification noise, blurred boundary segmentation, and reduced overall accuracy. To address the loss of fine-grained spectral–spatial information in HSI, we propose HyM3S: a hyperspectral multiscale spatial–spectral sequence model that integrates multiscale spatial–spectral convolutions with Mamba's linear sequence modeling. HyM3S first extracts multiscale spatial and spectral features in parallel along horizontal and vertical branches, and reinforces salient channels via channel-wise attention. Features are then adaptively fused across modality and directional dimensions to form a unified joint representation. Finally, this representation is fed into the Mamba module for long-range dependency modeling under linear complexity, thereby significantly improving classification accuracy and suppressing noise. Experiments on four benchmark HSI datasets, Pavia University (PaviaU), Houston2013, WHU-Hi-HanChuan, and WHU-Hi-HongHu, demonstrate the clear superiority of the proposed HyM3S model for HSI classification.
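The abstract's "multiscale spectral features" step can be pictured with a toy example: filtering one pixel's spectrum at several widths and stacking the responses. The box-filter choice, the scale set, and the synthetic spectrum below are all stand-in assumptions, not HyM3S's learned convolutions:

```python
import numpy as np

def multiscale_spectral(x, scales=(3, 7, 15)):
    """Toy multiscale spectral feature extraction: smooth a pixel's spectrum
    with box filters of several widths (in bands) and stack the responses.
    A hand-crafted stand-in for the parallel multiscale convolutions the
    abstract describes."""
    feats = [np.convolve(x, np.ones(s) / s, mode="same") for s in scales]
    return np.stack(feats)  # (n_scales, n_bands)

rng = np.random.default_rng(5)
spectrum = np.sin(np.linspace(0, 6, 100)) + 0.1 * rng.normal(size=100)
F = multiscale_spectral(spectrum)
print(F.shape)  # (3, 100)
```

Narrow kernels preserve sharp absorption features while wide kernels capture the broad spectral envelope; a learned model fuses such responses instead of simply stacking them.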
"HyM3S: Integrating Multiscale Spatial–Spectral Features With Sequence Modeling for Hyperspectral Classification," Yin Chen; Shaoqun Qi; Luhe Wan; Chunlong Du; Zhiwei Lin; Ling Zhu; Xiaona Yu. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 3190-3205. Pub Date: 2025-12-24. DOI: 10.1109/JSTARS.2025.3647643
Pub Date: 2025-12-24. DOI: 10.1109/JSTARS.2025.3647776
Adrian Perez-Portero;Jorge Querol;Adriano Camps
Global navigation satellite systems (GNSSs) are critical infrastructure components in modern positioning, navigation, and timing (PNT) services, playing a vital role in both civilian and defense applications. These systems operate in specific frequency bands that are also utilized by other Earth observation technologies, such as GNSS radio occultation and GNSS reflectometry. Other passive microwave remote sensing techniques, such as microwave radiometers, work with very faint signals in nearby frequency bands within the L-band. However, the increasing prevalence of radio-frequency interference (RFI) poses a significant threat, potentially compromising the integrity and reliability of PNT services and corrupting geophysical observations. Effective RFI mitigation relies on accurate detection and classification of interference sources, a task that becomes increasingly challenging due to the complexity and diversity of RFI signals. This work presents an automated classification system for RFI detection and characterization in GNSS bands. The methodology employs advanced digital signal processing techniques and statistical algorithms to improve RFI detection and classification. RFI events are then stored in a long-term database to provide insights into the local spectrum and to aid in mitigation and law enforcement efforts. This study provides a description of the classification system, including its architecture, implementation, and performance analysis. The results highlight the potential of this system to enhance the resilience of GNSS PNT services against RFI.
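The abstract does not detail the paper's statistical algorithms. A classic example of the kind of statistical RFI screen used in L-band radiometry is the kurtosis test, sketched below as a generic illustration (the threshold and signal parameters are arbitrary, not the paper's settings):

```python
import numpy as np

def kurtosis_flag(samples, threshold=0.3):
    """Generic kurtosis-based RFI screen: thermal noise is Gaussian with
    kurtosis 3, so a significant deviation flags a candidate interference
    event. Shown as an illustration, not the paper's specific pipeline."""
    x = samples - samples.mean()
    k = float(np.mean(x**4) / np.mean(x**2) ** 2)
    return abs(k - 3.0) > threshold, k

rng = np.random.default_rng(42)
noise = rng.normal(size=100_000)   # clean radiometric noise
pulsed = noise.copy()
pulsed[::100] += 10.0              # strong pulsed interference, 1% duty cycle
print(kurtosis_flag(noise)[0], kurtosis_flag(pulsed)[0])  # False True
```

Pulsed interference inflates the fourth moment far more than the second, so the kurtosis departs sharply from 3; a continuous-wave tone instead pushes kurtosis below 3, which is why the test is two-sided.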
"Automatic RFI Detection, Location, and Classification System in GNSS Bands," Adrian Perez-Portero; Jorge Querol; Adriano Camps. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 3072-3084. DOI: 10.1109/JSTARS.2025.3647776
Pub Date: 2025-12-24. DOI: 10.1109/JSTARS.2025.3647680
Muhammad Yasir;Shanwei Liu;Mingming Xu;Fernando J. Aguilar;Jianhua Wan;Shiqing Wei;Saied Pirasteh;Hong Fan;Qamar Ul Islam
Tracking objects in synthetic aperture radar (SAR) imagery is critical for maritime surveillance, traffic monitoring, and security applications, but remains a major challenge due to speckle noise, sea clutter, and limited temporal continuity. Most existing tracking-by-detection methods process frames independently, often resulting in weak associations and frequent identity switches (IDs). To overcome these limitations, we propose TFST, a two-frame SAR ship tracking framework that integrates detection, feature encoding, and optimal assignment. The goal of this work is thus to address current gaps in SAR ship tracking by strengthening cross-frame associations and reducing identity switches through an integrated two-frame tracking framework. In our approach, a deep detector first processes consecutive frames to generate candidate bounding boxes. A lightweight feature extractor encodes both appearance and structural cues, while a matching module constructs a cost matrix that combines feature similarity and positional consistency. Gating is applied to remove infeasible associations, and the Hungarian algorithm is employed to achieve a globally optimal assignment. Quantitative evaluations performed on three widely known and publicly available SAR ship datasets (SSTD, SSDD, and SAR-Ship) further highlight the advantages of TFST. In terms of ship detection performance, TFST achieved an average mAP@50 improvement of 2.2% over the YOLOv12 baseline model on all three tested datasets. Regarding tracking results, the superiority of TFST over state-of-the-art multiobject trackers becomes even more evident. In fact, the proposed model achieved the highest multiple object tracking accuracy (MOTA) (86.9%) and the best IDF1 score (82.7%), outperforming strong baselines such as Siam-SORT (82.1% MOTA and 79.8% IDF1) and TrackFormer (80.7% MOTA and 78.7% IDF1). 
In conclusion, TFST demonstrated improved robustness, fewer ID switches, and higher tracking accuracy compared to baseline methods, underscoring its effectiveness in complex maritime environments.
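The gating-plus-Hungarian association step described in the abstract can be sketched directly with SciPy; the cost values and the 0.7 gate threshold below are illustrative, not TFST's tuned parameters:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(cost, gate_thresh=0.7):
    """Two-frame association as the abstract describes it: a cost matrix
    mixing appearance and position terms, infeasible pairs gated out, then a
    globally optimal match via the Hungarian algorithm."""
    gated = np.where(cost > gate_thresh, 1e6, cost)  # forbid implausible pairs
    rows, cols = linear_sum_assignment(gated)
    return [(int(r), int(c)) for r, c in zip(rows, cols) if gated[r, c] < 1e6]

# rows: detections in frame t; cols: detections in frame t+1.
# e.g. cost = 0.5*(1 - feature similarity) + 0.5*(normalized center distance)
cost = np.array([[0.10, 0.80, 0.95],
                 [0.85, 0.15, 0.90],
                 [0.92, 0.88, 0.95]])  # third ship has no feasible match
print(associate(cost))                 # [(0, 0), (1, 1)]
```

Gating before the global solve is what prevents the Hungarian step from forcing a match for the third detection, which would otherwise create a spurious track and a later identity switch.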
"TFST: Two-Frame Ship Tracking for SAR Using YOLOv12 and Feature-Based Matching," Muhammad Yasir; Shanwei Liu; Mingming Xu; Fernando J. Aguilar; Jianhua Wan; Shiqing Wei; Saied Pirasteh; Hong Fan; Qamar Ul Islam. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 3175-3189. DOI: 10.1109/JSTARS.2025.3647680
Pub Date: 2025-12-24. DOI: 10.1109/JSTARS.2025.3648014
Wenjun Hong;Zhanchao Huang;Yongke Yang;Luping You;Junchao Cai;Jiajun Zhou;Weiwang Guan;Hua Su
Sea ice recognition is of great significance for reflecting climate change and ensuring ship navigation safety. In recent years, many deep learning-based methods have been proposed and applied to the segmentation and recognition of sea ice regions. However, existing deep learning models often struggle to effectively capture the subtle spectral differences between sea ice and seawater, as well as the large-scale spatial dependencies in high-resolution remote sensing images, resulting in limited segmentation accuracy in areas with ambiguous ice–water boundaries. This article proposes a red–blue normalized difference index (RB-NDI) guided state-space model (SSM) approach for sea ice segmentation, termed SI-Mamba. In the proposed SI-Mamba, the index-guided collaborative enhancement module employs an RB-NDI index-guided SSM mechanism to overcome the limitations in explicit modeling of spectral features, achieving efficient modeling of long-range spatial dependencies in sea ice distribution. Furthermore, the designed dynamic boundary focus loss function adjusts the model's expressive capability in edge-sensitive regions through collaborative optimization between the main segmentation head and the index-assisted head. Experiments on multiple generated sea ice datasets demonstrate that the proposed SI-Mamba achieves improved performance in sea ice segmentation and identification in optical remote sensing images. It significantly enhances ice–water boundary recognition accuracy and generalization capability in complex scenarios, offering a novel and effective solution for remote sensing-based sea ice recognition.
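The abstract does not define RB-NDI explicitly; its name suggests the usual normalized-difference form (R - B)/(R + B), sketched below under that assumption (the toy reflectance values are invented for illustration):

```python
import numpy as np

def rb_ndi(red, blue, eps=1e-6):
    """Red-blue normalized difference in the standard (R - B)/(R + B) form
    implied by the index's name; the paper's exact definition may differ.
    Sea ice is bright and spectrally flat, open water darker and relatively
    blue, so the index can help separate ambiguous ice-water pixels."""
    red = np.asarray(red, dtype=np.float64)
    blue = np.asarray(blue, dtype=np.float64)
    return (red - blue) / (red + blue + eps)

# toy 2x2 scene: left column ice (bright, flat), right column water (dark, bluish)
red = np.array([[0.80, 0.03], [0.78, 0.04]])
blue = np.array([[0.75, 0.08], [0.74, 0.09]])
ndi = rb_ndi(red, blue)
print((ndi > 0).astype(int))  # ice pixels positive, water pixels negative
```

In SI-Mamba such an index map is used as guidance for the state-space model rather than as a hard threshold, which is why ambiguous boundary pixels still benefit from spatial context.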
"SI-Mamba: High-Resolution Sea Ice Recognition via RB-NDI Guided State-Space Model," Wenjun Hong; Zhanchao Huang; Yongke Yang; Luping You; Junchao Cai; Jiajun Zhou; Weiwang Guan; Hua Su. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 2783-2795. DOI: 10.1109/JSTARS.2025.3648014
Remote sensing (RS) change detection (CD), a technique focused on identifying surface alterations from bitemporal images, holds substantial significance for various applications, such as land management and disaster surveillance. In the last decade, deep learning-based CD methods have advanced rapidly. However, recognizing sporadic distributional changes over time in complex scenes remains a significant challenge. Moreover, many existing solutions rely on large model capacities and high computational costs, yet still fail to incorporate sufficient semantic information for accurately recognizing complex real changes. To tackle this challenge, a lightweight full-information and dual-guide network (Lighter) for RSCD is presented. Specifically, we design a lightweight full-information mingling module that emphasizes the injection of multiperspective information during the feature interaction. This approach leverages rich semantics as cues to reason about diverse changes. Furthermore, we propose a lightweight dual-guide difference capture module, which utilizes the unique information of each guide to inform the other, thereby reducing the interference of pseudovariations. Extensive experiments on four datasets demonstrate that our lightweight architecture achieves state-of-the-art performance with only 1.10 M parameters and 2.01 G FLOPs.
Lighter: A Lightweight Full-Information and Dual-Guide Network for Remote Sensing Image Change Detection
Yuan Wang;Sixian Chan;Guoyu Yang;Jian Tao;Tianyang Dong;Xiaolong Zhou;Xiaoqin Zhang
DOI: 10.1109/JSTARS.2025.3647926
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 2924–2937
Pub Date: 2025-12-24
DOI: 10.1109/JSTARS.2025.3648023
Bei Cheng;Beihao Xu;Wenjie Gan;Qingwang Wang
Multimodal object detection plays a crucial role in all-weather and multiscene applications of aerial imagery. Existing studies mainly focus on multimodal fusion and interlevel feature interaction during feature extraction, that is, correcting or enhancing dual-branch weights through fused features or multimodal interactions, while neglecting the supplementation of missing modality features. This limitation can lead to noise propagation across layers and a reduction in interaction capability caused by feature absence. In this article, we propose a multimodal collaborative interactive soft fusion network (MCISFNet) for RGB-infrared aerial image object detection. The proposed method introduces a saliency-guided multimodal soft fusion mechanism (SMSFM), which explicitly directs attention to and enhances critical regions, dynamically adjusts feature weights, and integrates complementary information to mitigate the problem of missing data in dual-branch representations. To address the complexity of aerial scenarios, we further develop a multiscale interactive gating module (MIGM) that explicitly incorporates multiscale contextual information, enabling fine-grained refinement of primary modality features and enhancing the discriminability of fused representations. Moreover, we design a cross-modal global context collaborative modeling (CGCCM) strategy, in which a cross-modal shared branch is constructed to jointly perform context extraction and feature fusion. This collaborative design not only improves the alignment of deep semantic features but also ensures that the learned RGB and IR features are more consistent and complementary, while reducing computational cost. Extensive experiments conducted on three multimodal aerial image detection datasets (DroneVehicle, VEDAI, and ODinMJ) demonstrate the robustness and generalization capability of the proposed MCISFNet framework.
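The abstract describes soft fusion as dynamically adjusting per-modality feature weights; the actual SMSFM weighting is not specified, but the general idea of a per-pixel softmax-weighted convex combination of RGB and infrared features can be sketched as follows (the magnitude-based saliency stand-in and the function name `soft_fuse` are illustrative assumptions, not the paper's design):

```python
import numpy as np

def soft_fuse(rgb_feat: np.ndarray, ir_feat: np.ndarray) -> np.ndarray:
    """Convex per-pixel fusion of two (C, H, W) feature maps.

    Uses channel-mean magnitude as a crude saliency proxy; SMSFM's
    real guidance mechanism is not described in the abstract.
    """
    s_rgb = np.abs(rgb_feat).mean(axis=0, keepdims=True)  # (1, H, W)
    s_ir = np.abs(ir_feat).mean(axis=0, keepdims=True)
    logits = np.stack([s_rgb, s_ir])                      # (2, 1, H, W)
    # Softmax over the modality axis: weights sum to 1 per pixel.
    w = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
    return w[0] * rgb_feat + w[1] * ir_feat

rgb = np.random.default_rng(0).normal(size=(8, 4, 4))
ir = np.random.default_rng(1).normal(size=(8, 4, 4))
fused = soft_fuse(rgb, ir)
```

Because the weights form a per-pixel convex combination, each fused value stays between the corresponding RGB and IR feature values, so neither modality is ever fully discarded.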
Multimodal Collaborative Interactive Soft Fusion Network for RGB-Infrared Aerial Image Object Detection
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 3206–3218
Pub Date: 2025-12-24
DOI: 10.1109/JSTARS.2025.3647732
Haiwei Yuan;Shikuan Wang;Jianzhou Gong
Intertidal mangrove communities present significant challenges for remote sensing classification due to tidal dynamics, intertwining canopies, and spectral confusion between species. These factors often lead to classification errors and indistinct species boundaries, primarily due to intraspectral variation and interspectral similarity, and also limit the capture of vertical structural features. To address these issues, we propose MSMNet, a specialized multimodal deep learning method for fine-grained mangrove species classification based on Sentinel-2 multispectral imagery. The model uses ResNet50 as its backbone architecture and integrates the Mamba state-space module to model long-range spatial correlations. MSMNet incorporates wavelet transform technology to enhance its ability to represent textures and uses three pathways to extract three key mangrove remote sensing modalities: morphological texture, spectral features, and vegetation physiological characteristics. The design of the multimodal dynamic fusion and enhanced multiscale integration modules supports cross-modal adaptive weight allocation and efficient cross-scale feature aggregation, leveraging complementary information across dimensions. Experimental results demonstrate that MSMNet significantly outperforms all baseline models, achieving 73.80% mIoU, an 84.47% F1-score, and 99.37% overall accuracy. Compared to the second-best approach, these metrics improved by 2.25%, 1.59%, and 0.06%, respectively. MSMNet notably demonstrated exceptional performance in classifying key mangrove species, such as Rhizophora stylosa, achieving a 4.06% accuracy improvement and significantly reducing misclassification rates at the species level. These findings confirm that multimodal feature fusion and multiscale information integration are crucial for improving the accuracy of mangrove species classification. MSMNet provides an efficient solution for precise intertidal mangrove mapping.
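The reported mIoU and F1-score follow their standard per-class definitions, which can be computed directly from a confusion matrix; a minimal sketch (the toy 2-class matrix is illustrative and assumes every class appears at least once):

```python
import numpy as np

def miou_and_f1(conf: np.ndarray):
    """Mean IoU and macro F1 from a square confusion matrix
    (rows = ground truth, columns = prediction)."""
    conf = conf.astype(np.float64)
    tp = np.diag(conf)
    fp = conf.sum(axis=0) - tp   # predicted as class but wrong
    fn = conf.sum(axis=1) - tp   # true class but missed
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou.mean(), f1.mean()

# Toy 2-class confusion matrix (values are illustrative).
conf = np.array([[50, 10],
                 [5, 35]])
miou, f1 = miou_and_f1(conf)
```

Note that macro F1 is always at least as large as mean IoU for the same confusion matrix, consistent with the 84.47% versus 73.80% figures reported above.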
A Multimodal Remote Sensing Method for Mangrove Species Classification Based on Sentinel-2 Imagery
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 4766–4778