Pub Date: 2026-01-20 | DOI: 10.1109/JSTARS.2026.3655691
Yuan Yuan;Junhan Zhou;Lei Lin;Ying Yu;Qingshan Liu
Optical satellite time series data play a crucial role in monitoring vegetation dynamics and land surface changes. However, persistent cloud cover often leads to missing data, particularly during critical phenological stages, which significantly diminishes data quality and hinders downstream applications. To address this issue, we present conditional optical-SAR multitemporal diffusion (CosmDiff), a novel framework for reconstructing optical satellite time series by integrating multimodal, multitemporal optical and synthetic aperture radar (SAR) data using conditional diffusion models. In CosmDiff, the reconstruction task is formulated as a multivariate time series imputation problem, where missing values are modeled as conditionally dependent on both cloud-free optical observations and synergistic SAR time series. The framework incorporates a Transformer-based network within the diffusion process, introducing a novel dimensional decomposition attention mechanism that fuses optical-SAR time series across both temporal and feature dimensions. This mechanism enables the dynamic extraction of essential and complementary features from both modalities. In addition, linearly interpolated optical time series are used as auxiliary inputs to further guide the imputation process. Experimental results on Sentinel-1/-2 datasets demonstrate that CosmDiff consistently outperforms both traditional interpolation methods and advanced deep learning approaches, achieving a 3.8% reduction in mean absolute error and a 6.8% improvement in spectral angle mapper compared to competing methods. Furthermore, CosmDiff provides comprehensive uncertainty estimates for its predictions, which are particularly valuable for decision-making applications.
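The "dimensional decomposition attention" described above can be illustrated with a minimal numpy sketch: self-attention is applied first along the temporal axis (tokens = acquisition dates) and then along the feature axis (tokens = fused optical-SAR channels). This is an assumption about the decomposition order based on the abstract, not the paper's exact layer; projections, multiple heads, and diffusion conditioning are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Single-head scaled dot-product self-attention over the rows of x."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores) @ x

def dimensional_decomposition_attention(z):
    """Attend along the temporal axis, then along the feature axis.

    z: (T, F) array, T acquisition dates x F fused optical-SAR features.
    """
    z = self_attention(z)        # tokens = time steps
    z = self_attention(z.T).T    # tokens = feature channels
    return z

rng = np.random.default_rng(0)
z = rng.normal(size=(12, 6))     # 12 dates, 6 features
out = dimensional_decomposition_attention(z)
print(out.shape)                 # (12, 6)
```

Factoring attention into two 1-D passes keeps the cost at O(T² + F²) per block instead of O((TF)²) for joint attention over all time-feature tokens.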
"CosmDiff: Integrating Multitemporal Optical-SAR Data With Conditional Diffusion Models for Optical Satellite Time Series Reconstruction," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 5722–5740. DOI: 10.1109/JSTARS.2026.3655691.
Transformer-based architectures have shown strong potential in hyperspectral unmixing due to their powerful modeling capabilities. However, most existing transformer-based methods still struggle to effectively capture and fuse spatial–spectral features, and their predominant reliance on reconstruction error further constrains overall unmixing performance. Moreover, they rarely account for the nonlinear correlations that inherently exist between the spatial and spectral domains. To address these challenges, we propose a sampling-based spatial–spectral transformer and generative adversarial network (SSST-GAN). The proposed model employs a dual-branch, sampling-based transformer encoder to independently extract spatial and spectral representations. Specifically, the spatial branch adopts a full-sampling multihead attention mechanism to capture rich contextual dependencies among spatial pixels, while the spectral branch utilizes a sparse sampling strategy to efficiently distill key information from high-dimensional spectral data. A feature enhancement module is introduced to integrate and strengthen the complementary characteristics of spatial and spectral features. To further improve the modeling of complex nonlinear mixing patterns, we incorporate a generalized nonlinear fluctuation model at the decoding stage. In addition, SSST-GAN leverages a generative adversarial learning framework, in which a discriminator evaluates the authenticity of reconstructed pixels, thereby enhancing the fidelity of the unmixing results. Extensive experiments on both synthetic and real-world datasets demonstrate that SSST-GAN consistently outperforms several state-of-the-art methods in terms of unmixing accuracy.
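The spectral branch's sparse sampling idea can be sketched in miniature: queries come from all band tokens, but keys and values are drawn only from every `stride`-th band, reducing the attention cost on high-dimensional spectra. The fixed-stride rule is an assumption for illustration; the paper's sampling strategy may be learned or data-dependent.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sparse_sampling_attention(x, stride=4):
    """Queries are all band tokens; keys/values only every stride-th band.

    x: (B, D) array of B spectral-band tokens with D-dim embeddings.
    Cost drops from O(B^2) to O(B^2 / stride) score entries.
    """
    kv = x[::stride]
    scores = x @ kv.T / np.sqrt(x.shape[-1])
    return softmax(scores) @ kv

rng = np.random.default_rng(1)
bands = rng.normal(size=(224, 32))   # 224 spectral bands, 32-dim embeddings
out = sparse_sampling_attention(bands)
print(out.shape)                     # (224, 32)
```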
Yu Zhang; Jiageng Huang; Yefei Huang; Wei Gao; Jie Chen, "SSST-GAN: A Sampling-Based Spatial-Spectral Transformer and Generative Adversarial Network for Hyperspectral Unmixing," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 5741–5757, published 2026-01-19. DOI: 10.1109/JSTARS.2026.3655512.
Infrared imaging plays a crucial role in applications such as search-and-rescue operations and fire monitoring, due to its robustness under complex environmental conditions. Nevertheless, the inherently low spatial resolution of infrared cameras and the complicated imaging degradation process still constrain the quality of captured images, thereby posing challenges for downstream tasks. Existing infrared image super-resolution methods (e.g., diffusion-based methods) often neglect the unique modality characteristics of infrared images and fail to effectively introduce additional fine-grained information. To address these limitations, we propose a novel framework named visible-light-guided infrared image super resolution with dual amplitude-phase optimization (vap-SR). By leveraging the powerful generative capability of conditional diffusion and fully exploiting the rich structural priors embedded in visible images, vap-SR effectively compensates for the deficiencies of infrared images in terms of detail, thereby overcoming inherent limitations in texture fidelity. Phase and amplitude losses are designed to preserve the physical characteristics of the infrared modality while effectively leveraging the structural information of visible-light images. Extensive experiments demonstrate that vap-SR consistently outperforms state-of-the-art methods in both reconstruction quality and downstream object detection tasks, validating its effectiveness for infrared super resolution.
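Amplitude and phase losses of the kind described above are commonly defined in the 2-D Fourier domain, where the amplitude spectrum carries energy/contrast and the phase spectrum carries structure. A minimal sketch with L1 penalties on each component (the paper's exact formulation and weighting are not given in the abstract; a production version would also wrap the phase difference to (-pi, pi]):

```python
import numpy as np

def amplitude_phase_losses(pred, target):
    """L1 losses on FFT amplitude and phase of two images."""
    P, T = np.fft.fft2(pred), np.fft.fft2(target)
    amp_loss = np.abs(np.abs(P) - np.abs(T)).mean()       # radiometry/contrast
    phase_loss = np.abs(np.angle(P) - np.angle(T)).mean()  # structure
    return amp_loss, phase_loss

rng = np.random.default_rng(0)
img = rng.random((32, 32))
amp, pha = amplitude_phase_losses(img, img)
print(amp, pha)   # 0.0 0.0 for identical images
```

Penalizing the infrared amplitude while borrowing visible-light phase structure is one plausible way such a dual objective separates "what the sensor measured" from "where the edges are."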
Qingwang Wang; Yuhang Wu; Pengcheng Jin; Yan Lin; Zhen Zhang; Tao Shen, "Visible-Light-Guided Infrared Image Super Resolution With Dual Amplitude-Phase Optimization," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 5774–5784, published 2026-01-19. DOI: 10.1109/JSTARS.2026.3655485.
Pub Date: 2026-01-19 | DOI: 10.1109/JSTARS.2026.3655359
Jianing Shao;Yanlei Du;Xiaofeng Yang;Longxiang Linghu;Jinsong Chong;Jian Yang
This study numerically investigates the spatial ergodicity of Doppler characteristics in polarimetric ocean radar scattering. The full Apel wave spectrum is employed to generate 2-D time-varying sea surfaces that involve all dominant large-scale gravity waves and small-scale capillary waves. By solving the radar scattering from time-varying ocean surfaces with various illumination sizes using the second-order small-slope approximation (SSA-2) model, the Doppler spectra, along with the Doppler shift and width, are thus computed and analyzed. The numerical simulations are conducted at L-band for three typical fully developed sea states. A Doppler shift error threshold is defined based on the accuracy requirements of sea surface current retrieval, and the spatial ergodicity of Doppler shift is evaluated quantitatively. Simulation results indicate that under co-polarization, the Doppler shift manifests spatial ergodicity when the sea surface size illuminated by radar is no less than one-quarter of the largest gravity wave wavelength at the corresponding sea state. For cross-polarization, the spatial ergodicity of the Doppler shift is significantly reduced and is observed only when the illumination size exceeds about one-half of the largest gravity wave wavelength. The results also indicate that wind direction has a limited effect on the spatial ergodicity of the Doppler shift.
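The Doppler shift and width analyzed above are conventionally defined as the first moment (centroid) and the square root of the second central moment of the Doppler spectrum. A small numpy sketch of these standard moment estimators (the paper's exact estimator and the SSA-2 scattering computation are not reproduced here):

```python
import numpy as np

def doppler_moments(freqs, spectrum):
    """Centroid (Doppler shift) and RMS width of a Doppler spectrum."""
    p = spectrum / spectrum.sum()                       # normalize to a pdf
    shift = (freqs * p).sum()                           # first moment
    width = np.sqrt(((freqs - shift) ** 2 * p).sum())   # second central moment
    return shift, width

f = np.linspace(-50.0, 50.0, 1001)
s = np.exp(-0.5 * ((f - 12.0) / 5.0) ** 2)   # synthetic Gaussian line at +12 Hz
shift, width = doppler_moments(f, s)
print(f"{shift:.2f} Hz, {width:.2f} Hz")     # ~12.00 Hz, ~5.00 Hz
```

Checking whether these moments converge as the illuminated surface patch grows is the essence of the spatial-ergodicity test the paper performs.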
"Spatial Ergodicity of Doppler Characteristics in Polarimetric Ocean Radar Scattering: A Numerical Study," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 5493–5506. DOI: 10.1109/JSTARS.2026.3655359.
Pub Date: 2026-01-19 | DOI: 10.1109/JSTARS.2026.3655550
Ch Muhammad Awais;Marco Reggiannini;Davide Moroni;Oktay Karakus
High-resolution imagery plays a critical role in improving the performance of visual recognition tasks such as classification, detection, and segmentation. In many domains, including remote sensing and surveillance, low-resolution images can limit the accuracy of automated analysis. To address this, superresolution techniques have been widely adopted to reconstruct high-resolution images from low-resolution inputs. Traditional approaches, however, focus solely on enhancing image quality based on pixel-level metrics, leaving the relationship between superresolved image fidelity and downstream classification performance largely underexplored. This raises a key question: Can integrating classification objectives directly into the superresolution process further improve classification accuracy? In this article, we address this question by investigating the relationship between superresolution and classification through a specialized algorithmic strategy. We propose a novel methodology that increases the resolution of synthetic aperture radar imagery by optimizing loss functions that account for both image quality and classification performance. Our approach improves image quality, as measured by established image quality metrics, while also enhancing classification accuracy.
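A loss that "accounts for both image quality and classification performance" typically takes the form of a weighted sum of a pixel-fidelity term and the cross-entropy of a downstream classifier evaluated on the super-resolved image. The sketch below is a generic form of that idea, with a hypothetical balance weight `alpha`; it is not the paper's specific loss:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def classification_aware_sr_loss(sr, hr, logits, label, alpha=0.5):
    """Pixel-fidelity L1 plus classifier cross-entropy on the SR output.

    sr, hr: super-resolved and reference images; logits: classifier output
    for the SR image; label: ground-truth class index.
    """
    pixel = np.abs(sr - hr).mean()                 # image-quality term
    ce = -np.log(softmax(logits)[label])           # classification term
    return alpha * pixel + (1.0 - alpha) * ce

rng = np.random.default_rng(0)
hr = rng.random((16, 16))
# Perfect reconstruction: only the classification term contributes.
loss = classification_aware_sr_loss(hr, hr, np.array([2.0, 0.1, -1.0]), label=0)
print(loss > 0)   # True: pixel term is 0, CE term is positive
```

Backpropagating the classification term through the superresolution network is what makes the enhancement "classification-aware" rather than purely pixel-driven.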
"A Classification-Aware Superresolution Framework for Ship Targets in SAR Imagery," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 6614–6622. DOI: 10.1109/JSTARS.2026.3655550.
Pub Date: 2026-01-19 | DOI: 10.1109/JSTARS.2026.3655376
Haiwei Yu;Huapeng Li;Jian Lu;Tongtong Zhao;Baoqi Liu
Accurate crop yield estimation is essential for global food security, especially high-resolution mapping that supports field-scale management and detailed yield gap analysis. This study developed a hybrid yield estimation framework, named ensemble Kalman filter-random forest (EnKF-RF), which coupled data assimilation with a two-stage random forest approach. In this framework, Sentinel-2-derived leaf area index was first assimilated into the WOrld FOod Studies model using the EnKF algorithm. An RF-based metamodel (RF_SIM) was then trained to approximate the assimilation process, followed by a second RF model (RF_FIELD) that integrated land surface phenology, extreme-climate indicators, and limited ground observations to estimate crop yield. The proposed framework was applied to maize yield estimation in Jilin Province, China, during 2022–2024. The results showed that EnKF-RF achieved superior performance [R² = 0.476, root-mean-square error (RMSE) = 1565.87 kg/ha, and mean absolute error (MAE) = 1299.42 kg/ha] compared with a standalone random forest (R² = 0.394, RMSE = 1685.04 kg/ha, and MAE = 1428.69 kg/ha) and the scalable crop yield mapper approach. Furthermore, the implementation of the metamodel substantially enhanced the efficiency of the EnKF-RF framework, allowing annual maize yield estimation to be achieved within 18 min per 10 000 km² in Jilin Province when utilizing Google Earth Engine. Water availability was identified as the primary driver of interannual yield variability, especially due to spring drought and the co-occurrence of water stress and waterlogging during July and August, according to SHapley Additive exPlanations (SHAP) analysis. Generally, EnKF-RF provides a scalable and efficient solution for high-resolution maize yield mapping, particularly in data-scarce regions.
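The EnKF analysis step that assimilates an observed leaf area index into a model-state ensemble can be sketched in its textbook stochastic form: each member's predicted observation is nudged toward a perturbed copy of the measurement via a Kalman gain estimated from ensemble covariances. This is the generic algorithm, not the WOFOST-specific implementation; state and observation dimensions here are illustrative.

```python
import numpy as np

def enkf_update(ensemble, obs, obs_err, H, rng):
    """Stochastic EnKF analysis step.

    ensemble: (n, dim) state members; obs: (m,) observation;
    obs_err: observation std; H: (m, dim) linear observation operator.
    """
    n = ensemble.shape[0]
    X = ensemble - ensemble.mean(axis=0)            # state anomalies
    Y = ensemble @ H.T                              # predicted observations
    Yp = Y - Y.mean(axis=0)
    Pxy = X.T @ Yp / (n - 1)                        # state-obs covariance
    Pyy = Yp.T @ Yp / (n - 1) + np.diag(np.atleast_1d(obs_err) ** 2)
    K = Pxy @ np.linalg.inv(Pyy)                    # Kalman gain
    perturbed = obs + obs_err * rng.standard_normal(Y.shape)
    return ensemble + (perturbed - Y) @ K.T

rng = np.random.default_rng(42)
prior = 2.0 + 0.5 * rng.standard_normal((200, 1))   # prior LAI ensemble, mean ~2
post = enkf_update(prior, np.array([3.0]), 0.2, np.array([[1.0]]), rng)
print(round(post.mean(), 2))   # pulled from ~2.0 toward the 3.0 observation
```

Because the accurate observation (std 0.2) outweighs the diffuse prior (std 0.5), the analysis mean lands close to the measurement and the ensemble spread shrinks, which is exactly the behavior the RF_SIM metamodel is trained to emulate cheaply.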
"Metamodel-Accelerated High-Resolution Maize Yield Mapping via Sentinel-2 Assimilation and Random Forest," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 6341–6358. DOI: 10.1109/JSTARS.2026.3655376.
Pub Date: 2026-01-19 | DOI: 10.1109/JSTARS.2026.3655350
Wenjing Li;Libin Du;Xinglei Zhao
Accurate water-land classification is fundamental for topographic mapping and coastal zone monitoring based on airborne LiDAR bathymetry (ALB). However, due to the limited information content and feature ambiguity of one-dimensional (1-D) waveform signals, accurate classification from single-wavelength ALB data remains challenging. To address this issue, a dual-branch multimodal fusion network (CRMF-Net) is proposed to improve both classification accuracy and robustness. The proposed network consists of a convolutional neural network (CNN) branch and a convolutional block attention module optimized residual neural network branch, which are designed to capture complementary temporal and spatial features, respectively. The 1-D green waveform is converted into a 2-D time-frequency representation through the continuous wavelet transform, thereby increasing the dimensions and quantity of waveform features. By jointly exploiting complementary information from waveform signals and their corresponding time–frequency representations, the proposed method enables more effective feature representation without relying on extensive handcrafted analysis. Experiments conducted on CZMIL datasets from Qinshan Island demonstrate that CRMF-Net achieves an overall accuracy of 97.33% with a kappa coefficient of 0.9168, outperforming traditional methods, such as fuzzy C-means, support vector machine, and the one-dimensional convolutional neural network approach. These results indicate that the proposed method provides a promising solution for fully automated processing of single-wavelength ALB data.
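Converting a 1-D waveform into a 2-D time-frequency image via the continuous wavelet transform can be sketched with a hand-rolled Ricker-wavelet CWT (the wavelet family used by the paper is not stated in the abstract, so the Ricker choice here is an assumption; a Morlet wavelet would work the same way):

```python
import numpy as np

def ricker(points, a):
    """Ricker (Mexican-hat) wavelet of width a, sampled at `points` samples."""
    t = np.arange(points) - (points - 1) / 2.0
    amp = 2.0 / (np.sqrt(3.0 * a) * np.pi ** 0.25)
    return amp * (1.0 - (t / a) ** 2) * np.exp(-(t ** 2) / (2.0 * a ** 2))

def cwt_image(signal, widths):
    """CWT of a 1-D waveform into a (scales x time) 2-D map."""
    out = np.empty((len(widths), len(signal)))
    for i, w in enumerate(widths):
        kernel = ricker(min(10 * int(w), len(signal)), w)
        out[i] = np.convolve(signal, kernel, mode="same")
    return out

t = np.linspace(0.0, 1.0, 400)
waveform = np.exp(-((t - 0.5) ** 2) / 0.001)     # idealized green-channel return
tf = cwt_image(waveform, widths=np.arange(1, 31))
print(tf.shape)   # (30, 400): 30 scales x 400 time samples
```

The resulting scale-time image is the kind of 2-D input a CNN branch can consume, which is how the 1-D waveform's "dimensions and quantity of features" are increased before fusion.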
"CRMF-Net: A Multimodal Fusion Network for Water–Land Classification From Single-Wavelength Bathymetric LiDAR," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 5804–5813. DOI: 10.1109/JSTARS.2026.3655350.
Pub Date: 2026-01-19 | DOI: 10.1109/JSTARS.2026.3655144
Marta Alonso Tubía;Miguel Baena Botana;An Vo Quang;Ana Burgin;Oliva Garcia Cantú-Ros
Dynamic population mapping has become crucial for capturing real-time human movement and behavior, beyond traditional population mapping relying on census data. Differentiating indoor and outdoor activity enhances accuracy for smart city planning, emergency response, public health, or emerging technologies like Innovative Air Mobility, where pedestrian data informs safer, less disruptive flight planning. Data passively collected from mobile networks have proven to be highly effective in accurately capturing population presence and mobility patterns. By enhancing this rich data source with GPS data for spatial accuracy and validating the results with satellite imagery of detected pedestrians, we provide a procedure for indoor and outdoor population detection. The results show agreement between both methodologies. Despite some limitations related to GPS data biases and pedestrian detection issues caused by urban furniture and shadows, the procedure demonstrates strong potential to capture people’s movements, which could ultimately enable near real-time monitoring of population presence on the streets.
"Toward Outdoor Population Presence Monitoring With Mobile Network Data and Satellite Imagery," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 5834–5852. DOI: 10.1109/JSTARS.2026.3655144.
Pub Date : 2026-01-16DOI: 10.1109/JSTARS.2026.3654602
Ming Tong;Shenghua Fan;Jiu Jiang;Hezhi Sun;Jisan Yang;Chu He
Recently, deep learning-based detectors have advanced the state of the art in ship detection for synthetic aperture radar (SAR) images. However, constructing discriminative features against background scattering and precisely delineating ship contours remain challenging owing to the inherent scattering mechanisms of SAR. In this article, a dual-branch detection framework with perception of scattering characteristics and geometric contours is introduced to address these problems. First, a scattering characteristic perception branch is proposed to fit the scattering distribution of SAR ships through a conditional diffusion model, which introduces learnable scattering features. Second, a convex contour perception branch is designed as a two-stage coarse-to-fine pipeline that delimits the irregular boundaries of ships by learning scattering key points. Finally, a cross-token integration module following a Bayesian framework is introduced to adaptively couple scattering and texture features and thereby learn to construct discriminative representations. Furthermore, comprehensive experiments on three authoritative SAR datasets for oriented ship detection demonstrate the effectiveness of the proposed method.
{"title":"Dual-Perception Detector for Ship Detection in SAR Images","authors":"Ming Tong;Shenghua Fan;Jiu Jiang;Hezhi Sun;Jisan Yang;Chu He","doi":"10.1109/JSTARS.2026.3654602","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3654602","url":null,"abstract":"Recently, detectors based on deep learning have boosted the state-of-the-art of application on ship detection in synthetic aperture radar (SAR) images. However, constructing discriminative feature from scattering of background and distinguishing contour of ship precisely still present challenging subject to the inherent scattering mechanism of SAR. In this article, a dual-branch detection framework with perception of scattering characteristic and geometric contour is introduced to deal with the problem. First, a scattering characteristic perception branch is proposed to fit the scattering distribution of SAR ship through conditional diffusion model, which introduces learnable scattering feature. Second, a convex contour perception branch is designed as two-stage coarse-to-fine pipeline to delimit the irregular boundary of ship by learning scattering key points. Finally, a cross-token integration module following Bayesian framework is introduced to couple features of scattering and texture adaptively to learn construction of discriminative feature. 
Furthermore, comprehensive experiments on three authoritative SAR datasets for oriented ship detection demonstrate the effectiveness of proposed method.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"4790-4808"},"PeriodicalIF":5.3,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11355870","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146082030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
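The adaptive coupling of scattering and texture features described in this abstract can be illustrated with a minimal gated-fusion sketch in NumPy. This is not the authors' cross-token integration module: the gate parameters (`w_gate`, `b_gate`) and the simple sigmoid gating are illustrative assumptions standing in for the learned Bayesian coupling.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(scatter_feat, texture_feat, w_gate, b_gate):
    """Adaptively blend two feature streams with a data-dependent gate.

    scatter_feat, texture_feat: (n_tokens, d) feature matrices.
    w_gate: (2*d, d) projection matrix; b_gate: (d,) bias (hypothetical
    learned parameters in this sketch).
    """
    joint = np.concatenate([scatter_feat, texture_feat], axis=-1)  # (n, 2d)
    gate = sigmoid(joint @ w_gate + b_gate)                        # (n, d), in (0, 1)
    # Convex combination: the gate decides per feature how much each
    # modality contributes to the fused representation.
    return gate * scatter_feat + (1.0 - gate) * texture_feat

# Tiny demonstration with random features and small random gate weights.
rng = np.random.default_rng(0)
n, d = 4, 8
fused = gated_fusion(rng.normal(size=(n, d)), rng.normal(size=(n, d)),
                     rng.normal(size=(2 * d, d)) * 0.1, np.zeros(d))
```

With zero gate weights the gate is exactly 0.5 everywhere, so fusion reduces to a plain average of the two streams; any learned asymmetry in `w_gate` shifts the balance adaptively per token.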
Pub Date : 2026-01-16DOI: 10.1109/JSTARS.2026.3655033
Yixin Zhu;Zhimin Sha;Pengzhi Wei;Shirong Ye;Pengfei Xia;Fangxin Hu
Tropospheric delay, of which water vapor is a major cause, is a significant source of error in global navigation satellite systems. This article presents the gray figure-based zenith tropospheric delay prediction (GFZTD) model, built on convolutional long short-term memory (ConvLSTM) networks and self-attention mechanisms. The model converts 3-D zenith tropospheric delay (ZTD) grid products into multilayer 2-D grayscale images for predictive analysis. Using global forecast system (GFS) and ERA5 data from southeastern China and its adjacent seas in 2023, the GFZTD model is trained through seasonal slicing and stratification by altitude. This approach generates high spatiotemporal resolution 3-D ZTD grid products in near real time. To evaluate the gridded predictions, ERA5 is used as the reference truth, yielding an overall root-mean-square error (RMSE) of 1.35 cm, improvements of 26.5% and 71.0% over ZTD derived from GFS and global pressure and temperature 3 (GPT3), respectively. The model also successfully mitigates the regional extreme prediction errors of GFS in complex terrain environments. In addition, when Vienna mapping function 3 postprocessing products are used to assess ZTD predictions at various stations, the GFZTD model shows an average RMSE of 1.49 cm. This result indicates improvements of 13.1% and 69.4% compared with GFS and GPT3, respectively, underscoring the model's applicability at the station scale.
{"title":"GFZTD: A Multimodal Fusion-Driven 3-D Tropospheric Delay Prediction Model Coupling Self-Attention and ConvLSTM","authors":"Yixin Zhu;Zhimin Sha;Pengzhi Wei;Shirong Ye;Pengfei Xia;Fangxin Hu","doi":"10.1109/JSTARS.2026.3655033","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3655033","url":null,"abstract":"Tropospheric delay, for which water vapor is a major cause, is a significant source of error in the global navigation satellite system. This article presents the gray figure-based zenith tropospheric delay prediction (GFZTD) model, which is built on convolutional long short-term memory networks and self-attention mechanisms. The model converts 3-D zenith tropospheric delay (ZTD) grid products into multilayer 2-D grayscale images for predictive analysis. Utilizing the global forecast system (GFS) and ERA5 data from southeastern China and its adjacent seas in 2023, the GFZTD model is trained through seasonal slicing and stratification by altitude. This approach generates high spatiotemporal resolution ZTD 3-D grid products in near real time. To evaluate the grid prediction results, ERA5 is used as the truth, with an overall root-mean-square error (RMSE) of 1.35 cm, representing improvements of 26.5% and 71.0% over ZTD derived from GFS and global pressure and temperature 3 (GPT3), respectively. The model also successfully mitigates regional extreme prediction errors in complex terrain environments for GFS. In addition, when using Vienna mapping function 3 postprocessing products to assess ZTD prediction values at various stations, the GFZTD model shows an average RMSE of 1.49 cm. 
This result indicates the improvements of 13.1% and 69.4% compared with GFS and GPT3, respectively, underscoring the model's applicability at the station scale.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"6375-6388"},"PeriodicalIF":5.3,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11355947","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146223650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
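The evaluation in this abstract compares gridded ZTD predictions against ERA5 via RMSE and relative improvement over baselines. A minimal sketch of those two metrics (the arrays and the 1.84 cm baseline figure below are hypothetical, not the paper's data):

```python
import numpy as np

def rmse(pred, truth):
    """Root-mean-square error over a ZTD grid (same units as input, e.g., cm)."""
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(truth)) ** 2)))

def improvement_pct(rmse_model, rmse_baseline):
    """Relative RMSE reduction of a model over a baseline, in percent."""
    return 100.0 * (rmse_baseline - rmse_model) / rmse_baseline

# Hypothetical example: a model reaching 1.35 cm RMSE against a
# baseline at 1.84 cm gives an improvement in the mid-20% range.
gain = improvement_pct(1.35, 1.84)
```

The same two functions apply unchanged at the station scale, where per-station RMSE values are averaged before computing the relative gain.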