Small-Vehicle Change Detection in UAV Imagery via Physics-Aware Spatiotemporal Cues and Reproducible Evaluation
Pub Date: 2026-02-23  DOI: 10.1109/JSTARS.2026.3665495
Qiong Ran;Pengfei Bian;Luyang Cai;Jinlin Chen;Huanqian Yan;He Sun
Vehicle detection holds significant research value and practical application potential. However, existing vehicle detection algorithms struggle to meet the demands for fast, accurate detection and tracking of weak and small targets in uncrewed aerial vehicle (UAV) remote sensing imagery. In this work, weak targets refer to vehicle targets with small spatial extent, low contrast with the background, and subtle temporal changes, which are difficult to detect reliably in UAV imagery. To address this issue, this article proposes a bitemporal change detection method based on spatiotemporal features. Specifically, the proposed method processes the data captured by UAVs, applies feature-point matching for image alignment, and performs change detection on the bitemporal data to recognize vehicle targets more precisely. The results demonstrate that the proposed algorithm achieves more competitive detection performance than traditional unsupervised methods such as change vector analysis, differential component analysis, iteratively reweighted multivariate alteration detection, and multivariate alteration detection. Compared with these traditional methods, the proposed method performs better on small and weak targets, excelling in particular at identifying weak targets while reducing false positives.
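A minimal sketch of the pipeline described above: feature-point matching aligns the two UAV frames, and a simple difference-plus-threshold step flags changed pixels as vehicle candidates. The ORB/RANSAC/Otsu choices, the function name, and all parameter values are illustrative assumptions, not the authors' implementation.

```python
# Sketch only: feature-point matching for alignment, then bitemporal differencing.
import cv2
import numpy as np

def detect_vehicle_changes(img_t1, img_t2, min_blob_px=20):
    """img_t1, img_t2: co-located grayscale uint8 UAV frames from two acquisition times."""
    # 1) Feature-point matching between the two epochs.
    orb = cv2.ORB_create(4000)
    kp1, des1 = orb.detectAndCompute(img_t1, None)
    kp2, des2 = orb.detectAndCompute(img_t2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # 2) Warp the first epoch onto the second so the scenes overlap pixel to pixel.
    h, w = img_t2.shape
    aligned_t1 = cv2.warpPerspective(img_t1, H, (w, h))

    # 3) Bitemporal change detection: absolute difference + Otsu threshold.
    diff = cv2.absdiff(aligned_t1, img_t2)
    _, change_mask = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # 4) Keep only blobs large enough to be vehicle candidates.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(change_mask)
    boxes = [stats[i, :4] for i in range(1, n) if stats[i, cv2.CC_STAT_AREA] >= min_blob_px]
    return change_mask, boxes  # boxes as (x, y, width, height)
```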
{"title":"Small-Vehicle Change Detection in UAV Imagery via Physics-Aware Spatiotemporal Cues and Reproducible Evaluation","authors":"Qiong Ran;Pengfei Bian;Luyang Cai;Jinlin Chen;Huanqian Yan;He Sun","doi":"10.1109/JSTARS.2026.3665495","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3665495","url":null,"abstract":"Vehicle detection holds significant research value and practical application potential. However, existing vehicle detection algorithms struggle to meet the demands for fast, accurate detection and tracking of weak and small targets in uncrewed aerial vehicle (UAV) remote sensing imagery. In this work, weak targets refer to vehicle targets with small spatial extent, low contrast with the background, and subtle temporal changes, which are difficult to detect reliably in UAV imagery. To address this issue, this article proposes a bitemporal change detection method based on spatiotemporal features. Specifically, the proposed method involves processing the data captured by UAVs, applying feature point matching for image alignment, and performing change detection on bitemporal data to achieve more precise recognition of vehicle targets. The results demonstrate that the proposed algorithm exhibits more competitive detection performance compared to traditional unsupervised methods, such as change vector analysis, differential component analysis, iteratively weighted multivariate extinction detection, and multivariate detection. Compared to traditional methods, our proposed method achieves superior performance in detecting small and weak targets, particularly excelling in identifying weak targets while reducing the occurrence of false positives.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"8239-8249"},"PeriodicalIF":5.3,"publicationDate":"2026-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11407963","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147362263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Surging Dynamics of the ZhongFeng Glacier, Western Kunlun Mountains
Pub Date: 2026-02-20  DOI: 10.1109/JSTARS.2026.3665525
Yongpeng Gao;Miaomiao Qi;Jianxin Mu;Yang Liu;Chunhai Xu;Pengbin Liang
The Western Kunlun Mountains are a region known for a high concentration of surge-type glaciers in High Mountain Asia and have long been of interest to glaciologists. This article examines the 2021–2023 surge of the eastern branch of ZhongFeng Glacier (ZFG) and reviews the 2003–2004 surge of its western branch, utilising multisource digital elevation models, Landsat MSS/ETM+/OLI, Sentinel-2, and meteorological data. Our findings reveal that surges in both the eastern and western branches of the ZFG were initiated during the summer, with durations of 2 years and 1 year, respectively. Peak flow velocities exceeded 10 m/day, more than 50 times the velocities observed during quiescent periods. During surges, the glacier termini of the eastern and western branches thickened by 60.25 ± 3.07 m and 76.21 ± 8.05 m, respectively, corresponding to ice mass gains of 0.53 ± 0.03 km³ and 0.74 ± 0.08 km³. Based on the timing characteristics of these surges, we conclude that both branches of the ZFG are influenced by hydrological mechanisms. Furthermore, differences in surface and subglacial topography are determined to be the primary factors contributing to the asynchrony of surges between the two branches.
{"title":"Surging Dynamics of the ZhongFeng Glacier, Western Kunlun Mountains","authors":"Yongpeng Gao;Miaomiao Qi;Jianxin Mu;Yang Liu;Chunhai Xu;Pengbin Liang","doi":"10.1109/JSTARS.2026.3665525","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3665525","url":null,"abstract":"The Western Kunlun Mountains are a region known for a high concentration of surge-type glaciers in High Mountain Asia and have long been of interest to glaciologists. This article examines the 2021–2023 surge of the eastern branch of ZhongFeng Glacier (ZFG) and reviews the 2003–2004 surge of its western branch, utilising multisource digital elevation models, Landsat MSS/ETM+/OLI, Sentinel-2, and meteorological data. Our findings reveal that surges in both the eastern and western branches of the ZFG were initiated during the summer, with durations of 2 years and 1 year, respectively. Peak flow velocities exceeded 10 m/day, more than 50 times the velocities observed during quiescent periods. During surges, the glacier termini of the eastern and western branches thickened by 60.25 ± 3.07 m and 76.21 ± 8.05 m, respectively, corresponding to ice mass gains of 0.53 ± 0.03 km<sup>3</sup> and 0.74 ± 0.08 km<sup>3</sup>. Based on the timing characteristics of these surges, we conclude that both branches of the ZFG are influenced by hydrological mechanisms. Furthermore, differences in surface and subglacial topography are determined to be the primary factors contributing to the asynchrony of surges between the two branches.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"8319-8328"},"PeriodicalIF":5.3,"publicationDate":"2026-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11404153","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147362264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CASCADE-3D: A GUI-Driven Framework for Automated 3D Building Model Reconstruction
Pub Date: 2026-02-19  DOI: 10.1109/JSTARS.2026.3663677
Ruli Andaru;Bambang Kun Cahyono;Yulaikhah;Trias Aditya;Purnama Budi Santosa;Calvin Wijaya;Riyas Syamsul;Fairuz Akmal;Hyatma Adikara;Habib Muhammad;Fikri Kurniawan
The generation of rapid and accurate geospatial data and three-dimensional (3D) features is essential for supporting multipurpose land management services. This study presents CAdastre and Spatial map adjustment with spatial Computation for Automatic builDing dEtection and 3D generation (CASCADE-3D), a graphical user interface (GUI) developed for the automated reconstruction of 3D models at Levels of Detail (LOD) 1 and 2. CASCADE-3D integrates advanced deep-learning frameworks to perform building outline detection and point cloud classification. Building outlines are extracted using SAM (the Segment Anything Model), a promptable segmentation system capable of zero-shot generalization to unfamiliar objects and images without requiring additional training. The CASCADE-3D GUI enables interactive digitization, automatic regularization, and refinement of the segmentation mask based on its primary orientation. Each building height model (BHM) is generated by classifying raw point clouds with the DGCNN algorithm to extract ground and building classes. Accurate reconstruction of complex LOD2 models requires precise extraction of roof structures that captures the geometric configuration and orientation of roofs in intricate architectural forms. To achieve this, roof structure detection techniques were applied using each building's aspect. The study utilized point clouds and orthophotos of 1,215 buildings, encompassing diverse architectural forms and land cover types, across several provinces in Indonesia. The CASCADE-3D GUI was evaluated for its accuracy in detecting building outlines and roof structures and in performing LOD1/2 reconstruction. The results indicate that the reconstructed 3D building geometries yielded an RMSE of 0.36 m. Finally, CASCADE-3D reconstructs LOD1 and LOD2 building models and exports them in CityJSON format.
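To illustrate the LOD1 step of such a pipeline, the sketch below extrudes a regularized footprint between a ground and a roof height (for example, estimated from classified ground and building points). The function name and the percentile heuristics are assumptions for illustration; CASCADE-3D's actual reconstruction and CityJSON export are more involved.

```python
# Sketch only: turn a footprint and two height levels into the faces of an LOD1 prism.
import numpy as np

def extrude_lod1(footprint_xy, ground_pts_z, building_pts_z):
    """footprint_xy: (N, 2) counter-clockwise building outline;
    ground_pts_z / building_pts_z: elevations of classified ground / building points."""
    z_ground = float(np.percentile(ground_pts_z, 50))   # median terrain height
    z_roof = float(np.percentile(building_pts_z, 95))   # robust roof height
    floor = [(x, y, z_ground) for x, y in footprint_xy]
    roof = [(x, y, z_roof) for x, y in footprint_xy]

    faces = [floor[::-1], roof]                          # bottom (flipped) and top faces
    n = len(footprint_xy)
    for i in range(n):                                   # one quad wall per outline edge
        j = (i + 1) % n
        faces.append([floor[i], floor[j], roof[j], roof[i]])
    return faces, z_roof - z_ground                      # prism faces, building height

# Example: a 10 m x 6 m rectangular footprint
faces, height = extrude_lod1(np.array([[0, 0], [10, 0], [10, 6], [0, 6]], float),
                             ground_pts_z=np.array([101.0, 100.8, 101.1]),
                             building_pts_z=np.array([109.5, 110.2, 110.0]))
```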
{"title":"CASCADE-3D: A GUI-Driven Framework for Automated 3D Building Model Reconstruction","authors":"Ruli Andaru;Bambang Kun Cahyono;Yulaikhah;Trias Aditya;Purnama Budi Santosa;Calvin Wijaya;Riyas Syamsul;Fairuz Akmal;Hyatma Adikara;Habib Muhammad;Fikri Kurniawan","doi":"10.1109/JSTARS.2026.3663677","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3663677","url":null,"abstract":"The generation of rapid and accurate geospatial data and three-dimensional (3D) features is essential for supporting multipurpose land management services. This study presents CAdastre and Spatial map adjustment with spatial Computation for Automatic builDing dEtection and 3D generation (CASCADE-3D), a graphical user interface (GUI) developed for the automated reconstruction of 3D models at Levels of Detail (LOD) 1 and 2. CASCADE-3D integrates advanced deep-learning frameworks to perform building outline detection and point cloud classification. Building outlines are extracted using SAM, a promptable segmentation system capable of zero-shot generalization to unfamiliar objects and images without requiring additional training. The CASCADE-3D GUI enables interactive digitization, automatic regularization, and refinement of the segmentation mask based on its primary orientation. Each building height model (BHM) is generated by classifying raw point clouds with the DGCNN algorithm to extract ground and building classes. Accurate reconstruction of complex LOD2 models requires precise extraction of roof structures that captures the geometric configuration and orientation of roofs in intricate architectural forms. To achieve this, roof structure detection techniques were applied using each building’s aspect. The study utilized point clouds and orthophotos of 1,215 buildings, encompassing diverse architectural forms and land cover types, across several provinces in Indonesia. The CASCADE-3D GUI was evaluated for its accuracy in detecting building outlines and roof structures, and performing LOD1/2 reconstruction. The results indicate that the reconstructed 3D building geometries yielded an RMSE of 0.36 m. Subsequently, CASCADE-3D reconstructs LOD1 and LOD2 building models and exports them in CityJSON format.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"7419-7442"},"PeriodicalIF":5.3,"publicationDate":"2026-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11400619","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147299531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fourier Decomposition-Based Phase Processing Technique: A Novel Approach for 1-D Phase Unwrapping
Pub Date: 2026-02-19  DOI: 10.1109/JSTARS.2026.3666269
Chenghao Lu;Donglin Li;Hengyi Jia;Taoli Yang
The performance of a coherence analysis system (CAS) is critically dependent on phase accuracy. Most advanced phase unwrapping (PU) algorithms are designed for 2-D problems, but data from various interferometric systems are typically acquired as 1-D time series. The 1-D PU problem is more challenging because it suffers more severely from noise and the available adjacent points are severely restricted compared with the 2-D case. To address the prevalent challenge of 1-D phase noise with a skewed, nonzero-mean distribution, this article introduces a novel Fourier decomposition-based phase-processing technique (FDPT). The FDPT procedure begins with a fast Fourier transform (FFT) of the original noisy phase signal. The frequency spectrum is then divided into subbands, which undergo a flatness evaluation to identify and extract the dominant frequency components. The inverse FFT is applied to each component, converting it back to the time domain for individual processing with an adaptive nonlocal filtering algorithm. Finally, the subphase components are coherently summed and a PU method is applied to reconstruct the phase. Simulation results demonstrate the superiority of the proposed FDPT over conventional methods, confirming improvements in waveform similarity and a reduction in root-mean-square error.
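The following NumPy sketch mirrors the described flow: FFT of the noisy phase, subband splitting, a spectral-flatness test to keep dominant components, inverse FFT per retained subband, per-component smoothing, coherent summation, and 1-D unwrapping. The flatness threshold and the moving-average filter are placeholders for the paper's flatness evaluation and adaptive nonlocal filtering, which are not detailed in the abstract.

```python
# Sketch only: FDPT-like decomposition, denoising, and 1-D unwrapping.
import numpy as np

def fdpt_like_unwrap(wrapped_phase, n_subbands=8, flatness_thresh=0.5, win=9):
    spectrum = np.fft.rfft(wrapped_phase)
    bands = np.array_split(np.arange(spectrum.size), n_subbands)

    recovered = np.zeros_like(wrapped_phase)
    for idx in bands:
        power = np.abs(spectrum[idx]) ** 2 + 1e-12
        flatness = np.exp(np.mean(np.log(power))) / np.mean(power)  # spectral flatness in [0, 1]
        if flatness > flatness_thresh:       # flat, noise-like band: discard
            continue
        sub = np.zeros_like(spectrum)
        sub[idx] = spectrum[idx]             # keep only this dominant subband
        component = np.fft.irfft(sub, n=wrapped_phase.size)
        kernel = np.ones(win) / win          # stand-in for adaptive nonlocal filtering
        recovered += np.convolve(component, kernel, mode="same")

    return np.unwrap(recovered)              # 1-D phase unwrapping of the denoised signal

# Example: noisy wrapped linear phase ramp
t = np.linspace(0, 1, 512)
wrapped = np.angle(np.exp(1j * (40 * t + 0.3 * np.random.randn(t.size))))
estimate = fdpt_like_unwrap(wrapped)
```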
{"title":"Fourier Decomposition-Based Phase Processing Technique: A Novel Approach for 1-D Phase Unwrapping","authors":"Chenghao Lu;Donglin Li;Hengyi Jia;Taoli Yang","doi":"10.1109/JSTARS.2026.3666269","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3666269","url":null,"abstract":"The performance of coherence analysis system (CAS) is critically dependent on phase accuracy. Most advanced phase unwrapping (PU) algorithms are designed for 2-D problem, but data from various interferometric systems are typically acquired as 1-D time series. The 1-D PU problem is more challenging since it suffers more seriously from noise and the available adjacent points are severely restricted compared to the 2-D case. To address the prevalent challenge of 1-D phase noise with a skewed nonzero-mean distribution, this article introduces a novel Fourier decomposition-based phase-processing technique (FDPT). The FDPT procedure begins with a fast Fourier transform (FFT) of the original noisy phase signal. The frequency spectrums of them are then divided into subbands, which undergo a flatness evaluation to identify and extract the dominant frequency components. The inverse FFT is applied to each phase component, converting them back for individual processing using the adaptive nonlocal filtering algorithm. Finally, the subphase components are coherently summed, and followed by PU methods to reconstruct the phase. Simulation results demonstrate the superiority of the proposed FDPT over conventional methods, confirming improvements in waveform similarity and a reduction in root-mean-square error.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"8679-8687"},"PeriodicalIF":5.3,"publicationDate":"2026-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11399896","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147440579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Land Surface Temperature Trends Over Central and Southern Europe: Derivation and Analyses of Long-Term (1986–2018) Monthly Maxima
Pub Date: 2026-02-18  DOI: 10.1109/JSTARS.2026.3666131
Christina Eisfelder;Philipp Reiners;Claudia Kuenzer
Monitoring long-term land surface temperature (LST) time series and analyzing their anomalies and trends are essential for understanding spatial patterns of global warming, particularly in Europe, the fastest-warming continent. In this study, we derived and analyzed monthly maximum LST trends over central and southern Europe at 1 km² resolution from advanced very high resolution radiometer-based TIMELINE LST data for the period 1986–2018. We found that almost 40% of the study area exhibited statistically significant (p < 0.1) LST trends. Parts of the study area show trend magnitudes above 0.5 K/decade and thus contribute significantly to the overall surface warming. In contrast, forested areas showed lower LST trend magnitudes (<0.5 K/decade) and a smaller share of areas with significant trends. With respect to elevation, our results revealed the lowest LST trends below 50 m and at mid-elevation ranges (750–1250 m). Both the magnitude of LST trends and the percentage of area with significant trends rise towards both lower and higher altitudes. These results help to understand current warming patterns and demonstrate that long-term, high-resolution LST datasets can be used to study land-climate interactions in depth.
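As a generic recipe for the kind of per-pixel trend analysis described above (not the TIMELINE processing chain itself), the sketch below estimates a Theil-Sen slope in K/decade for each pixel and flags significance with Kendall's tau at p < 0.1. The function name and the stack layout are assumptions.

```python
# Sketch only: per-pixel Theil-Sen trend with a Kendall-tau significance test.
import numpy as np
from scipy.stats import theilslopes, kendalltau

def lst_trend_per_pixel(lst_stack, years, p_max=0.1):
    """lst_stack: (T, H, W) monthly-maximum LST composites for one calendar month;
    years: (T,) acquisition years. Returns K/decade trends and a significance mask."""
    T, H, W = lst_stack.shape
    trend = np.full((H, W), np.nan)
    significant = np.zeros((H, W), dtype=bool)
    for i in range(H):
        for j in range(W):
            y = lst_stack[:, i, j]
            if np.isnan(y).any():
                continue
            slope, _, _, _ = theilslopes(y, years)   # K per year
            _, p = kendalltau(years, y)
            trend[i, j] = slope * 10.0               # convert to K per decade
            significant[i, j] = p < p_max
    return trend, significant
```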
{"title":"Land Surface Temperature Trends Over Central and Southern Europe: Derivation and Analyses of Long-Term (1986–2018) Monthly Maxima","authors":"Christina Eisfelder;Philipp Reiners;Claudia Kuenzer","doi":"10.1109/JSTARS.2026.3666131","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3666131","url":null,"abstract":"Monitoring long-term land surface temperature (LST) time series and analyzing their anomalies and trends are essential for understanding spatial patterns of global warming, particularly in Europe—the fastest-warming continent. In this study, we derived and analyzed monthly maximum LST trends over central and southern Europe at 1 km<sup>2</sup> resolution from advanced very high resolution radiometer-based TIMELINE LST data for the period 1986–2018. We found that almost 40% of the study area exhibited statistically significant (<italic>p</i><0.1)>0.5 K/decade and thus significantly contribute to the overall surface warming. In contrast, forested areas showed lower LST trend magnitudes (<0.5 K/decade) and a smaller share of areas with significant trends. With respect to elevation, our results revealed the lowest LST trends below 50 m and at mid-elevation ranges (750–1250 m). Both the magnitude of LST trends and the percentage area with significant trends rise towards both lower and higher altitudes. These results help to understand current warming patterns and demonstrate that long-term, high-resolution LST datasets can be used to study land-climate interactions in depth.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"8108-8125"},"PeriodicalIF":5.3,"publicationDate":"2026-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11398115","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147362295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Adaptive Regular Window Optimization-Based Radiometric Calibration Model for Airborne SAR
Pub Date: 2026-02-17  DOI: 10.1109/JSTARS.2026.3665843
Jiafeng Wang;Hao Li;Xuedong Yao;Jianhua Li
Radiometric calibration is critical for ensuring the quantitative reliability of synthetic aperture radar (SAR) sensors across multiple applications. However, traditional calibration models often struggle with adaptability when corner reflectors (CRs) deviate from ideal cross-shaped responses and instead appear as patch-like bright spots, which reduces calibration accuracy. This article proposes a core response energy extraction model for CRs based on adaptive regular window (ARW) optimization, leading to an improved SAR radiometric calibration model referred to as ARW-RC. The ARW-RC significantly improves the completeness of core energy extraction, background clutter suppression, and adaptability. Core energy extraction of CRs from multiband SAR images acquired over Hainan and Rizhao demonstrates that the proposed model effectively captures the core region boundaries, confirming its robustness and adaptability across diverse imaging scenarios. Specifically, compared with traditional calibration models, the ARW-RC achieved a standard deviation of 0.55 dB for the CR response energy in the X-band SAR image. After radiometric calibration, the relative accuracy improved to 0.70 dB, representing more than a twofold improvement in radiometric accuracy over traditional models, and the absolute accuracy improved to 0.50 dB, an improvement of 0.69 dB. For the S-band SAR image, the ARW-RC achieved a standard deviation of 1.37 dB in CR response energy, with relative and absolute accuracies of 1.52 dB and 1.14 dB, respectively. These results confirm that the ARW-RC model offers high accuracy and broad applicability, providing an effective solution for SAR sensor calibration and multisource data fusion.
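For context, the sketch below shows the classic integral method that such calibration builds on: integrate the CR response energy in a window around the reflector, subtract background clutter estimated from a surrounding ring, and relate the result to the reflector's theoretical radar cross section. The fixed window and ring sizes, the function name, and the example RCS value are illustrative assumptions; the ARW-RC model replaces the fixed window with an adaptively optimized one.

```python
# Sketch only: integral-method CR energy extraction and calibration constant.
import numpy as np

def cr_calibration_constant(intensity, cr_rc, half_win=8, ring=4, sigma_theoretical_m2=3000.0):
    """intensity: 2-D SAR intensity image (linear power); cr_rc: (row, col) of the CR peak."""
    r, c = cr_rc
    core = intensity[r - half_win:r + half_win + 1, c - half_win:c + half_win + 1]
    outer = intensity[r - half_win - ring:r + half_win + ring + 1,
                      c - half_win - ring:c + half_win + ring + 1]
    clutter_per_px = (outer.sum() - core.sum()) / (outer.size - core.size)

    energy = core.sum() - clutter_per_px * core.size   # clutter-corrected response energy
    k_linear = energy / sigma_theoretical_m2           # calibration constant (linear scale)
    return 10.0 * np.log10(k_linear)                   # calibration constant in dB
```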
{"title":"An Adaptive Regular Window Optimization-Based Radiometric Calibration Model for Airborne SAR","authors":"Jiafeng Wang;Hao Li;Xuedong Yao;Jianhua Li","doi":"10.1109/JSTARS.2026.3665843","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3665843","url":null,"abstract":"The radiometric calibration is critical for ensuring the quantitative reliability of synthetic aperture radar (SAR) sensor across multiple applications. However, traditional calibration models often struggle with adaptability when corner reflectors (CRs) deviate from ideal cross-shaped responses and appear instead as patch-like bright spots, thereby reducing calibration accuracy. This article proposed a core response energy extraction model for CRs based on adaptive regular window (ARW) optimization, leading to the development of an improved SAR radiometric calibration model, referred as ARW-RC. The ARW-RC significantly improves the completeness of core energy extraction, background clutter suppression, and adaptability. The core energy extraction of CRs from multiband SAR images in Hainan and Rizhao demonstrates that the proposed model effectively captures the core region boundaries, proving its robustness and adaptability across diverse imaging scenarios. Specifically, compared with traditional calibration models, the ARW-RC achieved a standard deviation of 0.55 dB for the CRs response energy in X-band SAR image. After radiometric calibration, the relative accuracy improved to 0.70 dB, representing more than a twofold improvement in radiometric accuracy over traditional models. In addition, the absolute accuracy improved to 0.50 dB, an improvement of 0.69 dB. For the S-band SAR image, the ARW-RC achieved a standard deviation of 1.37 dB in CRs response energy. The relative and absolute accuracies were 1.52 dB and 1.14 dB, respectively. These confirm that the ARW-RC model offers high accuracy and broad applicability, providing an effective solution for SAR sensors calibration and multisource data fusion.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"8329-8345"},"PeriodicalIF":5.3,"publicationDate":"2026-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11397644","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147440516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Contrastive Prototype Clustering for Multimodal Remote Sensing Data Based on Spectral–Spatial Cross Mamba
Pub Date: 2026-02-17  DOI: 10.1109/JSTARS.2026.3665649
Li Lv;Zhenyang Xie;Hongmin Gao;Shufang Xu;Zhenzhen Li;Haihua Xie;Dongxiao Liu
With the increasing diversity of remote sensing (RS) data sources, joint clustering of multimodal RS data demonstrates tremendous potential in Earth observation applications by aggregating multisource information without relying on labeled data. Although significant progress has been made in multiview subspace clustering, existing algorithms still face two limitations: inadequate exploration of complex cross-modal interactions and long-range dependencies, as well as limited capability in handling large-scale multimodal RS datasets. To address these challenges, this article proposes contrastive prototype clustering for multimodal RS data based on spectral–spatial cross Mamba (CPCM). The proposed method encompasses two core innovations. First, we design a multimodal spectral–spatial cross Mamba (S2CM) that performs global contextual modeling with linear complexity in both spectral and spatial dimensions through dual-path Mamba blocks, while employing cross-attention mechanisms to achieve deep semantic fusion of multidimensional features. Second, an end-to-end joint optimization framework is developed, which integrates contrastive learning with clustering learning through a unified objective function. This framework achieves collaborative convergence of feature learning and cluster refinement through an online clustering mechanism that utilizes prototype learning, making it scalable for large-scale multimodal datasets. The effectiveness of the proposed CPCM method is evaluated on three real-world RS datasets: Trento, MUUFL, and Augsburg. Experimental results demonstrate that CPCM achieves overall clustering accuracies of 94.69%, 69.29%, and 83.21% on these datasets, respectively, indicating its superior performance and strong capability in handling large-scale datasets.
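As a rough illustration of combining contrastive learning with prototype-based online clustering (the general idea behind the joint optimization described above, not CPCM's Mamba-based architecture or its exact loss), the sketch below scores two augmented views against learnable cluster prototypes and applies a swapped-prediction loss; the assignment-balancing step used in practice (for example, Sinkhorn normalization) is omitted for brevity, and all names and values are assumptions.

```python
# Sketch only: prototype-based contrastive clustering head with a swapped-prediction loss.
import torch
import torch.nn.functional as F

class PrototypeContrastiveHead(torch.nn.Module):
    def __init__(self, feat_dim=128, num_clusters=10, temperature=0.1):
        super().__init__()
        self.prototypes = torch.nn.Parameter(torch.randn(num_clusters, feat_dim))
        self.temperature = temperature

    def forward(self, z1, z2):
        # z1, z2: L2-normalized embeddings of two augmented views, shape (B, feat_dim)
        protos = F.normalize(self.prototypes, dim=1)
        p1 = (z1 @ protos.t()) / self.temperature     # cluster logits for view 1
        p2 = (z2 @ protos.t()) / self.temperature
        q1 = F.softmax(p1.detach(), dim=1)            # soft assignments used as targets
        q2 = F.softmax(p2.detach(), dim=1)
        # swapped prediction: each view predicts the other view's cluster assignment
        loss = -0.5 * ((q2 * F.log_softmax(p1, dim=1)).sum(1).mean()
                       + (q1 * F.log_softmax(p2, dim=1)).sum(1).mean())
        return loss

# Usage: embeddings of two views from any backbone, L2-normalized
head = PrototypeContrastiveHead(feat_dim=128, num_clusters=10)
z1 = F.normalize(torch.randn(16, 128), dim=1)
z2 = F.normalize(torch.randn(16, 128), dim=1)
loss = head(z1, z2)
loss.backward()
```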
{"title":"Contrastive Prototype Clustering for Multimodal Remote Sensing Data Based on Spectral–Spatial Cross Mamba","authors":"Li Lv;Zhenyang Xie;Hongmin Gao;Shufang Xu;Zhenzhen Li;Haihua Xie;Dongxiao Liu","doi":"10.1109/JSTARS.2026.3665649","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3665649","url":null,"abstract":"With the increasing diversity of remote sensing (RS) data sources, joint clustering of multimodal RS data demonstrates tremendous potential in Earth observation applications by aggregating multisource information without relying on labeled data. Although significant progress has been made in multiview subspace clustering, existing algorithms still face two limitations: inadequate exploration of complex cross-modal interactions and long-range dependencies, as well as limited capability in handling large-scale multimodal RS datasets. To address these challenges, this article proposes contrastive prototype clustering for multimodal RS data based on spectral–spatial cross Mamba (CPCM). The proposed method encompasses two core innovations. First, we design a multimodal spectral–spatial cross Mamba (S2CM) that performs global contextual modeling with linear complexity in both spectral and spatial dimensions through dual-path Mamba blocks, while employing cross-attention mechanisms to achieve deep semantic fusion of multidimensional features. Second, an end-to-end joint optimization framework is developed, which integrates contrastive learning with clustering learning through a unified objective function. This framework achieves collaborative convergence of feature learning and cluster refinement through an online clustering mechanism that utilizes prototype learning, making it scalable for large-scale multimodal datasets. The effectiveness of the proposed CPCM method is evaluated on three real-world RS datasets: Trento, MUUFL, and Augsburg. Experimental results demonstrate that CPCM achieves overall clustering accuracies of 94.69%, 69.29%, and 83.21% on these datasets, respectively, indicating its superior performance and strong capability in handling large-scale datasets.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"8056-8070"},"PeriodicalIF":5.3,"publicationDate":"2026-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11397521","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147362298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SBSNet: Spatial–Spectral Background–Target Separation Network for Hyperspectral Target Detection
Pub Date: 2026-02-17  DOI: 10.1109/JSTARS.2026.3665707
Jianlin Xiang;Yanshan Li;Linhui Dai;Ruo Qi;Haojin Tang;Li Zhang;Kunhua Zhang;Weixin Xie
Hyperspectral target detection (HTD) aims to identify target locations in a hyperspectral image (HSI) using limited prior target spectra. Existing methods often use contrastive learning to construct target and background sample sets from the unlabeled HSI and compare their similarity in feature space to enhance background–target separability. However, they often fail to ensure high-purity sample sets, which limits their ability to separate target and background features effectively. Therefore, we propose a spatial–spectral background–target separation network (SBSNet). SBSNet leverages prior target spectra to construct high-purity target and background sets from the unlabeled HSI and integrates them into a multiscale spatial–spectral feature learning framework to optimize the feature space for more discriminative target detection. The primary contributions of this article are threefold. First, we propose a local spatial–spectral feature fusion module to extract spatial–spectral features from the raw HSI, together with a spatial–spectral pseudolabel purification strategy to obtain pure target and background pixel sets from the unlabeled HSI; the resulting pseudolabel map is introduced as prior information to supervise the training process. Second, we design a highly robust multiscale spatial–spectral autoencoder specifically for HTD, which is used for sample generation during data preparation and for feature extraction during training. Third, we propose a clustered adaptive focus training strategy that synergistically optimizes the feature space through clustered sampling and an adaptive exponential weighted loss. Finally, experimental results demonstrate that the proposed SBSNet achieves superior detection performance on five public HSI datasets in various scenarios compared with state-of-the-art HTD methods.
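A spectral-angle sketch of the core idea behind building coarse target and background sample sets from a prior target spectrum: pixels most similar to the prior become target candidates and the least similar become background. The percentile cut-offs and the function name are assumptions, and SBSNet's purification additionally exploits spatial context and pseudolabel refinement.

```python
# Sketch only: coarse target/background sets from the spectral angle to a prior spectrum.
import numpy as np

def coarse_target_background_sets(hsi, prior_spectrum, target_pct=0.5, background_pct=60.0):
    """hsi: (H, W, B) hyperspectral cube; prior_spectrum: (B,) prior target signature."""
    H, W, B = hsi.shape
    pixels = hsi.reshape(-1, B).astype(np.float64)
    cos = (pixels @ prior_spectrum) / (
        np.linalg.norm(pixels, axis=1) * np.linalg.norm(prior_spectrum) + 1e-12)
    angle = np.arccos(np.clip(cos, -1.0, 1.0))        # spectral angle to the prior target

    target_idx = np.flatnonzero(angle <= np.percentile(angle, target_pct))
    background_idx = np.flatnonzero(angle >= np.percentile(angle, background_pct))
    return target_idx, background_idx                  # flat indices into the H*W pixels
```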
{"title":"SBSNet: Spatial–Spectral Background–Target Separation Network for Hyperspectral Target Detection","authors":"Jianlin Xiang;Yanshan Li;Linhui Dai;Ruo Qi;Haojin Tang;Li Zhang;Kunhua Zhang;Weixin Xie","doi":"10.1109/JSTARS.2026.3665707","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3665707","url":null,"abstract":"Hyperspectral target detection (HTD) aims to identify target locations in a hyperspectral image (HSI) using limited prior target spectra. Existing methods often use contrastive learning to construct target and background sample sets from unlabeled HSI and compare their similarity in feature space to enhance background–target separability. However, they often fail to ensure high-purity sample sets, limiting their ability to effectively separate target and background features. Therefore, we propose a spatial–spectral background–target separation network (SBSNet). The SBSNet leverages prior target spectra to construct high-purity target and background sets from the unlabeled HSI and integrates them into a multiscale spatial–spectral feature learning framework to optimize the feature space for more discriminative target detection. Specifically, the primary contributions of this article are threefold. First, we propose a local spatial–spectral feature fusion module to extract spatial–spectral feature from the raw HSI and proposes a spatial–spectral pseudolabel purification strategy to obtain pure target and background pixel sets from unlabeled HSI. In addition, we introduce the pseudolabel map as prior information to supervise the training process. Second, we design a highly robust multiscale spatial–spectral autoencoder specifically for HTD, which is used for sample generation during the data preparation and for feature extraction during the training. Third, we propose a clustered adaptive focus training strategy, which synergistically optimizes the feature space through clustered sampling and adaptive exponential weighted loss. Finally, experimental results demonstrate that the proposed SBSNet achieves superior detection performance on five public HSI datasets in various scenarios, compared with state-of-the-art HTD methods.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"8648-8663"},"PeriodicalIF":5.3,"publicationDate":"2026-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11397668","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147440496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
STAR: Spatial and Temporal Context-Aware Network for Resident Space Object Detection
Pub Date: 2026-02-17  DOI: 10.1109/JSTARS.2026.3665808
Bowen Gan;Wei Liang;Zhaodong Niu
Most resident space object (RSO) detection methods are designed around the observation mode of the telescope, so the core of RSO detection becomes point-like or streak-like object detection. These RSOs are typically small in size and weak in energy, and noise and stars further interfere with detection. Consequently, existing methods adopt complex processing pipelines or purpose-designed neural networks to achieve detection. However, for dim RSOs, point- or streak-like features are not prominent, making it difficult for these methods to maintain stable detection performance. In this article, we propose STAR (Spatial and Temporal Context-Aware Network for RSO Detection), which uses spatial and temporal context information as supplementary cues to enhance detection performance. STAR introduces a Spatial Context Extraction module that fuses small- and large-kernel convolutions to capture fine morphological features and surrounding context, respectively, and a Temporal Context Extraction module that employs deformable attention to adaptively model motion patterns across frames. Experiments on a self-collected dataset composed entirely of real images show that STAR exhibits excellent detection capability. On the public SpotGEO dataset, STAR achieves a 94.93% F1 score and a mean squared error of 29646.49, surpassing the champion of the SpotGEO challenge and outperforming many current deep-learning-based methods.
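A minimal PyTorch sketch of the dual-kernel idea behind the Spatial Context Extraction module: a small 3×3 convolution captures fine morphology while a large depthwise convolution gathers surrounding context, and a 1×1 convolution fuses both. Kernel sizes, channel widths, the activation, and the class name are assumptions, not the paper's exact design.

```python
# Sketch only: fusing small- and large-kernel convolutions for spatial context.
import torch
import torch.nn as nn

class SpatialContextBlock(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.small = nn.Conv2d(channels, channels, kernel_size=3, padding=1)   # fine morphology
        self.large = nn.Conv2d(channels, channels, kernel_size=11, padding=5,
                               groups=channels)                                # depthwise, wide context
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.act = nn.GELU()

    def forward(self, x):
        f_small = self.act(self.small(x))
        f_large = self.act(self.large(x))
        return self.fuse(torch.cat([f_small, f_large], dim=1))

# Example: a 64x64 feature map with 32 channels keeps its shape
y = SpatialContextBlock(32)(torch.randn(1, 32, 64, 64))
```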
{"title":"STAR: Spatial and Temporal Context-Aware Network for Resident Space Object Detection","authors":"Bowen Gan;Wei Liang;Zhaodong Niu","doi":"10.1109/JSTARS.2026.3665808","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3665808","url":null,"abstract":"Most resident space object (RSO) detection methods are designed based on the observation mode of the telescope, focusing the core of RSO detection on point-like or streak-like object detection. These RSOs are typically small in size and weak in energy, while noise and stars will affect the detection. Therefore, these methods adopt complex processing pipelines or design neural networks to achieve detection. However, for dim RSOs, the features about point or streak are not prominent, making it difficult for these methods to maintain stable detection performance. In this article, we propose a network called STAR (Spatial and Temporal Context-Aware Network for RSO Detection), which attempts to use spatial and temporal context information as supplementary cues to enhance detection performance. STAR introduces Spatial Context Extraction module that fuses small and large kernel convolutions to capture fine morphological feature and surrounding context information, respectively, and Temporal Context Extraction module that employs deformable attention to adaptively model motion patterns across frames. Experiments on a self-collected dataset composed entirely of real images show that STAR exhibits excellent detection capability. On the public dataset SpotGEO, STAR achieves 94.93% F1 score and 29646.49 mean squared error, surpassing the champion of the SpotGEO challenge and outperforming many current deep-learning-based methods.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"8428-8440"},"PeriodicalIF":5.3,"publicationDate":"2026-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11397566","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147440504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
OWT-DNet: A Timely and High-Accuracy End-to-End Offshore Wind Turbine Detection Network Based on Multimodal Remote Sensing Data
Pub Date: 2026-02-17  DOI: 10.1109/JSTARS.2026.3665662
Shuai Zhang;Fangxiong Wang;Wubiao Huang;Fei Deng
With the rapid development of the offshore wind power industry in recent years, its effects on local social, economic, and ecological environments have attracted widespread attention. A timely understanding of the development status of offshore wind power, and specifically of offshore wind turbines (OWTs), is therefore crucial for the healthy and sustainable development of the industry. However, existing OWT detection methods often struggle to achieve timely, high-precision, end-to-end detection of OWTs. To address this, this study proposes an OWT detection network (OWT-DNet) based on multimodal remote sensing data. The network integrates Sentinel-1 synthetic aperture radar imagery and Sentinel-2 optical imagery, effectively addressing the insufficient semantic information inherent in single-modal data for OWT detection. Experiments across five global test regions demonstrate that OWT-DNet achieves detection accuracy, recall, and comprehensive evaluation metrics that exceed 99.9%. Furthermore, OWT-DNet demonstrates outstanding detection performance under complex weather conditions. Comparative and ablation experiments validate the network's superior capability in OWT detection tasks. Overall, timely, high-precision, end-to-end OWT detection is achieved for the first time on the basis of multimodal remote sensing data, and an inaugural multimodal OWT sample dataset is established, laying a solid foundation for future OWT detection research.
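To illustrate the kind of multimodal fusion the abstract describes, the sketch below feeds Sentinel-1 backscatter and Sentinel-2 reflectance through separate convolutional branches and concatenates their features; the branch widths, band choices, fusion strategy, and class name are assumptions rather than OWT-DNet's design.

```python
# Sketch only: two-branch fusion of SAR and optical inputs into a shared feature map.
import torch
import torch.nn as nn

class TwoBranchFusionEncoder(nn.Module):
    def __init__(self, sar_channels=2, opt_channels=4, out_channels=64):
        super().__init__()
        self.sar_branch = nn.Sequential(
            nn.Conv2d(sar_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.opt_branch = nn.Sequential(
            nn.Conv2d(opt_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.fuse = nn.Conv2d(64, out_channels, kernel_size=1)

    def forward(self, sar, opt):
        # sar: (B, 2, H, W), e.g., VV/VH backscatter; opt: (B, 4, H, W), e.g., B2/B3/B4/B8 reflectance
        return self.fuse(torch.cat([self.sar_branch(sar), self.opt_branch(opt)], dim=1))

# Example: co-registered 128x128 patches from both sensors
feat = TwoBranchFusionEncoder()(torch.randn(1, 2, 128, 128), torch.randn(1, 4, 128, 128))
```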
{"title":"OWT-DNet: A Timely and High-Accuracy End-to-End Offshore Wind Turbine Detection Network Based on Multimodal Remote Sensing Data","authors":"Shuai Zhang;Fangxiong Wang;Wubiao Huang;Fei Deng","doi":"10.1109/JSTARS.2026.3665662","DOIUrl":"https://doi.org/10.1109/JSTARS.2026.3665662","url":null,"abstract":"With the rapid development of the offshore wind power industry in recent years, its effects on local social, economic, and ecological environments have attracted widespread attention. Therefore, a timely understanding of the development status of offshore wind power, specifically, offshore wind turbines (OWTs) is crucial for the healthy and sustainable development of the offshore wind power industry. However, existing OWT detection methods often struggle to achieve timely, high-precision end-to-end detection of OWTs. To address this, in this study, an OWT detection network (OWT-DNet) based on multimodal remote sensing data is proposed. This network integrates Sentinel-1 synthetic aperture radar imagery and Sentinel-2 optical imagery, effectively addressing the insufficient semantic information inherent in single-modal data for OWT detection. Experiments across five global test regions demonstrate that OWT-DNet achieves detection accuracy, recall, and comprehensive evaluation metrics that exceed 99.9%. Furthermore, OWT-DNet demonstrates outstanding detection performance under complex weather conditions. Comparative and ablation experiments validate the network's superior capability in OWT detection tasks. Overall, timely, high-precision end-to-end OWT detection is achieved for the first time on the basis of multimodal remote sensing data. Furthermore, an inaugural multimodal OWT sample dataset is established, laying a solid foundation for future OWT detection research.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"7991-8004"},"PeriodicalIF":5.3,"publicationDate":"2026-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11397679","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147362374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}