In the realm of artificial intelligence, the emergence of foundation models, backed by high computing capabilities and extensive data, has been revolutionary. The segment anything model (SAM), built on the vision transformer (ViT) with millions of parameters and trained on the large-scale dataset SA-1B, excels in various segmentation scenarios thanks to its rich semantic information and strong generalization ability. Such achievements of visual foundation models stimulate continued research on specific downstream tasks in computer vision. The classwise-SAM-adapter (CWSAM) is designed to adapt the high-performing SAM for landcover classification on spaceborne synthetic aperture radar (SAR) images. The proposed CWSAM freezes most of SAM's parameters and incorporates lightweight adapters for parameter-efficient fine-tuning, and a classwise mask decoder is designed to perform the semantic segmentation task. This adapt-tuning method enables efficient landcover classification of SAR images, balancing accuracy with computational demand. In addition, a task-specific input module injects low-frequency information of SAR images via MLP-based layers to improve model performance. In extensive experiments against conventional state-of-the-art semantic segmentation algorithms, CWSAM shows enhanced performance with fewer computing resources, highlighting the potential of leveraging foundation models such as SAM for specific downstream tasks in the SAR domain.
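The adapter pattern described here is straightforward to illustrate. Below is a minimal PyTorch sketch of parameter-efficient fine-tuning: freeze a pretrained block and train only a small bottleneck adapter. The `Adapter` design, the dimensions, and the stand-in backbone block are illustrative assumptions, not CWSAM's actual implementation.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Lightweight bottleneck adapter: down-project, nonlinearity, up-project."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        # Residual connection keeps the frozen backbone's features intact.
        return x + self.up(self.act(self.down(x)))

class AdaptedBlock(nn.Module):
    """Wraps a frozen transformer block with a trainable adapter."""
    def __init__(self, block, dim):
        super().__init__()
        self.block = block
        self.adapter = Adapter(dim)

    def forward(self, x):
        return self.adapter(self.block(x))

# Toy stand-in for a pretrained ViT block; in practice this would be SAM's encoder.
dim = 256
backbone_block = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
model = AdaptedBlock(backbone_block, dim)

# Freeze everything except the adapter: only a small fraction of parameters train.
for p in model.block.parameters():
    p.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable}/{total}")
```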
{"title":"ClassWise-SAM-Adapter: Parameter-Efficient Fine-Tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation","authors":"Xinyang Pu;Hecheng Jia;Linghao Zheng;Feng Wang;Feng Xu","doi":"10.1109/JSTARS.2025.3532690","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3532690","url":null,"abstract":"In the realm of artificial intelligence, the emergence of foundation models, backed by high computing capabilities and extensive data, has been revolutionary. A segment anything model (SAM), built on the vision transformer (ViT) model with millions of parameters and trained on its corresponding large-scale dataset SA-1B, excels in various segmentation scenarios relying on its significance of semantic information and generalization ability. Such achievement of visual foundation model stimulates continuous researches on specific downstream tasks in computer vision. The classwise-SAM-adapter (CWSAM) is designed to adapt the high-performing SAM for landcover classification on space-borne synthetic aperture radar (SAR) images. The proposed CWSAM freezes most of SAM's parameters and incorporates lightweight adapters for parameter-efficient fine-tuning, and a classwise mask decoder is designed to achieve semantic segmentation task. This adapt-tuning method allows for efficient landcover classification of SAR images, balancing the accuracy with computational demand. In addition, the task-specific input module injects low-frequency information of SAR images by MLP-based layers to improve the model performance. Compared to conventional state-of-the-art semantic segmentation algorithms by extensive experiments, CWSAM showcases enhanced performance with fewer computing resources, highlighting the potential of leveraging foundational models such as SAM for specific downstream tasks in the SAR domain.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"4791-4804"},"PeriodicalIF":4.7,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10849617","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143379550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Simulating synthetic aperture radar (SAR) images of crater terrain is a crucial technique for expanding SAR sample databases and facilitating the development of quantitative information extraction models for craters. However, existing simulation methods often overlook crucial factors, including the explosive depth effect in crater morphology modeling and the double-bounce scattering effect in electromagnetic scattering calculations. To overcome these limitations, this article introduces a novel approach to simulating SAR images of crater terrain. The approach incorporates crater formation theory to describe the relationship between various explosion parameters and craters. Moreover, it employs a hybrid ray-tracing approach that considers both surface and double-bounce scattering effects. Initially, crater morphology models are established for surface, shallow burial, and deep burial explosions. This involves incorporating the explosive depth parameter into crater morphology modeling through crater formation theory and quantitatively assessing soil movement influenced by the explosion. Subsequently, the ray-tracing algorithm and the advanced integral equation model are combined to accurately calculate electromagnetic scattering characteristics. Finally, simulated SAR images of the crater terrain are generated using the SAR echo fast time-frequency domain simulation algorithm and the chirp scaling imaging algorithm. The results obtained by simulating SAR images under different explosion parameters offer valuable insights into the effects of various explosion parameters on crater morphology. This research could contribute to the creation of comprehensive crater terrain datasets and support the application of SAR technology for damage assessment purposes.
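Crater morphology modeling is the first stage of the pipeline described above. The following toy Python sketch generates a rotationally symmetric crater DEM whose radius and depth are free inputs; the parabolic bowl and exponentially decaying rim are illustrative assumptions only, not the formation-theory model used in the article, which derives these quantities from explosion parameters such as burial depth.

```python
import numpy as np

def toy_crater_dem(grid_m=64.0, n=256, radius=10.0, depth=3.0, lip=0.4):
    """Toy rotationally symmetric crater DEM: a parabolic bowl with a raised rim.
    In the article, radius/depth would follow from formation theory; here they
    are simply inputs."""
    x = np.linspace(-grid_m / 2, grid_m / 2, n)
    X, Y = np.meshgrid(x, x)
    r = np.hypot(X, Y)
    z = np.zeros_like(r)
    inside = r < radius
    # Parabolic bowl inside the crater radius.
    z[inside] = -depth * (1.0 - (r[inside] / radius) ** 2)
    # Exponentially decaying rim (ejecta lip) outside.
    outside = ~inside
    z[outside] = lip * depth * np.exp(-(r[outside] - radius) / (0.3 * radius))
    return z

# Two parameter settings, e.g., as might arise from different burial depths.
dem_a = toy_crater_dem(radius=8.0, depth=4.0)
dem_b = toy_crater_dem(radius=12.0, depth=2.0)
print(dem_a.min(), dem_b.min())
```

A DEM like this would then feed the electromagnetic scattering and SAR echo simulation stages.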
{"title":"SAR Image Simulation for Crater Terrain Using Formation Theory-Based Modeling and Hybrid Ray-Tracing","authors":"Ya-Ting Zhou;Yongsheng Zhou;Qiang Yin;Fei Ma;Fan Zhang","doi":"10.1109/JSTARS.2025.3532748","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3532748","url":null,"abstract":"Simulating synthetic aperture radar (SAR) images of crater terrain is a crucial technique for expanding SAR sample databases and facilitating the development of quantitative information extraction models for craters. However, existing simulation methods often overlook crucial factors, including the explosive depth effect in crater morphology modeling and the double-bounce scattering effect in electromagnetic scattering calculations. To overcome these limitations, this article introduces a novel approach to simulating SAR images of crater terrain. The approach incorporates crater formation theory to describe the relationship between various explosion parameters and craters. Moreover, it employs a hybrid ray-tracing approach that considers both surface and double-bounce scattering effects. Initially, crater morphology models are established for surface, shallow burial, and deep burial explosions. This involves incorporating the explosive depth parameter into crater morphology modeling through crater formation theory and quantitatively assessing soil movement influenced by the explosion. Subsequently, the ray-tracing algorithm and the advanced integral equation model are combined to accurately calculate electromagnetic scattering characteristics. Finally, simulated SAR images of the crater terrain are generated using the SAR echo fast time-frequency domain simulation algorithm and the chirp scaling imaging algorithm. The results obtained by simulating SAR images under different explosion parameters offer valuable insights into the effects of various explosion parameters on crater morphology. This research could contribute to the creation of comprehensive crater terrain datasets and support the application of SAR technology for damage assessment purposes.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"5005-5017"},"PeriodicalIF":4.7,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10849666","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143388615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-22 | DOI: 10.1109/JSTARS.2025.3532816
Maozhi Wang;Shu-Hua Chen;Jun Feng;Wenxi Xu;Daming Wang
Identification of spectrally similar materials from multispectral remote sensing (RS) imagery with only a few bands is an important issue that challenges comprehensive application of RS to surface characterization. This study proposes a new method to identify spectrally similar materials from such imagery. The method is built on the theory of the condition number of a matrix, and a theorem is proven as the foundation of the designed identification algorithm. Mathematically, the motivation behind the new algorithm is to decrease the condition number of the matrix of a linear system and, by doing so, to turn an ill-conditioned system into a well-conditioned one. Technically, the method achieves this by adding supplementary features to all the original spectra, including those of similar materials; these features can then serve as indicative signatures to identify the materials. The proposed method is therefore named the condition number-based method with supplementary features (SF-CNM). The threshold scheme and the supplementary features are the two main novel techniques that ensure the uniqueness and accuracy of SF-CNM for specified samples. Results for a case study identifying water, ice, snow, shadow, and other materials from Landsat 8 OLI data indicate that SF-CNM identifies the materials specified by the given samples accurately, significantly outperforms the spectral angle mapper, the Mahalanobis classifier, maximum likelihood, and an artificial neural network, and performs comparably to, and even slightly better than, a support vector machine.
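The core linear-algebra idea, lowering the condition number by augmenting the spectra with supplementary features, can be demonstrated in a few lines of NumPy. The spectra and the supplementary column below are fabricated toy values; SF-CNM constructs its features systematically, which this sketch does not attempt.

```python
import numpy as np

# Rows are reflectance spectra of materials over a few bands. Two near-identical
# rows (spectrally similar materials) make the system ill-conditioned.
spectra = np.array([
    [0.12, 0.18, 0.25, 0.31],
    [0.12, 0.18, 0.26, 0.31],   # almost identical to row 0
    [0.40, 0.35, 0.20, 0.10],
])

print("condition number before:", np.linalg.cond(spectra))

# Append a supplementary feature column that separates the two similar
# materials (hand-picked here; SF-CNM derives such features systematically).
supplementary = np.array([[0.0], [1.0], [0.5]])
augmented = np.hstack([spectra, supplementary])

print("condition number after:", np.linalg.cond(augmented))
```

The condition number drops sharply once the near-dependent rows are made distinguishable, which is exactly the ill-to-well-conditioned transition the method aims for.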
{"title":"Identification of Spectrally Similar Materials From Multispectral Imagery Based on Condition Number of Matrix","authors":"Maozhi Wang;Shu-Hua Chen;Jun Feng;Wenxi Xu;Daming Wang","doi":"10.1109/JSTARS.2025.3532816","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3532816","url":null,"abstract":"Identification of spectrally similar materials from multispectral remote sensing (RS) imagery with only several bands is an important issue that challenges comprehensive applications of the RS of surface characteristics. This study proposes a new method to identify spectrally similar materials from these types of imagery. The method is constructed based on the theory of condition number of matrix, and a theorem is proven as the foundation of the designed identification algorithm. Mathematically, the motivation behind designing this new algorithm is to decrease the condition number of the matrix for a linear system and, by doing so, to change an ill-conditioned system to a well-conditioned one. Technically, this new method achieves the purpose by adding supplementary features to all the original spectra including similar materials, which can be further used as indicative signatures to identify these materials. Thus, the proposed method is named a condition number-based method with supplementary features (SF-CNM). The threshold scheme and supplementary features are two main novelty techniques to ensure the uniqueness and accuracy of the proposed SF-CNM for specified samples. The results for a case study to identify water, ice, snow, shadow, and other materials from Landsat 8 OLI data indicate that SF-CNM can identify the materials specified by the given samples successfully and accurately and that SF-CNM significantly outperforms those of spectral angle mapper algorithm, Mahalanobis classifier, maximum likelihood, and artificial neural network, and produces the performance similar to, even slightly better than that of support vector machine.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"4751-4766"},"PeriodicalIF":4.7,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10849635","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143379536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-20 | DOI: 10.1109/JSTARS.2025.3532219
Sizhe Wang;Wenwen Li;Chia-Yu Hsu
Sea ice forecasting remains a challenging topic due to the complexity of understanding its driving forces and modeling its dynamics. This article contributes to the expanding literature by developing a data-driven, artificial intelligence (AI)-based solution for forecasting sea ice concentration in the Arctic. Specifically, we introduce STEPNet, a spatial and temporal encoding pipeline capable of handling the temporal heterogeneity of multivariate sea ice drivers, including various climate and environmental factors with varying impacts on sea ice concentration changes. STEPNet employs dedicated encoders designed to effectively mine prominent spatial, temporal, and spatiotemporal relationships within the data. It builds on and extends vision and temporal transformer architectures to leverage their power in extracting important hidden relationships over long data ranges. The learning pipeline is designed for flexibility and extendibility, enabling easy integration of different encoders to process diverse data characteristics and meet computational demands. A series of ablation studies and comparative experiments validate the effectiveness of our architecture design and the superior performance of the proposed STEPNet model compared to other AI solutions and numerical models.
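As a rough illustration of the temporal-encoder idea, here is a minimal PyTorch sketch of a transformer encoder over a multivariate driver time series. All sizes, the learned positional embedding, and the one-step prediction head are assumptions, not STEPNet's actual configuration.

```python
import torch
import torch.nn as nn

class TemporalEncoder(nn.Module):
    """Transformer encoder over a multivariate driver time series.
    Shapes and sizes are illustrative only."""
    def __init__(self, n_vars=8, d_model=64, nhead=4, depth=2, seq_len=24):
        super().__init__()
        self.embed = nn.Linear(n_vars, d_model)
        self.pos = nn.Parameter(torch.zeros(1, seq_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(d_model, 1)  # sea ice concentration at next step

    def forward(self, x):            # x: (batch, time, variables)
        h = self.embed(x) + self.pos
        h = self.encoder(h)
        return self.head(h[:, -1])   # predict from the last time step

x = torch.randn(2, 24, 8)            # 2 samples, 24 months, 8 drivers
print(TemporalEncoder()(x).shape)    # torch.Size([2, 1])
```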
{"title":"STEPNet: A Spatial and Temporal Encoding Pipeline to Handle Temporal Heterogeneity in Climate Modeling Using AI: A Use Case of Sea Ice Forecasting","authors":"Sizhe Wang;Wenwen Li;Chia-Yu Hsu","doi":"10.1109/JSTARS.2025.3532219","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3532219","url":null,"abstract":"Sea ice forecasting remains a challenging topic due to the complexity of understanding its driving forces and modeling its dynamics. This article contributes to the expanding literature by developing a data-driven, artificial intelligence (AI)-based solution for forecasting sea ice concentration in the Arctic. Specifically, we introduced STEPNet—a spatial and temporal encoding pipeline capable of handling the temporal heterogeneity of multivariate sea ice drivers, including various climate and environmental factors with varying impacts on sea ice concentration changes. STEPNet employs dedicated encoders designed to effectively mine prominent spatial, temporal, and spatiotemporal relationships within the data. It builds on and extends the architecture of vision and temporal transformer architectures to leverage their power in extracting important hidden relationships over long data ranges. The learning pipeline is designed for flexibility and extendibility, enabling easy integration of different encoders to process diverse data characteristics and meet computational demands. A series of ablation studies and comparative experiments were conducted to validate the effectiveness of our architecture design and the superior performance of the proposed STEPNet model compared to other AI solutions and numerical models.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"4921-4935"},"PeriodicalIF":4.7,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10848183","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143388473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Remote sensing-based classification of crops is the foundation for monitoring food production and management. A range of remote sensing images, encompassing spatial, spectral, and temporal dimensions, has facilitated crop classification. However, prevailing methods focus on either the temporal or the spatial features of images. These unimodal methods often encounter noise interference in real-world scenarios and may struggle to discriminate between crops with similar spectral signatures, leading to misclassification over extensive areas. To address this issue, we propose a spatiotemporal fusion-based crop classification network (STFCropNet), which integrates high-resolution (HR) images with medium-resolution time-series (TS) images. STFCropNet consists of a temporal branch, which captures seasonal spectral variations and coarse-grained spatial information from TS data, and a spatial branch, which extracts geometric details and multiscale spatial features from HR images. By integrating features from both branches, STFCropNet achieves fine-grained crop classification while effectively reducing salt-and-pepper noise. We evaluate STFCropNet in two study areas of China with diverse topographic features. Experimental results demonstrate that STFCropNet outperforms state-of-the-art models in both study areas, achieving overall accuracies of 83.2% and 90.6%, improvements of 3.6% and 4.1%, respectively, over the second-best baseline model. We release our code at.
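The two-branch design is easy to sketch. Below is a minimal PyTorch illustration of temporal and spatial branches whose pooled features are concatenated for classification; the layer choices, sizes, and pooling are placeholders, not the published STFCropNet architecture.

```python
import torch
import torch.nn as nn

class DualBranchClassifier(nn.Module):
    """Two-branch fusion in the spirit of STFCropNet: a temporal branch for
    medium-resolution time series and a spatial branch for HR imagery."""
    def __init__(self, n_bands=4, n_classes=6):
        super().__init__()
        # Temporal branch: 1-D convolutions over the per-pixel time series.
        self.temporal = nn.Sequential(
            nn.Conv1d(n_bands, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        # Spatial branch: 2-D convolutions over the HR patch.
        self.spatial = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(32 + 32, n_classes)

    def forward(self, ts, hr):
        # ts: (B, bands, T) per-pixel time series; hr: (B, 3, H, W) HR patch.
        fused = torch.cat([self.temporal(ts), self.spatial(hr)], dim=1)
        return self.classifier(fused)

logits = DualBranchClassifier()(torch.randn(2, 4, 12), torch.randn(2, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 6])
```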
{"title":"STFCropNet: A Spatiotemporal Fusion Network for Crop Classification in Multiresolution Remote Sensing Images","authors":"Wei Wu;Yapeng Liu;Kun Li;Haiping Yang;Liao Yang;Zuohui Chen","doi":"10.1109/JSTARS.2025.3531886","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3531886","url":null,"abstract":"Remote sensing-based classification of crops is the foundation for the monitoring of food production and management. A range of remote sensing images, encompassing spatial, spectral, and temporal dimensions, has facilitated the classification of crops. However, prevailing methods for crop classification via remote sensing focus on either temporal or spatial features of images. These unimodal methods often encounter challenges posed by noise interference in real-world scenarios, and may struggle to discriminate between crops with similar spectral signatures, thereby leading to misclassification over extensive areas. To address the issue, we propose a novel approach termed spatiotemporal fusion-based crop classification network (STFCropNet), which integrates high-resolution (HR) images with medium-resolution time-series (TS) images. STFCropNet consists of a temporal branch, which captures seasonal spectral variations and coarse-grained spatial information from TS data, and a spatial branch that extracts geometric details and multiscale spatial features from HR images. By integrating features from both branches, STFCropNet achieves fine-grained crop classification while effectively reducing salt and pepper noise. We evaluate STFCropNet in two study areas of China with diverse topographic features. Experimental results demonstrate that STFCropNet outperforms state-of-the-art models in both study areas. STFCropNet achieves an overall accuracy of 83.2% and 90.6%, representing improvements of 3.6% and 4.1%, respectively, compared to the second-best baseline model. We release our code at.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"4736-4750"},"PeriodicalIF":4.7,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10848201","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143361111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-20 | DOI: 10.1109/JSTARS.2025.3532126
Peter Brotzer;Emiliano Casalini;David Small;Alexander Damm;Elías Méndez Domínguez
Satellite and airborne synthetic aperture radar (SAR) systems are frequently used for topographic mapping. However, their limited scene aspects lead to reduced angular coverage, making them less effective in environments with complex surface structures and tall objects. This limitation can be overcome by drone-based SAR systems, which are becoming increasingly advanced, but whose potential for three-dimensional (3-D) imaging remains largely unexplored. In this article, we utilize multiaspect SAR data acquired with a K-band drone system with 700 MHz bandwidth and investigate the potential of high-resolution 3-D point cloud retrieval. Through a series of experiments with increasingly complex 3-D structures, we evaluate the accuracy of the derived point clouds. Independent references, based on light detection and ranging (LiDAR) data and 3-D construction models, are used to validate our results. Our findings demonstrate that the drone SAR system can produce accurate and complete point clouds, with average Chamfer distances on the order of 1 m compared to reference data, highlighting the significance of multiple-aspect acquisitions for 3-D mapping applications.
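The Chamfer distance used for validation here is a standard point cloud metric. A small sketch, assuming the common symmetric convention that sums the two directed mean nearest-neighbor distances, using SciPy's k-d tree and fabricated toy clouds:

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between 3-D point clouds p (N,3) and q (M,3).
    One common convention: sum of the two directed mean NN distances."""
    d_pq, _ = cKDTree(q).query(p)   # for each point in p, nearest point in q
    d_qp, _ = cKDTree(p).query(q)   # and vice versa
    return d_pq.mean() + d_qp.mean()

rng = np.random.default_rng(0)
sar_cloud = rng.uniform(0, 50, size=(1000, 3))          # retrieved cloud (toy)
lidar_ref = sar_cloud + rng.normal(0, 0.5, (1000, 3))   # noisy reference (toy)
print(f"Chamfer distance: {chamfer_distance(sar_cloud, lidar_ref):.2f} m")
```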
{"title":"Retrieving Multiaspect Point Clouds From a Multichannel K-Band SAR Drone","authors":"Peter Brotzer;Emiliano Casalini;David Small;Alexander Damm;Elías Méndez Domínguez","doi":"10.1109/JSTARS.2025.3532126","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3532126","url":null,"abstract":"Satellite and airborne synthetic aperture radar (SAR) systems are frequently used for topographic mapping. However, their limited scene aspects lead to reduced angular coverage, making them less effective in environments with complex surface structures and tall objects. This limitation can be overcome by drone-based SAR systems, which are becoming increasingly advanced, but their potential for three-dimensional (3-D) imaging remains largely unexplored. In this article, we utilize multiaspect SAR data acquired with a K-band drone system with 700 MHz bandwidth and investigate the potential 3-D point cloud retrievals in high resolution. Through a series of experiments with increasingly complex 3-D structures, we evaluate the accuracy of the derived point clouds. Independent references—based on light detection and ranging (LiDAR) and 3-D construction models—are used to validate our results. Our findings demonstrate that the drone SAR system can produce accurate and complete point clouds, with average Chamfer distances on the order of 1 m compared to reference data, highlighting the significance of multiple aspect acquisitions for 3-D mapping applications.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"5033-5045"},"PeriodicalIF":4.7,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10848217","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143388604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-17 | DOI: 10.1109/JSTARS.2025.3531439
Shuang Wu;Lei Deng;Qinghua Qiao
Accurate long-term estimation of fractional vegetation cover (FVC) is crucial for monitoring vegetation dynamics. Satellite-based methods, such as the dimidiate pixel method (DPM), struggle with spatial heterogeneity due to coarse resolution. Existing methods using unmanned aerial vehicles (UAVs) combined with satellite data (UCS) inadequately leverage the high spatial resolution of UAV imagery to address spatial heterogeneity and are seldom applied to long-term FVC monitoring. To overcome the spatial challenges, an improved dimidiate pixel method (IDPM) is proposed here, utilizing 2021 Landsat imagery to generate FVC_DPM via DPM, with upscaled UAV imagery providing FVC_UAV as the ground reference. The IDPM uses the pruned exact linear time method to segment the normalized difference vegetation index (NDVI) into intervals, within which DPM performance is evaluated for potential improvement. Specifically, if the difference (D) between FVC_DPM and FVC_UAV is nonzero, NDVI-derived texture features are incorporated into FVC_DPM through multiple linear regression to enhance accuracy. To address the temporal challenges and ensure consistency across years, the 2021 NDVI serves as a reference for inter-year NDVI calibration, with least squares regression (LSR) and histogram matching (HM) compared to identify the more effective method for extending the IDPM to other years. Results demonstrate that 1) the IDPM, by developing distinct DPM improvement models for different NDVI intervals, considerably improves UAV and satellite data integration, with a 48.51% increase in R² and a 56.47% reduction in root mean square error (RMSE) compared to the DPM and UCS, and 2) HM is more suitable for mining areas, increasing R² by 25.00% and reducing RMSE by 54.05% compared to LSR. This method provides an efficient, rapid solution for mitigating spatial heterogeneity and advancing long-term FVC estimation.
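The classic dimidiate pixel model and the histogram-matching calibration are both simple enough to sketch. The following NumPy code assumes the standard DPM formula FVC = (NDVI - NDVI_soil) / (NDVI_veg - NDVI_soil) with percentile endmembers, plus a generic quantile-mapping HM; neither is necessarily the authors' exact implementation.

```python
import numpy as np

def dpm_fvc(ndvi, ndvi_soil, ndvi_veg):
    """Classic dimidiate pixel model, clipped to [0, 1]. The soil/vegetation
    endmembers are often taken from low/high percentiles of the NDVI histogram."""
    fvc = (ndvi - ndvi_soil) / (ndvi_veg - ndvi_soil)
    return np.clip(fvc, 0.0, 1.0)

def histogram_match(ndvi_year, ndvi_ref):
    """Quantile-mapping histogram match of one year's NDVI to the reference
    year (a standard HM implementation, used here for inter-year calibration)."""
    src = np.sort(ndvi_year.ravel())
    ref = np.sort(ndvi_ref.ravel())
    ranks = np.searchsorted(src, ndvi_year.ravel()) / (src.size - 1)
    matched = np.interp(ranks, np.linspace(0, 1, ref.size), ref)
    return matched.reshape(ndvi_year.shape)

rng = np.random.default_rng(1)
ndvi_2021 = np.clip(rng.normal(0.45, 0.20, (100, 100)), -1, 1)  # toy reference
ndvi_2015 = np.clip(rng.normal(0.35, 0.25, (100, 100)), -1, 1)  # toy other year
ndvi_2015_cal = histogram_match(ndvi_2015, ndvi_2021)
soil, veg = np.percentile(ndvi_2021, [5, 95])
print(dpm_fvc(ndvi_2021, soil, veg).mean())
```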
{"title":"Estimating Long-Term Fractional Vegetation Cover Using an Improved Dimidiate Pixel Method With UAV-Assisted Satellite Data: A Case Study in a Mining Region","authors":"Shuang Wu;Lei Deng;Qinghua Qiao","doi":"10.1109/JSTARS.2025.3531439","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3531439","url":null,"abstract":"Accurate long-term estimation of fractional vegetation cover (FVC) is crucial for monitoring vegetation dynamics. Satellite-based methods, such as the dimidiate pixel method (DPM), struggle with spatial heterogeneity due to coarse resolution. Existing methods using unmanned aerial vehicles (UAVs) combined with satellite data (UCS) inadequately leverage the high spatial resolution of UAV imagery to address spatial heterogeneity and are seldom applied to long-term FVC monitoring. To overcome spatial challenges, an improved dimidiate pixel method (IDPM) is proposed here, utilizing 2021 Landsat imagery to generate FVC<sub>DPM</sub> via DPM and upscaled UAV imagery for FVC<sub>UAV</sub> as ground references. The IDPM uses the pruned exact linear time method to segment the normalized difference vegetation index (NDVI) into intervals, within which DPM performance is evaluated for potential improvements. Specifically, if the difference (D) between FVC<sub>DPM</sub> and FVC<sub>UAV</sub> is nonzero, NDVI-derived texture features are incorporated into FVC<sub>DPM</sub> through multiple linear regression to enhance accuracy. To address temporal challenges and ensure consistency across years, the 2021 NDVI serves as a reference for inter-year NDVI calibration, employing least squares regression (LSR) and histogram matching (HM) to identify the most effective method for extending the IDPM to other years. Results demonstrate that 1) the IDPM, by developing distinct DPM improvement models for different NDVI intervals, considerably improves UAV and satellite data integration, with a 48.51% increase in <italic>R</i><sup>2</sup> and a 56.47% reduction in root mean square error (RMSE) compared to the DPM and UCS and 2) HM is found to be more suitable for mining areas, increasing <italic>R</i><sup>2</sup> by 25.00% and reducing RMSE by 54.05% compared to LSR. This method provides an efficient, rapid solution for mitigating spatial heterogeneity and advancing long-term FVC estimation.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"4162-4173"},"PeriodicalIF":4.7,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10845181","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143105990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-17 | DOI: 10.1109/JSTARS.2025.3531448
Chang-Jiang Zhang;Mei-Shu Chen;Lei-Ming Ma;Xiao-Qin Lu
Tropical cyclones (TCs) are highly catastrophic weather events, and accurate estimation of their intensity is of great significance. Existing TC intensity estimation models are typically trained on satellite images from only one or two channels and therefore cannot fully capture the features related to TC intensity, resulting in low accuracy. To this end, we propose a double-layer encoder-decoder model for estimating TC intensity, trained on images from three channels: infrared, water vapor, and passive microwave. The model consists of three main modules: a wavelet transform enhancement module, a multichannel satellite image fusion module, and a TC intensity estimation module, which are used to extract high-frequency information from the source images, generate a three-channel fused image, and estimate TC intensity, respectively. To validate the performance of our model, we conducted extensive experiments on the TCIR dataset. The experimental results show that the proposed model achieves an MAE of 3.76 m/s and an RMSE of 4.62 m/s for TC intensity estimation, which are 15.70% and 20.07% lower, respectively, than those of the advanced Dvorak technique. The model proposed in this article therefore has great potential for accurately estimating TC intensity.
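The wavelet transform enhancement step extracts high-frequency information from a channel; a minimal sketch with PyWavelets, plus the MAE/RMSE metrics quoted above, is given below. The single-level Haar decomposition and all values are assumptions for illustration, not the paper's exact module.

```python
import numpy as np
import pywt  # PyWavelets

def high_frequency_bands(img, wavelet="haar"):
    """Single-level 2-D DWT; returns the three high-frequency subbands
    (horizontal, vertical, diagonal detail). A generic wavelet-enhancement
    step, not the paper's exact module."""
    _, (cH, cV, cD) = pywt.dwt2(img, wavelet)
    return cH, cV, cD

# Toy 'satellite channel': the decomposition halves each spatial dimension.
ir_channel = np.random.rand(128, 128).astype(np.float32)
cH, cV, cD = high_frequency_bands(ir_channel)
print(cH.shape, cV.shape, cD.shape)  # (64, 64) each

# MAE / RMSE as reported for intensity estimation (toy values in m/s):
y_true = np.array([35.0, 50.0, 28.0])
y_pred = np.array([33.1, 54.2, 30.0])
mae = np.abs(y_pred - y_true).mean()
rmse = np.sqrt(((y_pred - y_true) ** 2).mean())
print(f"MAE={mae:.2f} m/s, RMSE={rmse:.2f} m/s")
```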
{"title":"Deep Learning and Wavelet Transform Combined With Multichannel Satellite Images for Tropical Cyclone Intensity Estimation","authors":"Chang-Jiang Zhang;Mei-Shu Chen;Lei-Ming Ma;Xiao-Qin Lu","doi":"10.1109/JSTARS.2025.3531448","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3531448","url":null,"abstract":"Tropical cyclone (TC) is a highly catastrophic weather event, and accurate estimation of intensity is of great significance. The current proposed TC intensity estimation model focuses on training using satellite images from single or two channels, and the model cannot fully capture features related to TC intensity, resulting in low accuracy. To this end, we propose a double-layer encoder–decoder model for estimating the intensity of TC, which is trained using images from three channels: infrared, water vapor, and passive microwave. The model mainly consists of three modules: wavelet transform enhancement module, multichannel satellite image fusion module, and TC intensity estimation module, which are used to extract high-frequency information from the source image, generate a three-channel fused image, and perform TC intensity estimation. To validate the performance of our model, we conducted extensive experiments on the TCIR dataset. The experimental results show that the proposed model has MAE and RMSE of 3.76 m/s and 4.62 m/s for TC intensity estimation, which are 15.70% and 20.07% lower than advanced Dvorak technology, respectively. Therefore, the model proposed in this article has great potential in accurately estimating TC intensity.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"4711-4735"},"PeriodicalIF":4.7,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10845190","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143361110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-17 | DOI: 10.1109/JSTARS.2025.3531353
Qun Song;Hangyuan Lu;Chang Xu;Rixian Liu;Weiguo Wan;Wei Tu
Pansharpening is the process of fusing a multispectral (MS) image with a panchromatic image to produce a high-resolution MS (HRMS) image. However, existing techniques face challenges in integrating long-range dependencies to correct locally misaligned features, which results in spatial-spectral distortions; moreover, these methods tend to be computationally expensive. To address these challenges, we propose a novel detail injection algorithm and develop the invertible attention-guided adaptive convolution and dual-domain Transformer (IACDT) network. In IACDT, we design an invertible attention mechanism embedded with spectral-spatial attention to efficiently and losslessly extract locally spatial-spectral-aware detail information. In addition, we present a frequency-spatial dual-domain attention mechanism that combines a frequency-enhanced Transformer and a spatial window Transformer for long-range contextual detail feature correction. This architecture effectively integrates local detail features with long-range dependencies, enabling the model to correct both local misalignments and global inconsistencies. The final HRMS image is obtained through a reconstruction block built on residual multireceptive-field attention. Extensive experiments demonstrate that IACDT achieves superior fusion performance and computational efficiency, as well as outstanding results in downstream tasks, compared to state-of-the-art methods.
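Invertibility is what makes a feature extractor lossless: the input is exactly recoverable from the output. The sketch below shows a generic NICE-style additive coupling block, which has that property; it illustrates the principle only and is not IACDT's invertible attention module.

```python
import torch
import torch.nn as nn

class AdditiveCoupling(nn.Module):
    """NICE-style additive coupling: split channels, update one half from the
    other. Exactly invertible, so no feature information is lost, which is the
    general property an invertible attention mechanism relies on."""
    def __init__(self, channels):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(channels // 2, channels // 2, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels // 2, channels // 2, 3, padding=1),
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        return torch.cat([x1, x2 + self.f(x1)], dim=1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        return torch.cat([y1, y2 - self.f(y1)], dim=1)

block = AdditiveCoupling(16)
x = torch.randn(1, 16, 32, 32)
with torch.no_grad():
    recon = block.inverse(block(x))
print(torch.allclose(x, recon, atol=1e-6))  # True: lossless round trip
```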
{"title":"Invertible Attention-Guided Adaptive Convolution and Dual-Domain Transformer for Pansharpening","authors":"Qun Song;Hangyuan Lu;Chang Xu;Rixian Liu;Weiguo Wan;Wei Tu","doi":"10.1109/JSTARS.2025.3531353","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3531353","url":null,"abstract":"Pansharpening is the process of fusing a multispectral (MS) image with a panchromatic image to produce a high-resolution MS (HRMS) image. However, existing techniques face challenges in integrating long-range dependencies to correct locally misaligned features, which results in spatial-spectral distortions. Moreover, these methods tend to be computationally expensive. To address these challenges, we propose a novel detail injection algorithm and develop the invertible attention-guided adaptive convolution and dual-domain Transformer (IACDT) network. In IACDT, we designed an invertible attention mechanism embedded with spectral-spatial attention to efficiently and losslessly extract locally spatial-spectral-aware detail information. In addition, we presented a frequency-spatial dual-domain attention mechanism that combines a frequency-enhanced Transformer and a spatial window Transformer for long-range contextual detail feature correction. This architecture effectively integrates local detail features with long-range dependencies, enabling the model to correct both local misalignments and global inconsistencies. The final HRMS image is obtained through a reconstruction block that consists of residual multireceptive field attention. Extensive experiments demonstrate that IACDT achieves superior fusion performance, computational efficiency, and outstanding results in downstream tasks compared to state-of-the-art methods.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"5217-5231"},"PeriodicalIF":4.7,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10845120","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143422910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-17 | DOI: 10.1109/JSTARS.2025.3530926
Tianxiang Wang;Zhangfan Zeng;ShiHe Zhou;Qiao Xu
Automatic target recognition based on synthetic aperture radar (SAR) has extensive applications in dynamic surveillance, modern airport management, and military decision-making. However, the inherent mechanisms of SAR imaging introduce challenges such as target feature discretization, clutter interference, and significant scale variation, which hinder the performance of existing recognition networks in practical scenarios. This article therefore presents a novel network architecture: the multiscale discrete feature enhancement network with augmented reversible transformation. The proposed network consists of three core components: an augmented feature extraction (AFE) backbone, a discrete feature enhancement module (DFEM), and a Spider feature pyramid network (Spider FPN). The AFE backbone preserves target information and suppresses clutter effectively by integrating augmented reversible transformations with an intermediate supervision module and double subnetworks. The DFEM enhances both local and global discrete feature awareness through its two submodules: a local discrete feature enhancement module and a global semantic information awareness module. The Spider FPN overcomes target scale variation, especially for small-scale targets, through a fusion-diffusion mechanism and the designed feature perception fusion module. The proposed method is evaluated on three public datasets covering various polarizations and environmental conditions: SARDet-100K, MSAR-1.0, and SAR-AIRcraft-1.0. Experimental results demonstrate that the proposed network outperforms current state-of-the-art methods, achieving average precision of 63.3%, 72.3%, and 67.4% on the three datasets, respectively.
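The Spider FPN's multiscale fusion builds on the feature pyramid idea. Below is a minimal generic top-down FPN sketch in PyTorch for intuition; the fusion-diffusion mechanism and feature perception fusion module of the actual network are more elaborate and are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFPN(nn.Module):
    """Generic top-down FPN fusion: project each backbone level to a common
    width, upsample the coarser map, and add. Illustrates multiscale fusion
    only, not Spider FPN itself."""
    def __init__(self, in_channels=(64, 128, 256), width=64):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, width, 1) for c in in_channels)
        self.smooth = nn.ModuleList(nn.Conv2d(width, width, 3, padding=1)
                                    for _ in in_channels)

    def forward(self, feats):            # feats ordered fine -> coarse
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 2, -1, -1):   # coarse to fine
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest")
        return [s(l) for s, l in zip(self.smooth, laterals)]

feats = [torch.randn(1, 64, 64, 64), torch.randn(1, 128, 32, 32),
         torch.randn(1, 256, 16, 16)]
outs = TinyFPN()(feats)
print([o.shape for o in outs])  # all widths 64, original spatial sizes kept
```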
{"title":"A Multiscale Discrete Feature Enhancement Network With Augmented Reversible Transformation for SAR Automatic Target Recognition","authors":"Tianxiang Wang;Zhangfan Zeng;ShiHe Zhou;Qiao Xu","doi":"10.1109/JSTARS.2025.3530926","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3530926","url":null,"abstract":"Automatic target recognition based on synthetic aperture radar (SAR) has extensive applications in dynamic surveillance, modern airport management, and military decision-making. However, the natural mechanisms of SAR imaging introduce challenges such as target feature discretization, clutter interference, and significant scale variation, which hinder the performance of existing recognition networks in practical scenarios. As such, this article presents a novel network architecture: the multiscale discrete feature enhancement network with augmented reversible transformation. The proposed network consists of three core components: an augmented feature extraction (AFE) backbone, a discrete feature enhancement module (DFEM), and a Spider feature pyramid network (Spider FPN). The AFE backbone has the capability of effective target information preservation and clutter suppression with the aid of integration of augmented reversible transformations with intermediate supervision module and double subnetworks. The DFEM enhances both local and global discrete feature awareness through its two submodules: local discrete feature enhancement module and global semantic information awareness module. The Spider FPN overcomes target scale variation challenges, especially for small-scale targets, through a fusion-diffusion mechanism and the designed feature perception fusion module. The functionality of the proposed method is evaluated on three public datasets: SARDet-100 K, MSAR-1.0, and SAR-AIRcraft-1.0 of various polarizations and environmental conditions. Experimental results demonstrate that the proposed network outperforms current state-of-the-art methods in terms of average precision by the levels of 63.3%, 72.3%, and 67.4%, respectively.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"5135-5156"},"PeriodicalIF":4.7,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10844330","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143422911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}