Hyperspectral imaging (HSI) captures rich spectral information across many wavelengths, enabling detailed material classification and identification and making it a key tool in remote sensing, particularly for coastal area monitoring. In recent years, convolutional neural network (CNN) and transformer models have demonstrated strong performance in HSI classification, especially in applications requiring precise change detection and analysis. However, the high dimensionality of HSI data and the complexity of spectral-spatial feature extraction make accurate classification in coastal areas challenging. This article introduces a new hybrid model, CSTFNet, which combines an improved CNN module with a dual-layer Swin transformer (DLST) to tackle these challenges. CSTFNet integrates spectral and spatial processing, significantly reducing computational complexity while maintaining high classification accuracy. The improved CNN module employs one-dimensional convolutions to handle high-dimensional data, while the DLST module uses window-based multihead attention to capture both local and global dependencies. Experiments on four standard HSI datasets (Houston-2013, Samson, KSC, and Botswana) demonstrate that CSTFNet outperforms traditional and state-of-the-art algorithms, achieving overall classification accuracy exceeding 99%. In particular, on the Houston-2013 dataset, the OA and AA reach 1.00 and the kappa coefficient is 0.976. These results highlight the robustness and efficiency of the proposed model in coastal applications, where accurate and reliable spectral-spatial classification is crucial for monitoring and environmental management.
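The improved CNN module's use of one-dimensional convolutions can be illustrated with a minimal numpy sketch (toy values of ours, not CSTFNet's actual layer): each pixel's band vector is treated as a 1-D signal, so the filter cost depends only on the number of bands, not on the spatial extent.

```python
import numpy as np

def spectral_conv1d(pixel_spectrum, kernel):
    """Apply a 1-D convolution along the spectral axis of one HSI pixel.

    Treating each pixel's band vector as a 1-D signal (the idea behind
    CSTFNet's improved CNN module) keeps the parameter count independent
    of the spatial dimensions. 'same' padding preserves the band count.
    """
    return np.convolve(pixel_spectrum, kernel, mode="same")

# A toy 8-band spectrum and a 3-tap smoothing kernel (hypothetical values).
spectrum = np.array([0.1, 0.3, 0.5, 0.7, 0.9, 0.7, 0.5, 0.3])
kernel = np.array([0.25, 0.5, 0.25])
features = spectral_conv1d(spectrum, kernel)
assert features.shape == spectrum.shape  # band count preserved
```

A real implementation would stack many learned kernels per layer; the point here is only that the convolution runs along the spectral axis.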
{"title":"CSTFNet: A CNN and Dual Swin-Transformer Fusion Network for Remote Sensing Hyperspectral Data Fusion and Classification of Coastal Areas","authors":"Dekai Li;Harold Neira-Molina;Mengxing Huang;Syam M.S.;Yu Zhang;Zhang Junfeng;Uzair Aslam Bhatti;Muhammad Asif;Nadia Sarhan;Emad Mahrous Awwad","doi":"10.1109/JSTARS.2025.3530935","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3530935","url":null,"abstract":"Hyperspectral imaging (HSI) can capture a large amount of spectral information at various wavelengths, enabling detailed material classification and identification, making it a key tool in remote sensing, particularly for coastal area monitoring. In recent years, the convolutional neural network (CNN) framework and transformer models have demonstrated strong performance in HSI classification, especially in applications requiring precise change detection and analysis. However, due to the high dimensionality of HSI data and the complexity of spectral-spatial feature extraction, achieving accurate results in coastal areas remains challenging. This article introduces a new hybrid model, CSTFNet, which combines an improved CNN module and dual-layer Swin transformer (DLST) to tackle these challenges. CSTFNet integrates spectral and spatial processing capabilities, significantly reducing computational complexity while maintaining high classification accuracy. The improved CNN module employs one-dimensional convolutions to handle high-dimensional data, while the DLST module uses window-based multihead attention to capture both local and global dependencies. Experiments conducted on four standard HSI datasets (Houston-2013, Samson, KSC, and Botswana) demonstrate that CSTFNet outperforms traditional and state-of-the-art algorithms, achieving overall classification accuracy exceeding 99% . In particular, on the Houston-2013 dataset, the results for OA and AA are 1.00 and the kappa coefficient is 0. 976. 
The results highlight the robustness and efficiency of the proposed model in coastal area applications, where accurate and reliable spectral-spatial classification is crucial for monitoring and environmental management.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"5853-5865"},"PeriodicalIF":4.7,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10844328","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143465620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-17
DOI: 10.1109/JSTARS.2025.3530989
Tingting Wei;Xingwang Hu;Zhengwei Guo;Gaofeng Shu;Yabo Huang;Ning Li
In today's increasingly complex electromagnetic environment, the spectrum is becoming ever more crowded, and synthetic aperture radar (SAR) is increasingly susceptible to radio frequency interference (RFI) in the same frequency band when receiving echo signals. Pulse RFI (PRFI) is a common form of RFI and often has time-varying characteristics, which degrade SAR image quality and hinder image interpretation. To effectively suppress PRFI, the serial numbers of the pulses in SAR raw data that contain PRFI need to be screened out with high precision. This article proposes a two-stage method for screening PRFI in SAR raw data that alternates between the time and frequency domains. First, range-cell-level difference screening is performed in the time and frequency domains, respectively, to initially screen the PRFI. Then, the preliminary screening results are accumulated along the range direction, and the accumulated results are classified using a clustering algorithm to perform pulse-level screening, yielding the serial numbers of the pulses containing PRFI. Compared with traditional PRFI screening methods, the proposed approach avoids missed detections and false alarms when screening weak-energy PRFI, offering high sensitivity and accuracy and a fresh perspective on the PRFI screening challenge. The effectiveness and superiority of the proposed method are verified by experiments on simulated and measured data.
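The two-stage idea — range-cell-level screening, then range-accumulated pulse-level clustering — can be sketched in numpy. This is an assumed simplification: the amplitude threshold and the tiny iterative two-class split below stand in for the paper's actual difference statistics and clustering algorithm.

```python
import numpy as np

def screen_prfi_pulses(raw, k=3.0):
    """Toy two-stage PRFI screen (assumed simplification of the method).

    Stage 1 (range-cell level): flag range cells whose amplitude exceeds
    k times the global median amplitude — a stand-in for the paper's
    range-cell-level difference screening.
    Stage 2 (pulse level): accumulate the flags along range, then split
    the per-pulse counts into two classes with an iterative threshold —
    a stand-in for the paper's clustering algorithm. Returns the indices
    of pulses screened as containing PRFI.
    """
    amp = np.abs(raw)                          # pulses x range cells
    flags = amp > k * np.median(amp)           # range-cell-level screening
    counts = flags.sum(axis=1).astype(float)   # accumulate along range
    t = counts.mean()                          # iterative 2-class threshold
    for _ in range(50):
        hi, lo = counts[counts > t], counts[counts <= t]
        if hi.size == 0 or lo.size == 0:
            break
        t_new = (hi.mean() + lo.mean()) / 2
        if abs(t_new - t) < 1e-9:
            break
        t = t_new
    return np.where(counts > t)[0]

rng = np.random.default_rng(0)
raw = rng.normal(size=(64, 256))               # clean pulses (toy data)
raw[[10, 40]] += 8.0 * rng.random((2, 256))    # two PRFI-contaminated pulses
detected = screen_prfi_pulses(raw)
print(detected)                                # → [10 40]
```

Weak-energy PRFI is exactly where the fixed threshold in this sketch would fail and the paper's alternating time/frequency screening would matter.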
{"title":"A Two-Stage Method for Screening Pulse RFI in SAR Raw Data Alternating the Use of Time and Frequency Domains","authors":"Tingting Wei;Xingwang Hu;Zhengwei Guo;Gaofeng Shu;Yabo Huang;Ning Li","doi":"10.1109/JSTARS.2025.3530989","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3530989","url":null,"abstract":"In the increasingly complex electromagnetic environment, the spectrum is becoming more and more crowded. Synthetic aperture radar (SAR) is more susceptible to be affected by the radio frequency interference (RFI) in the same frequency band when receiving echo signal. Pulse RFI (PRFI) is a common form of RFI and often has time-varying characteristics, which will deteriorate the SAR images quality and hinder image interpretation. To effectively suppress the PRFI, the serial number of the pulses in SAR raw data containing PRFI need to be screened out with high precision. A two-stage method for screening PRFI in SAR raw data alternating the use of time and frequency domains was proposed in this article. First, range-cell level difference screening is performed in the time domain and frequency domain, respectively, to initially screen the PRFI. Then, the preliminary screening results are accumulated along the range direction, and the accumulated results are classified using a clustering algorithm to perform pulse-level screening to obtain the serial number of the pulses containing PRFI. Compared with the traditional PRFI screening methods, the proposed approach boasts a remarkable ability to circumvent missed screening and false alarm when screening weak-energy PRFIs. It possesses exceptional sensitivity and accuracy, offering fresh perspectives and innovative solutions to the PRFI screening challenge. 
The effectiveness and superiority of the proposed method are verified by the simulation data and measured data experiments.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"4331-4346"},"PeriodicalIF":4.7,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10844320","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143105432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate prediction of sea surface temperature (SST), a crucial indicator of global climate and ecosystem changes, holds significant economic and social benefits. Deep learning has shown preliminary success in modeling the dynamic spatial-temporal dependencies within SST signals, yet precise SST prediction remains challenging because of inherent variabilities across multiple temporal and spatial scales, driven by distinct physical processes. In this paper, we propose a novel multi-scale spatio-temporal attention network, named MUSTAN, tailored for the SST prediction problem. MUSTAN achieves multi-scale fusion through a progressive scale expansion paradigm, in which sub-scale representations are iteratively merged with their counterpart scale units, enabling the propagation of fine-scale SST changes across broader scales. For each scale, MUSTAN introduces temporal attention to characterize dynamic SST patterns in different ocean regions, and spatial attention to capture the intricate interplay of SST evolution among these regions. Extensive experiments on datasets from the Bohai Sea, Yellow Sea, and South China Sea consistently validate the effectiveness and superiority of our design, which outperforms state-of-the-art methods on SST prediction tasks.
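The role of temporal attention can be shown with a single-head numpy toy (illustrative only; MUSTAN's actual layers, scales, and fusion paradigm are more elaborate): each time step re-weights the whole SST history by query-key similarity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention(seq):
    """Single-head self-attention over time steps (illustrative sketch).

    seq has shape (T, d): T time steps of a d-dimensional SST feature.
    Each step attends to the full history, so recurring dynamic patterns
    can dominate the representation regardless of where they occur.
    """
    d = seq.shape[-1]
    scores = seq @ seq.T / np.sqrt(d)      # (T, T) step-to-step similarity
    return softmax(scores, axis=-1) @ seq  # attention-weighted history

T, d = 6, 4
seq = np.arange(T * d, dtype=float).reshape(T, d) / (T * d)
out = temporal_attention(seq)
assert out.shape == (T, d)
```

Each output row is a convex combination of the input rows, which is what lets attention interpolate smoothly across the temporal context.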
{"title":"Multiscale Spatio-Temporal Attention Network for Sea Surface Temperature Prediction","authors":"Zhenxiang Bai;Zhengya Sun;Bojie Fan;An-An Liu;Zhiqiang Wei;Bo Yin","doi":"10.1109/JSTARS.2025.3531122","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3531122","url":null,"abstract":"Accurate prediction of sea surface temperature (SST), a crucial indicator of global climate and ecosystem changes, holds significant economic and social benefits. Deep learning has shown preliminary success in modeling the dynamic spatial-temporal dependencies within SST signals, yet it remains challenging to obtain precise SSTs due to the inherent variabilities across multiple temporal and spatial scales, driven by distinct physical processes. In this paper, we propose a novel multi-scale spatio-temporal attention network, named MUSTAN, tailored for the SST prediction problem. MUSTAN achieves multi-scale fusion through a progressive scale expansion paradigm, where sub-scale representations are iteratively merged with its counterpart scale units, enabling the propagation of fine-scale SST changes across broader scales. For each scale, MUSTAN introduces temporal attention to characterize dynamic SST patterns in different ocean regions, and spatial attention to capture intricate SST evolution interplay among these regions. 
Extensive experiments conducted on datasets from the Bohai Sea, Yellow Sea, and South China Sea consistently validate the effectiveness and superiority of our design, outperforming the state-of-the-art methods on SST prediction tasks.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"5866-5877"},"PeriodicalIF":4.7,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10844304","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143465762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-17
DOI: 10.1109/JSTARS.2025.3531658
Chengming Wang;Peng Duan;Jinjiang Li
Change detection (CD) has extensive applications and is a crucial method for identifying and localizing target changes. In recent years, CD methods based on convolutional neural networks (CNNs) and transformers have achieved significant success in detecting difference areas in bitemporal remote sensing images. However, CNNs still exhibit limitations in local feature extraction when confronted with pseudochanges caused by different object types at global scales. Although transformers can effectively detect true change regions thanks to their long-range dependencies, the shadows cast by buildings under varying lighting conditions can introduce localized noise in these areas. To address these challenges, we propose the dynamically focused progressive fusion network (DFPF-Net) to tackle global and local noise influences simultaneously. On one hand, we use a pyramid vision transformer (PVT) as a weight-shared siamese backbone for change detection, efficiently fusing the multilevel features extracted from the pyramid structure through a residual-based progressive enhanced fusion module (PEFM). On the other hand, we propose a dynamic change focus module, which employs attention mechanisms and edge detection algorithms to mitigate noise interference across varying ranges. Extensive experiments on four datasets demonstrate that DFPF-Net outperforms mainstream CD methods.
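The edge-detection cue mentioned above can be illustrated with a plain Sobel gradient in numpy (illustrative only — the function below is our stand-in, not the paper's dynamic change focus module): edges highlight object boundaries where shadow-induced pseudochanges tend to concentrate.

```python
import numpy as np

def sobel_edges(img):
    """Sobel gradient magnitude of a 2-D image (naive loop version).

    The kind of edge cue a change-detection module could combine with
    attention to suppress shadow noise along building boundaries.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T                              # vertical-gradient kernel
    pad = np.pad(img, 1, mode="edge")      # replicate borders
    gx = np.zeros_like(img, float)
    gy = np.zeros_like(img, float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            win = pad[i:i + 3, j:j + 3]
            gx[i, j] = (win * kx).sum()
            gy[i, j] = (win * ky).sum()
    return np.hypot(gx, gy)

img = np.zeros((6, 6)); img[:, 3:] = 1.0   # vertical step edge (toy image)
edges = sobel_edges(img)
assert edges[:, 2:4].max() > edges[:, 0].max()  # response peaks at the step
```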
{"title":"DFPF-Net: Dynamically Focused Progressive Fusion Network for Remote Sensing Change Detection","authors":"Chengming Wang;Peng Duan;Jinjiang Li","doi":"10.1109/JSTARS.2025.3531658","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3531658","url":null,"abstract":"Change detection (CD) has extensive applications and is a crucial method for identifying and localizing target changes. In recent years, various CD methods represented by convolutional neural network (CNN) and transformer have achieved significant success in effectively detecting difference areas in bitemporal remote sensing images. However, CNN still exhibit limitations in local feature extraction when confronted with pseudochanges caused by different object types across global scales. Although transformers can effectively detect true change regions due to their long-range dependencies, the shadows cast by buildings under varying lighting conditions can introduce localized noise in these areas. To address these challenges, we propose the dynamically focused progressive fusion network (DFPF-Net) to simultaneously tackle global and local noise influences. On one hand, we utilize a pyramid vision transformer (PVT) as a weight-shared siamese network to implement change detection, efficiently fusing multilevel features extracted from the pyramid structure through a residual based progressive enhanced fusion module (PEFM). On the other hand, we propose the dynamic change focus module, which employs attention mechanisms and edge detection algorithms to mitigate noise interference across varying ranges. 
Extensive experiments on four datasets demonstrate that DFPF-Net outperforms mainstream CD methods.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"5905-5918"},"PeriodicalIF":4.7,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10845177","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143465676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-17
DOI: 10.1109/JSTARS.2025.3530152
Changzhi Yang;Kebiao Mao;Jiancheng Shi;Zhonghua Guo;Sayed M. Bateni
Current research often improves the accuracy of global navigation satellite system-reflectometry soil moisture (SM) inversion by incorporating auxiliary data, which limits its potential for practical application. To reduce the reliance on auxiliary data, this article presents a cyclone global navigation satellite system (CYGNSS) SM inversion method based on a time-constrained and spatially explicit artificial intelligence (TCSE-AI) model. The method first segments the data into multiple subsets through time constraints, confining irrelevant factors to a relatively stable state and endowing the data with temporal attributes. It then incorporates raster-data spatial information, integrating the data's latent spatiotemporal distribution characteristics into the SM inversion model. Finally, it constructs SM inversion models using machine learning methods. The experimental results indicate that TCSE-AI SM inversion models built on the XGBoost and random forest architectures achieved favorable results. Their monthly SM inversion results for 2022 were compared with soil moisture active passive (SMAP) products, with Pearson's correlation coefficients (R) all greater than 0.91 and root-mean-square errors (RMSEs) less than 0.05 cm³/cm³. This study then used the XGBoost model for validation against in situ data and conducted an interannual SM cross-inversion experiment. From January to June 2022, the R between the SM inversion results in the study area and in situ SM was 0.788, with an RMSE of 0.063 cm³/cm³. The interannual cross-inversion results, except for cases of missing data over multiple days, indicate that the TCSE-AI model generally achieved accurate SM estimates. Compared with SMAP SM, the R values were all greater than 0.8, with a maximum RMSE of 0.072 cm³/cm³, and the estimates showed satisfactory consistency with the in situ data.
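The two validation metrics quoted above are standard definitions; they can be computed as follows (the soil-moisture values below are toy numbers, not the study's data).

```python
import numpy as np

def r_and_rmse(pred, ref):
    """Pearson's R and RMSE, the metrics used to compare SM retrievals
    against SMAP products and in situ measurements."""
    r = np.corrcoef(pred, ref)[0, 1]
    rmse = np.sqrt(np.mean((pred - ref) ** 2))
    return r, rmse

pred = np.array([0.10, 0.15, 0.22, 0.30, 0.28])  # cm^3/cm^3, hypothetical
ref  = np.array([0.12, 0.14, 0.20, 0.33, 0.27])
r, rmse = r_and_rmse(pred, ref)
assert r > 0.9 and rmse < 0.05
```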
{"title":"A Time-Constrained and Spatially Explicit AI Model for Soil Moisture Inversion Using CYGNSS Data","authors":"Changzhi Yang;Kebiao Mao;Jiancheng Shi;Zhonghua Guo;Sayed M. Bateni","doi":"10.1109/JSTARS.2025.3530152","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3530152","url":null,"abstract":"Current research often improves the accuracy of global navigation satellite system-reflectometry soil moisture (SM) inversion by incorporating auxiliary data, which somewhat limits its potential for practical application. To reduce the reliance on auxiliary data, this article presents a cyclone global navigation satellite system SM inversion method based on the time-constrained and spatially explicit artificial intelligence (TCSE-AI) model. The method initially segments data into multiple subsets through time constraints, thus limiting irrelevant factors to a relatively stable state and endowing the data with temporal attributes. Then, it incorporates raster data spatial information, integrating the potential spatiotemporal distribution characteristics of the data into the SM inversion model. Finally, it constructs SM inversion models using machine learning methods. The experimental results indicate that the TCSE-AI SM inversion model based on the XGBoost and random forest model architectures achieved favorable results. Their monthly SM inversion results for 2022 were compared with the soil moisture active passive (SMAP) products, with Pearson's correlation coefficients (<italic>R</i>) all greater than 0.91 and root-mean-square errors (RMSEs) less than 0.05 cm<sup>3</sup>/cm<sup>3</sup>. Subsequently, this study used the XGBoost method as an example for validation with in situ data and conducted an interannual SM cross-inversion experiment. From January to June 2022, the <italic>R</i> between SM inversion results in the study area and in situ SM was 0.788, with an RMSE of 0.063 cm<sup>3</sup>/cm<sup>3</sup>. 
The interannual cross-inversion experimental results, except for cases of missing data over multiple days, indicate that the TCSE-AI model generally achieved the accurate estimates of SM. Compared with SMAP SM, the <italic>R</i> was all greater than 0.8, with a maximum RMSE of 0.072 cm<sup>3</sup>/cm<sup>3</sup>, and they showed satisfactory consistency with the in situ data.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"5100-5119"},"PeriodicalIF":4.7,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10845082","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143422805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Updating and digitizing cadastral maps remains a major challenge in land administration, demanding significant financial and human resources. This study presents a fully automated AI-based system to address this issue, focusing on the extraction and digitization of agricultural cadastral maps from photogrammetric images. The proposed method leverages the Segment Anything Model for high-accuracy segmentation, achieving a notable intersection-over-union score of 92% and significantly outperforming traditional approaches. In addition, the system reduces processing time by 40% and eliminates the need for manual intervention, enabling scalable, efficient digitization. These improvements are critical for better land-use planning, resource allocation, and sustainable land management. The model, implemented with open-source Python libraries, integrates three stages: image preprocessing, AI-based segmentation, and postprocessing. By automating these processes, the system not only accelerates map production but also reduces the environmental impacts associated with traditional mapping techniques. The approach also improves the accuracy of agricultural boundary delineation, benefiting land dispute resolution and optimized agricultural practices. This research contributes to the modernization of land administration systems by providing an accessible, scalable solution for surveyors and policymakers, bridging the gap between cutting-edge artificial intelligence and practical applications in geospatial data management. The findings underscore the importance of automating cadastral mapping for both economic efficiency and environmental sustainability.
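The 92% figure is an intersection-over-union (IoU) score, which for binary parcel masks is computed as follows (toy masks of ours, not the study's data):

```python
import numpy as np

def iou(mask_pred, mask_true):
    """Intersection over union between two binary segmentation masks."""
    inter = np.logical_and(mask_pred, mask_true).sum()
    union = np.logical_or(mask_pred, mask_true).sum()
    return inter / union if union else 1.0

a = np.zeros((8, 8), bool); a[2:6, 2:6] = True   # predicted parcel
b = np.zeros((8, 8), bool); b[3:7, 3:7] = True   # reference parcel
print(iou(a, b))   # 9 / 23 ≈ 0.391
```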
{"title":"Super-Resolution AI-Based Approach for Extracting Agricultural Cadastral Maps: Form and Content Validation","authors":"Alireza Vafaeinejad;Nima Alimohammadi;Alireza Sharifi;Mohammad Mahdi Safari","doi":"10.1109/JSTARS.2025.3530714","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3530714","url":null,"abstract":"Updating and digitizing cadastral maps remains a major challenge in land administration, demanding significant financial and human resources. This study presents a fully automated AI-based system to address this issue, focusing on the extraction and digitization of agricultural cadastral maps using photogrammetric images. The proposed method leverages the segment anything model for high-accuracy segmentation, achieving a notable intersection over union score of 92%, significantly outperforming traditional approaches. In addition, the system reduces processing time by 40% and eliminates the need for manual intervention, enabling scalable, efficient digitization. These improvements are critical for better land-use planning, resource allocation, and sustainable land management practices. The model, implemented using open-source Python libraries, integrates three stages: image preprocessing, AI-based segmentation, and postprocessing. By automating these processes, the system not only accelerates map production but also reduces environmental impacts associated with traditional mapping techniques. The approach also enhances the accuracy of agricultural boundary delineation, offering benefits for land dispute resolution and optimized agricultural practices. This research contributes to the modernization of land administration systems by providing an accessible, scalable solution for surveyors and policymakers. It bridges the gap between cutting-edge artificial intelligence advancements and practical applications, addressing technical and operational challenges in geospatial data management. 
The findings underscore the importance of automating cadastral mapping for both economic efficiency and environmental sustainability.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"5204-5216"},"PeriodicalIF":4.7,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10843845","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143422918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-16
DOI: 10.1109/JSTARS.2025.3530525
Zhe Zhang;Yukuan Dong;Chunlin Li;Chengrun Wu;Qiushi Wang;Xiao Liu
Urbanization has intensified the surface urban heat island (SUHI) effect. This study uses local climate zones (LCZs) and urban built environment characteristics (UBECs) to explore the factors influencing land surface temperature (LST) and SUHI across various UBECs in Shenyang, China. Google Earth Engine was used to calculate LST, and an LCZ map of Shenyang was created to analyze seasonal differences in the SUHI. A correlation model was used to screen the UBECs, and a geographically and temporally weighted regression (GTWR) model was used to explain the spatial variations in the urban heat environment caused by built environments in different seasons. Compared with traditional methods, the GTWR model exhibits better goodness of fit and more effectively captures the spatiotemporal heterogeneity of variables. Compact and high-rise areas had stronger SUHI effects than other LCZs, whereas land-cover LCZs had a cool-island effect. The GTWR model helps planners identify the climatic impacts of each factor at different spatial locations within the study area, as well as variations across seasons. Vegetation-related factors had less impact in densely built areas, whereas the proportion of blue areas was more effective in alleviating extreme climates in high-density zones. The impact of building density on the heat island effect exhibited substantial spatiotemporal variation, particularly in compact, high-rise LCZs during both seasons. To address extreme winter–summer weather in cold regions, this study examined seasonal SUHIs and their interaction with UBECs, offering strategies and guidance for heat mitigation in urban design.
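GTWR's core idea is that observations closer to the regression point in both space and time receive larger weights. A common Gaussian space-time kernel is sketched below (parameter names and values are ours for illustration, not the study's calibration):

```python
import numpy as np

def gtwr_weight(d_space, d_time, h_s, h_t, lam=1.0):
    """Gaussian space-time kernel of the kind used in GTWR.

    d_space, d_time: distances from the regression point in space / time.
    h_s, h_t: spatial and temporal bandwidths; lam balances the two.
    """
    d2 = (d_space / h_s) ** 2 + lam * (d_time / h_t) ** 2
    return np.exp(-d2)

# A nearby, recent observation outweighs a distant, stale one (toy numbers).
w_near = gtwr_weight(100.0, 1.0, h_s=1000.0, h_t=30.0)
w_far  = gtwr_weight(5000.0, 90.0, h_s=1000.0, h_t=30.0)
assert w_near > w_far
```

These weights then enter an ordinary weighted-least-squares fit at each location and time, which is how GTWR captures spatiotemporal heterogeneity.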
{"title":"Impacts and Spatiotemporal Differentiation of Built Environments on the Urban Heat Island Effect in Cold-Climate Cities Based on Local Climate Zones","authors":"Zhe Zhang;Yukuan Dong;Chunlin Li;Chengrun Wu;Qiushi Wang;Xiao Liu","doi":"10.1109/JSTARS.2025.3530525","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3530525","url":null,"abstract":"Urbanization has increased the surface urban heat island (SUHI) effect. This study uses local climate zones (LCZ) and urban built environment characteristics (UBECs) to explore the factors influencing land surface temperature (LST) and SUHI in various UBECs in Shenyang, China. Google Earth Engine was used to calculate LST. An LCZ map of Shenyang was created to analyze seasonal differences in the SUHI. A correlation model was used to screen the UBEC, and a geographically and temporally weighted regression (GTWR) model was used to explain the spatial variations in the urban heat environment caused by built environments in different seasons. Compared to traditional methods, the GTWR model exhibits better goodness of fit and is more effective in capturing the spatiotemporal heterogeneity of variables. Compact and high-rise areas had higher SUHI effects compared to other LCZs, whereas land-cover LCZs had a cool-island effect. The GTWR model helps planners identify the climatic impacts of each factor in different spatial locations within the study area, as well as variations across seasons. Vegetation-related factors had less impact in densely-built areas, whereas the proportion of blue areas was more effective in alleviating extreme climates in high-density zones. The impact of building density on the heat island effect exhibited substantial spatiotemporal variation, particularly in compact, high-rise LCZs during both seasons. 
To address extreme winter–summer weather in cold regions, this study examined seasonal SUHIs and their interaction with UBECs, offering strategies and guidance for heat mitigation in urban design.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"5406-5422"},"PeriodicalIF":4.7,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10843833","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143446294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-16
DOI: 10.1109/JSTARS.2025.3530442
Siyuan Wang;Yinghua Wang;Xiaoting Zhang;Chen Zhang;Hongwei Liu
Nowadays, meta-learning is the mainstream method for solving few-shot synthetic aperture radar (SAR) target classification, devoted to learning a lot of empirical knowledge from the source domain to quickly recognize the novel classes after seeing only a few samples. However, obtaining the source domain with sufficiently labeled SAR images is difficult, leading to limited transferable empirical knowledge from the source to the target domain. Moreover, most existing methods only rely on visual images to learn the targets' feature representations, resulting in poor feature discriminability in few-shot situations. To tackle the above problems, we propose a novel visual-semantic cooperative network (VSC-Net) that involves visual and semantic dual classification to compensate for the inaccuracy of visual classification through semantic classification. First, we design textual semantic descriptions of SAR targets to exploit rich semantic information. Then, the designed textual semantic descriptions are encoded by the text encoder of the pretrained large vision language model to obtain class semantic embeddings of targets. In the visual classification stage, we develop the semantic-based visual prototype calibration module to project the class semantic embeddings to the visual space to calibrate the visual prototypes, improving the reliability of the prototypes computed from a few support samples. Besides, semantic consistency loss is proposed to constrain the accuracy of the class semantic embeddings projected to the visual space. During the semantic classification stage, the visual features of query samples are mapped into the semantic space, and their classes are predicted via searching for the nearest class semantic embeddings. Furthermore, we introduce a visual indication loss to modify the semantic classification using the calibrated visual prototypes. Ultimately, query samples' classes are decided by merging the visual and semantic classification results. 
Extensive experiments on the SAR target dataset validate the few-shot classification efficacy of VSC-Net.
{"title":"Visual-Semantic Cooperative Learning for Few-Shot SAR Target Classification","authors":"Siyuan Wang;Yinghua Wang;Xiaoting Zhang;Chen Zhang;Hongwei Liu","doi":"10.1109/JSTARS.2025.3530442","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3530442","url":null,"abstract":"Nowadays, meta-learning is the mainstream method for solving few-shot synthetic aperture radar (SAR) target classification, devoted to learning a lot of empirical knowledge from the source domain to quickly recognize the novel classes after seeing only a few samples. However, obtaining the source domain with sufficiently labeled SAR images is difficult, leading to limited transferable empirical knowledge from the source to the target domain. Moreover, most existing methods only rely on visual images to learn the targets' feature representations, resulting in poor feature discriminability in few-shot situations. To tackle the above problems, we propose a novel visual-semantic cooperative network (VSC-Net) that involves visual and semantic dual classification to compensate for the inaccuracy of visual classification through semantic classification. First, we design textual semantic descriptions of SAR targets to exploit rich semantic information. Then, the designed textual semantic descriptions are encoded by the text encoder of the pretrained large vision language model to obtain class semantic embeddings of targets. In the visual classification stage, we develop the semantic-based visual prototype calibration module to project the class semantic embeddings to the visual space to calibrate the visual prototypes, improving the reliability of the prototypes computed from a few support samples. Besides, semantic consistency loss is proposed to constrain the accuracy of the class semantic embeddings projected to the visual space. 
During the semantic classification stage, the visual features of query samples are mapped into the semantic space, and their classes are predicted via searching for the nearest class semantic embeddings. Furthermore, we introduce a visual indication loss to modify the semantic classification using the calibrated visual prototypes. Ultimately, query samples' classes are decided by merging the visual and semantic classification results. We conduct adequate experiments on the SAR target dataset, which validate VSC-Net's few-shot classification efficacy.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"6532-6550"},"PeriodicalIF":4.7,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10843851","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143553215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
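The prototype calibration described in the abstract (projecting class semantic embeddings into the visual space to correct prototypes estimated from only a few support samples) can be illustrated with a minimal sketch. The function names, the blending weight `alpha`, and the linear projection `proj` are assumptions for illustration; the paper's actual module and loss terms are not reproduced here.

```python
import numpy as np

def calibrate_prototypes(support_feats, class_sem, proj, alpha=0.5):
    """Blend few-shot visual prototypes with projected semantic embeddings.

    support_feats: dict class -> (n_shot, d_vis) array of support features
    class_sem:     dict class -> (d_sem,) class semantic embedding
    proj:          (d_vis, d_sem) projection from semantic to visual space
    alpha:         blending weight (hypothetical; not specified in the paper)
    """
    protos = {}
    for c, feats in support_feats.items():
        visual_proto = feats.mean(axis=0)   # mean of the few support features
        sem_proto = proj @ class_sem[c]     # semantic embedding mapped to visual space
        protos[c] = alpha * visual_proto + (1 - alpha) * sem_proto
    return protos

def classify(query, protos):
    # nearest calibrated prototype by Euclidean distance
    return min(protos, key=lambda c: np.linalg.norm(query - protos[c]))
```

With noisy one- or two-shot support sets, the semantic term pulls each prototype toward a class-level anchor, which is the reliability gain the module targets.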
Pub Date: 2025-01-16 | DOI: 10.1109/JSTARS.2025.3530141
Huan Liu;Xuefeng Ren;Yang Gan;Yongming Chen;Ping Lin
Aircraft target detection in remote sensing images faces numerous challenges, including target size variations, low resolution, and complex backgrounds. To address these challenges, an enhanced end-to-end aircraft detection framework (DIMD-DETR) is developed based on an improved metric space. First, a bilayer targeted prediction method is proposed to strengthen gradient interaction across decoder layers, enhancing detection accuracy and sensitivity in complex scenarios. The pyramid structure and self-attention mechanism of pyramid vision transformer V2 are incorporated to enable effective joint learning of global and local features, which significantly boosts performance on low-resolution targets. To further improve generalization, an aircraft-specific data augmentation strategy is devised, increasing the model's adaptability to variations in scale and appearance. In addition, a metric-space-based loss function is developed to optimize the collaborative effects of the modular architecture, enhancing detection in complex backgrounds and under varying target conditions. Finally, a dynamic learning rate scheduling strategy balances rapid convergence with global exploration, raising the model's robustness in challenging environments. Compared with current popular networks, the proposed model achieves superior detection performance with fewer parameters.
{"title":"DIMD-DETR: DDQ-DETR With Improved Metric Space for End-to-End Object Detector on Remote Sensing Aircrafts","authors":"Huan Liu;Xuefeng Ren;Yang Gan;Yongming Chen;Ping Lin","doi":"10.1109/JSTARS.2025.3530141","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3530141","url":null,"abstract":"Aircraft target detection in remote sensing images faces numerous challenges, including target size variations, low resolution, and complex backgrounds. To address these challenges, an enhanced end-to-end aircraft detection framework (DIMD-DETR) is developed based on an improved metric space. Initially, a bilayer targeted prediction method is proposed to strengthen gradient interaction across decoder layers, thereby enhancing detection accuracy and sensitivity in complex scenarios. The pyramid structure and self-attention mechanism from pyramid vision transformer V2 are incorporated to enable effective joint learning of both global and local features, which significantly boosts performance for low-resolution targets. To further enhance the model's generalization capabilities, an aircraft-specific data augmentation strategy is meticulously devised, thereby improving the model's adaptability to variations in scale and appearance. In addition, a metric-space-based loss function is developed to optimize the collaborative effects of the modular architecture, enhancing detection performance in complex backgrounds and under varying target conditions. Finally, a dynamic learning rate scheduling strategy is proposed to balance rapid convergence with global exploration, thereby elevating the model's robustness in challenging environments. 
Compared to current popular networks, our model demonstrated superior detection performance with fewer parameters.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"4498-4509"},"PeriodicalIF":4.7,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10843752","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143105957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
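The abstract's dynamic learning rate strategy is described only by its goal: rapid convergence early, with controlled global exploration later. The paper's exact schedule is not given; a common way to realize that trade-off is linear warmup followed by cosine decay, sketched below under that assumption (`warmup`, `base_lr`, and `min_lr` are illustrative parameters, not values from the paper).

```python
import math

def lr_schedule(step, total_steps, base_lr=1e-4, warmup=500, min_lr=1e-6):
    """Linear warmup then cosine decay.

    Warmup ramps the rate up for fast, stable early convergence;
    the cosine tail keeps a nonzero rate for late-stage exploration.
    """
    if step < warmup:
        # linear ramp from ~0 to base_lr over the warmup steps
        return base_lr * (step + 1) / warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    # cosine anneal from base_lr down to min_lr
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

The schedule reaches `base_lr` exactly at the end of warmup and decays smoothly to `min_lr` at `total_steps`.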
Pub Date: 2025-01-16 | DOI: 10.1109/JSTARS.2025.3530146
Daniyaer Sidekejiang;Panpan Zheng;Liejun Wang
With the rapid development of deep learning (DL) in recent years, numerous remote sensing image change detection (CD) networks have emerged. However, existing DL-based CD networks still face two significant issues: 1) the lack of adequate supervision during the encoding process and 2) the coupling of overall information with edge information. To overcome these challenges, we propose an edge detection-guided (ED-guided) strategy and a dual-flow strategy, and integrate them into a novel multilabel dual-flow network (MLDFNet). The ED-guided strategy supervises the encoding process with self-generated edge labels, enabling feature extraction with reduced noise and more precise semantics. Concurrently, the dual-flow strategy lets the network process overall and edge information separately, reducing interference between the two and allowing the network to observe both simultaneously. These strategies are effectively integrated through the proposed dual-flow convolution block. Extensive experiments demonstrate that MLDFNet significantly outperforms existing state-of-the-art methods, achieving F1 scores of 91.72%, 97.84%, and 94.85% on the LEVIR-CD, CDD, and BCDD datasets, respectively.
{"title":"MLDFNet: A Multilabel Dual-Flow Network for Change Detection in Bitemporal Remote Sensing Images","authors":"Daniyaer Sidekejiang;Panpan Zheng;Liejun Wang","doi":"10.1109/JSTARS.2025.3530146","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3530146","url":null,"abstract":"With the development of deep learning (DL) in recent years, numerous remote sensing image change detection (CD) networks have emerged. However, existing DL-based CD networks still face two significant issues: 1) the lack of adequate supervision during the encoding process; and 2) the coupling of overall information with edge information. To overcome these challenges, we propose the Edge detection-guided (ED-guided) strategy and the Dual-flow strategy, integrating them into a novel Multilabel Dual-flow Network (MLDFNet). The ED-guided strategy supervises the encoding process with our self-generated edge labels, enabling feature extraction with reduced noise and more precise semantics. Concurrently, the Dual-flow strategy allows the network to process overall and edge information separately, reducing the interference between the two and enabling the network to observe both simultaneously. These strategies are effectively integrated through our proposed Dual-flow Convolution Block. Extensive experiments demonstrate that MLDFNet significantly outperforms existing state-of-the-art methods, achieving outstanding F1 scores of 91.72%, 97.84%, and 94.85% on the LEVIR-CD, CDD, and BCDD datasets, respectively. 
These results validate its superior performance and potential value in real-world remote sensing applications.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"4867-4880"},"PeriodicalIF":4.7,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10843821","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143430371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
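The abstract mentions self-generated edge labels but not how they are derived. One standard construction, sketched here as an assumption rather than the paper's actual procedure, is a 3x3 morphological gradient (dilation minus erosion) of the binary change mask, which yields a one-pixel-wide boundary label for edge supervision.

```python
import numpy as np

def edge_labels(mask):
    """Derive an edge-supervision map from a binary change mask (H, W).

    Computes a 3x3 morphological gradient: pixels whose 3x3 neighborhood
    contains both changed (1) and unchanged (0) values are marked as edge.
    """
    h, w = mask.shape
    padded = np.pad(mask, 1, mode="edge")
    # the nine 3x3-shifted views of the mask, stacked along a new axis
    shifts = [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    stack = np.stack(shifts)
    # dilation (max) minus erosion (min) over the neighborhood
    return (stack.max(axis=0) - stack.min(axis=0)).astype(mask.dtype)
```

Such labels cost nothing to produce from the existing CD ground truth, which is what makes encoder-side edge supervision practical.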