Pub Date: 2025-01-16 | DOI: 10.1109/JSTARS.2025.3530141
Huan Liu;Xuefeng Ren;Yang Gan;Yongming Chen;Ping Lin
Aircraft target detection in remote sensing images faces numerous challenges, including target size variations, low resolution, and complex backgrounds. To address these challenges, an enhanced end-to-end aircraft detection framework (DIMD-DETR) is developed based on an improved metric space. Initially, a bilayer targeted prediction method is proposed to strengthen gradient interaction across decoder layers, thereby enhancing detection accuracy and sensitivity in complex scenarios. The pyramid structure and self-attention mechanism from pyramid vision transformer V2 are incorporated to enable effective joint learning of both global and local features, which significantly boosts performance for low-resolution targets. To further enhance the model's generalization capabilities, an aircraft-specific data augmentation strategy is meticulously devised, thereby improving the model's adaptability to variations in scale and appearance. In addition, a metric-space-based loss function is developed to optimize the collaborative effects of the modular architecture, enhancing detection performance in complex backgrounds and under varying target conditions. Finally, a dynamic learning rate scheduling strategy is proposed to balance rapid convergence with global exploration, thereby elevating the model's robustness in challenging environments. Compared to current popular networks, our model demonstrated superior detection performance with fewer parameters.
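The abstract does not spell out the dynamic learning rate schedule; as one illustrative way to balance rapid convergence against global exploration (a sketch, not the paper's published method — all names and hyperparameter values here are our own), a cosine schedule with warm restarts periodically resets the rate to its base value:

```python
import math

def cosine_warm_restart_lr(step, base_lr=1e-4, min_lr=1e-6,
                           cycle_len=1000, cycle_mult=2):
    """SGDR-style cosine learning rate with warm restarts.

    Each cosine decay phase drives rapid local convergence; the
    periodic restart back to base_lr encourages global exploration.
    All hyperparameter values here are illustrative.
    """
    # Find the cycle containing `step` (cycle lengths grow by cycle_mult).
    start, length = 0, cycle_len
    while step >= start + length:
        start += length
        length *= cycle_mult
    t = (step - start) / length  # progress within the cycle, in [0, 1)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))
```

At each restart (step 1000, then 3000, and so on with `cycle_mult=2`) the rate jumps back to `base_lr` before decaying again.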
Title: DIMD-DETR: DDQ-DETR With Improved Metric Space for End-to-End Object Detector on Remote Sensing Aircrafts
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 4498–4509
Pub Date: 2025-01-16 | DOI: 10.1109/JSTARS.2025.3530146
Daniyaer Sidekejiang;Panpan Zheng;Liejun Wang
With the development of deep learning (DL) in recent years, numerous remote sensing image change detection (CD) networks have emerged. However, existing DL-based CD networks still face two significant issues: 1) the lack of adequate supervision during the encoding process; and 2) the coupling of overall information with edge information. To overcome these challenges, we propose the Edge detection-guided (ED-guided) strategy and the Dual-flow strategy, integrating them into a novel Multilabel Dual-flow Network (MLDFNet). The ED-guided strategy supervises the encoding process with our self-generated edge labels, enabling feature extraction with reduced noise and more precise semantics. Concurrently, the Dual-flow strategy allows the network to process overall and edge information separately, reducing the interference between the two and enabling the network to observe both simultaneously. These strategies are effectively integrated through our proposed Dual-flow Convolution Block. Extensive experiments demonstrate that MLDFNet significantly outperforms existing state-of-the-art methods, achieving outstanding F1 scores of 91.72%, 97.84%, and 94.85% on the LEVIR-CD, CDD, and BCDD datasets, respectively. These results validate its superior performance and potential value in real-world remote sensing applications.
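The abstract does not give the recipe behind the self-generated edge labels; as a minimal sketch of one plausible scheme (function name and details are our own, not the paper's), a binary edge map can be derived from a binary change mask by flagging pixels whose 4-neighbourhood crosses the change boundary:

```python
import numpy as np

def edge_labels(change_mask: np.ndarray) -> np.ndarray:
    """Derive a binary edge label map from a binary change mask.

    A pixel is marked as an edge pixel when any of its 4-neighbours
    differs from it, so both sides of the change boundary are flagged.
    This is an illustrative scheme, not the paper's exact recipe.
    """
    m = change_mask.astype(bool)
    padded = np.pad(m, 1, mode="edge")  # replicate borders
    up, down = padded[:-2, 1:-1], padded[2:, 1:-1]
    left, right = padded[1:-1, :-2], padded[1:-1, 2:]
    differs = (m ^ up) | (m ^ down) | (m ^ left) | (m ^ right)
    return differs.astype(np.uint8)
```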
Title: MLDFNet: A Multilabel Dual-Flow Network for Change Detection in Bitemporal Remote Sensing Images
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 4867–4880
Pub Date: 2025-01-16 | DOI: 10.1109/JSTARS.2025.3528834
Jacob L. Strunk;Diogo N. Cosenza;Francisco Mauro;Hans-Erik Andersen;Sytze de Bruin;Timothy Bryant;Petteri Packalen
Different sizes and shapes of field plots relative to raster grid cells were found to negatively affect lidar-augmented forest inventory. This issue, called the “change of spatial support problem” (COSP), caused biases and reductions in estimation efficiency (precision per number of plots). For a ∼14 000 km² study area in Oregon State, USA, we examined three different plot shapes, both fixed-radius and cluster plots, alongside grid cell sizes ranging from 5 to 70 m. Effect size varied with the magnitude of spatial mismatch between plots and raster grid cells: there was up to 15% bias and a 98% reduction in estimation efficiency. Fortunately, no negative effects were observed for circular (plot) versus square (grid cell) regions of the same area (m²). This study contributes to the sparse body of literature on change of spatial support in the area-based approach to lidar forest inventory and provides methods to easily avoid and mitigate negative effects. The simplest approach to avoid bias, although not always practical or feasible, is to exactly match the area (m²) of circular field plots and raster grid cells. Use of metrics robust to spatial effects, such as median height and height ratios, can also reduce change-of-spatial-support effects. Finally, we demonstrate that attribution of plots directly from raster grid cells (the “raster-intersect” approach) is robust to change of spatial support and flexible in application, but sacrifices a small amount of predictive power (a glossary of technical terminology is provided in the appendix).
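The simplest mitigation named in the abstract — exactly matching the area of the circular plot to the square grid cell — reduces to one line of arithmetic (a sketch; the function name is ours):

```python
import math

def plot_radius_for_cell(cell_size_m: float) -> float:
    """Radius (m) of a circular field plot whose area equals that of a
    square raster grid cell with side length cell_size_m (m).

    pi * r^2 = cell_size^2  =>  r = cell_size / sqrt(pi)
    """
    return cell_size_m / math.sqrt(math.pi)
```

For a 30 m grid cell, for example, this gives a plot radius of about 16.9 m.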
Title: Mitigation of Spatial Effects on an Area-Based Lidar Forest Inventory (2024)
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 5287–5302
Pub Date: 2025-01-16 | DOI: 10.1109/JSTARS.2025.3530136
Lin-Yu Dai;Ming-Dian Li;Si-Wei Chen
Polarimetric synthetic aperture radar (PolSAR) can acquire full-polarization information, which is the solid foundation for interpreting and utilizing target scattering mechanisms. Meanwhile, PolSAR image resolution is usually lower than that of the synthetic aperture radar (SAR) image, which may limit its potential for target detection and recognition. Image super-resolution with a convolutional neural network is a promising way to address this issue. To make full use of both polarimetric and spatial information and further enhance super-resolution performance, this work proposes the polarimetric contexture convolutional network (PCCN) for PolSAR image super-resolution. The main contributions are threefold. First, a new PolSAR data representation, the polarimetric contexture matrix, is established, which fully encodes the cube of polarimetric and spatial information into a coded matrix. Then, a dual-branch architecture with a polarimetric and spatial feature extraction block is designed to extract polarimetric and spatial features separately. Finally, these intrinsic polarimetric and spatial features are effectively fused at both local and global levels for PolSAR image super-resolution. The proposed PCCN method is trained on one X-band polarimetric and interferometric synthetic aperture radar (PiSAR) dataset and evaluated on the same scene imaged from a different PiSAR direction, as well as on data from other sensors, including the C-band Radarsat-2 and the X-band COSMO-SkyMed, over various imaging scenes. Compared with state-of-the-art algorithms, experimental studies demonstrate and validate the effectiveness and superiority of the proposed method in both visual examination and quantitative metrics. The proposed method can provide better super-resolution PolSAR images from both polarimetric and spatial viewpoints.
Title: PCCN: Polarimetric Contexture Convolutional Network for PolSAR Image Super-Resolution
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 4664–4679
Pub Date: 2025-01-15 | DOI: 10.1109/JSTARS.2025.3530710
Victoria A. Walker;Michael H. Cosh;William A. White;Andreas Colliander;Victoria R. Kelly;Paul Siqueira
Data were collected across multiple forested domains during the Soil Moisture Active Passive Validation Experiment 2019–2022 to improve understanding of soil moisture retrievals under dense vegetation. Soil surface roughness was one of many soil and vegetation parameters sampled during intensive operations periods in the spring and summer of 2022 because of its importance to retrieval accuracy (rougher soils have a higher emissivity and reduced sensitivity to soil moisture compared to smooth soils with otherwise identical characteristics). A total of 410 valid pinboard transects were collected across 24 sites between the two temperate forest domains located in the northeastern United States. Two experimental methods (handheld lidar and an ultrasonic robot) were additionally tested at select sites. After removal of topographic slope, the forest floor was found to be relatively smooth, with average rms heights of 9 ± 1 mm in the central Massachusetts domain and 6 ± 1 mm in the Millbrook, New York domain. These correspond to estimates of the model roughness parameter, h, of 0.31 and 0.16, respectively, which are within the range of accepted lookup table values but smoother than suggested by recent studies retrieving h over forests.
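The rms-height-after-slope-removal computation can be sketched as a linear detrend of the pin heights (the pin spacing and function name are illustrative assumptions; the study's exact processing may differ):

```python
import numpy as np

def rms_height(pin_heights_mm, spacing_mm=20.0):
    """RMS surface height (mm) of a pinboard transect after removing
    topographic slope with a least-squares line fit.

    The pin spacing value is an assumption for illustration only.
    """
    h = np.asarray(pin_heights_mm, dtype=float)
    x = np.arange(h.size) * spacing_mm
    # Fit and subtract the linear trend (the local slope)...
    slope, intercept = np.polyfit(x, h, 1)
    residuals = h - (slope * x + intercept)
    # ...then the rms of the residuals is the surface roughness.
    return float(np.sqrt(np.mean(residuals ** 2)))
```

A perfectly planar sloped transect yields an rms height of zero; only deviations about the fitted slope count as roughness.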
Title: Soil Surface Roughness in Temperate Forest During SMAPVEX19-22
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 4640–4647
Feature extraction is crucial for hyperspectral image classification (HSIC), and transformer-based methods have demonstrated significant potential in this field due to their exceptional global modeling capabilities. However, existing transformer-based methods use patches of fixed size and shape as input, which, while leveraging information from neighboring similar pixels to some extent, may also introduce heterogeneous pixels from nonhomogeneous regions, leading to a decrease in classification accuracy. In addition, since the goal of HSIC is to classify the center pixel, the attention calculation in these methods may focus on pixels unrelated to the center pixel, further impacting the accuracy of the classification. To address these issues, a novel transformer framework called CenterFormer is proposed, which enhances the center pixel to fully leverage the rich spatial and spectral information. Specifically, a multigranularity feature extractor is designed to effectively capture the fine-grained and coarse-grained spatial–spectral features of hyperspectral images, mitigating performance degradation caused by heterogeneous pixels. Moreover, a transformer encoder with center spatial–spectral attention is introduced, which enhances the center pixel and models global spatial–spectral information to improve classification performance. Finally, an adaptive classifier balances the classification results from different granularity branches, further enhancing the performance of CenterFormer. Comparative experiments conducted on four challenging datasets validate the model's effectiveness. Experimental results show that our model achieves an improvement in overall accuracy of up to 2.83% compared to the current state-of-the-art methods.
Pub Date: 2025-01-15 | DOI: 10.1109/JSTARS.2025.3529985
Title: CenterFormer: A Center Spatial–Spectral Attention Transformer Network for Hyperspectral Image Classification
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 5523–5539
Pub Date: 2025-01-15 | DOI: 10.1109/JSTARS.2025.3529993
Lin He;Wenrui Liang;Antonio Plaza
In recent years, denoising diffusion probabilistic models (DDPMs) have received much attention due to their powerful ability to infer data distributions. However, most existing DDPM-based hyperspectral (HS) pansharpening methods rely excessively on local processing to perform recovery, which usually fails to reconcile global contextual semantics with local details in the data. To address this issue, we propose a two-level semantic-driven diffusion method for HS pansharpening. In our method, we first extract semantics at two levels: the low-level semantics both guide the extraction of conditional details and support further semantic extraction, while the high-level semantics relate to scene cognition. Features from both the low-level and high-level semantics are then conditionally injected into the denoising network to guide high-resolution HS recovery. Experiments on multiple datasets verify the effectiveness of our method.
Title: Two-Level Semantic-Driven Diffusion Based Hyperspectral Pansharpening
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 4213–4226
Pub Date: 2025-01-15 | DOI: 10.1109/JSTARS.2025.3530762
Junjie Luo;Zheng Yuan;Lingzi Xu;Wenhui Xu
The application of digital twin (DT) technology to studying public environmental perception and associated health benefits is emerging, yet most research has focused on static green spaces, providing limited insight into dynamic waterscapes. This study systematically evaluates the effects of waterfront and nonwaterfront environments on public physiological and psychological responses using a DT platform. A high-precision 3-D virtual replica of a suburban park was constructed using UAV oblique photogrammetry and handheld lidar scanning. Real-time environmental data were integrated into the DT using IoT devices, establishing a dynamic link between the digital and physical worlds. Participants underwent field tests in both environments, during which physiological indicators (e.g., heart rate and blood oxygen saturation) and psychological indicators (e.g., pleasure and relaxation) were measured. We found that waterfront environments outperformed nonwaterfront environments in terms of relaxation and vitality, while no significant differences were observed between the two environments in physiological indicators. In addition, ANCOVA and random forest analyses identified temperature and sunlight intensity as key environmental factors influencing heart rate and psychological well-being. The study reveals specific mechanisms through which different environmental characteristics affect public well-being and demonstrates the DT platform's capabilities in real-time environmental data collection and landscape quantification.
{"title":"Assessing the Impact of Waterfront Environments on Public Well-Being Through Digital Twin Technology","authors":"Junjie Luo;Zheng Yuan;Lingzi Xu;Wenhui Xu","doi":"10.1109/JSTARS.2025.3530762","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3530762","url":null,"abstract":"The application of digital twin (DT) technology in studying public environmental perception and associated health benefits is emerging, yet most research has focused on static green spaces, providing limited insights into dynamic waterscapes. This study aims to systematically evaluate the effects of waterfront and nonwaterfront environments on public physiological and psychological responses using a DT platform. A high-precision 3-D virtual replica of a suburban park was constructed using UAV oblique photogrammetry and handheld lidar scanning technologies. Real-time environmental data were integrated into the DT using IoT devices, establishing a dynamic link between the digital environment and physical worlds. Participants underwent field tests in both environments, measuring physiological indicators (e.g., heart rate and blood oxygen saturation) and psychological indicators (e.g., pleasure and relaxation). We found that waterfront environments outperformed nonwaterfront environments in terms of relaxation and vitality, while no significant differences were observed between the two environments regarding physiological indicators. In addition, ANCOVA and random forest analyses identified temperature and sunlight intensity as key environmental factors influencing heart rate and psychological well-being. The study reveals specific mechanisms through which different environmental characteristics impact public well-being and demonstrates the DT platform's capabilities in real-time environmental data collection and landscape quantification. 
These findings provide valuable insights for urban planners and public health policymakers in designing landscapes that enhance urban residents' health and well-being.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"4536-4553"},"PeriodicalIF":4.7,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10843315","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143106041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
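The real-time sensor-to-twin link described in the abstract above can be sketched minimally as follows; the class name, sensor naming, and rolling-window smoothing are illustrative assumptions, not details from the study:

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class TwinEnvironment:
    """Holds the latest environmental state of a digital twin.

    Keeps a short rolling window per sensor so the twin can expose
    both instantaneous values and smoothed readings."""
    window: int = 5
    _readings: dict = field(default_factory=dict)

    def ingest(self, sensor: str, value: float) -> None:
        """Append a new IoT reading, discarding values outside the window."""
        buf = self._readings.setdefault(sensor, [])
        buf.append(value)
        if len(buf) > self.window:
            buf.pop(0)

    def current(self, sensor: str) -> float:
        """Most recent reading for a sensor."""
        return self._readings[sensor][-1]

    def smoothed(self, sensor: str) -> float:
        """Rolling-window mean for a sensor."""
        return mean(self._readings[sensor])


# Example: stream temperature readings from an IoT device into the twin
twin = TwinEnvironment()
for t in [21.0, 21.4, 21.2]:
    twin.ingest("temperature_c", t)
print(twin.current("temperature_c"))
print(round(twin.smoothed("temperature_c"), 2))
```

In practice, the ingest step would be driven by an IoT message broker, but the pattern of a rolling per-sensor buffer feeding both raw and smoothed views is the core of keeping a twin synchronized with field conditions.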
Pub Date : 2025-01-14DOI: 10.1109/JSTARS.2025.3528192
Jacob Beck;Lukas Malte Kemeter;Konrad Dürrbeck;Mohamed Hesham Ibrahim Abdalla;Frauke Kreuter
High-quality annotations are a critical success factor for machine learning (ML) applications. To achieve this, we have traditionally relied on human annotators, navigating the challenges of limited budgets and varying task-specific expertise, costs, and availability. Since the emergence of large language models (LLMs), their popularity for generating automated annotations has grown, extending both the possibilities and the complexity of designing an efficient annotation strategy. Increasingly, computer vision capabilities have been integrated into general-purpose LLMs such as ChatGPT. This raises the question of how effectively LLMs can be used in satellite image annotation tasks and how they compare to traditional annotator types. This study presents a comprehensive investigation and comparison of various human and automated annotators for image classification. We evaluate the feasibility and economic competitiveness of using the ChatGPT4-V model for a complex land usage annotation task and compare it with alternative human annotators. A set of satellite images is annotated by a domain expert and 15 additional human and automated annotators, differing in expertise and costs. Our analyses examine the annotation quality loss between the expert and the other annotators. This comparison is conducted through, first, descriptive analyses; second, fitting linear probability models; and third, comparing F1-scores. Ultimately, we simulate annotation strategies where samples are split according to an automatically assigned certainty score. Routing low-certainty images to human annotators can cut total annotation costs by over 50% with minimal impact on label quality. We discuss implications regarding the economic competitiveness of annotation strategies, prompt engineering, and the task-specificity of expertise.
{"title":"Toward Integrating ChatGPT Into Satellite Image Annotation Workflows: A Comparison of Label Quality and Costs of Human and Automated Annotators","authors":"Jacob Beck;Lukas Malte Kemeter;Konrad Dürrbeck;Mohamed Hesham Ibrahim Abdalla;Frauke Kreuter","doi":"10.1109/JSTARS.2025.3528192","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3528192","url":null,"abstract":"High-quality annotations are a critical success factor for machine learning (ML) applications. To achieve this, we have traditionally relied on human annotators, navigating the challenges of limited budgets and the varying task-specific expertise, costs, and availability. Since the emergence of large language models (LLMs), their popularity for generating automated annotations has grown, extending possibilities and complexity of designing an efficient annotation strategy. Increasingly, computer vision capabilities have been integrated into general-purpose LLMs like ChatGPT. This raises the question of how effectively LLMs can be used in satellite image annotation tasks and how they compare to traditional annotator types. This study presents a comprehensive investigation and comparison of various human and automated annotators for image classification. We evaluate the feasibility and economic competitiveness of using the ChatGPT4-V model for a complex land usage annotation task and compare it with alternative human annotators. A set of satellite images is annotated by a domain expert and 15 additional human and automated annotators, differing in expertise and costs. Our analyses examine the annotation quality loss between the expert and other annotators. This comparison is conducted through, first, descriptive analyses, second, fitting linear probability models, and third, comparing F1-scores. Ultimately, we simulate annotation strategies where samples are split according to an automatically assigned certainty score. 
Routing low-certainty images to human annotators can cut total annotation costs by over 50% with minimal impact on label quality. We discuss implications regarding the economic competitiveness of annotation strategies, prompt engineering, and the task-specificity of expertise.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"4366-4381"},"PeriodicalIF":4.7,"publicationDate":"2025-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10841407","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143105438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
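The certainty-score routing strategy described above can be sketched as a small cost/quality simulation; the per-item costs, human accuracy, and function name are hypothetical placeholders, not figures from the paper:

```python
def simulate_routing(samples, threshold, llm_cost=0.01, human_cost=0.25,
                     human_accuracy=0.98):
    """Split samples by an LLM certainty score: confident items keep the
    LLM label, low-certainty items are routed to a human annotator.

    samples: list of (certainty, llm_label_correct) tuples.
    Returns (total_cost, expected_accuracy)."""
    total_cost, correct = 0.0, 0.0
    for certainty, llm_correct in samples:
        total_cost += llm_cost          # every item gets an LLM pass first
        if certainty < threshold:
            total_cost += human_cost    # low certainty: add human review
            correct += human_accuracy   # expected correctness of a human label
        else:
            correct += 1.0 if llm_correct else 0.0
    return total_cost, correct / len(samples)


# Toy data: high-certainty items are usually right, low-certainty ones are not
samples = [(0.9, True)] * 80 + [(0.3, False)] * 20
cost, acc = simulate_routing(samples, threshold=0.5)  # cost ~6.0, acc ~0.996
```

With this toy data, routing the 20 low-certainty items to humans costs about 6.0 units in total while pushing expected accuracy to roughly 99.6%, illustrating the cost/quality trade-off the authors simulate.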
The Ice, Cloud, and land Elevation Satellite-2 (ICESat-2) is widely used in mapping and monitoring changes in ice sheets and forest vegetation. However, the satellite receives all the photons returning at around 532 nm, including surface signal photons and atmospheric noise photons, which means that the acquisition of surface information and high-level products is limited by the large proportion of noise photons. Therefore, onboard filtering is required on the satellite to identify the position of the signal photons. In this article, we propose a simple and effective onboard filtering algorithm that does not require any prior terrain information. Based on 1 019 954 major frames of data, the processing time, data volume, and signal recognition accuracy were calculated, and the impacts of five influencing factors (time of day, land cover, solar elevation, surface slope, and beam strength) on the algorithm were evaluated. The results showed that the processing time was lower than that of existing algorithms, the average ratio of all the major frame ranges was 0.8234, and 98.69% of the areas that originally included the signal could also be identified. Subsequent evaluations found that, among the selected factors, solar elevation and surface slope have the greatest impact on the accuracy of the algorithm. The proposed no-prior-terrain onboard filtering algorithm represents an effective means of obtaining the telemetry range from ICESat-2 altimetry data, addressing the challenges of onboard storage and satellite-ground transmission.
{"title":"Retrieving Telemetry Range From ICESat-2 Data by a No-Prior-Terrain Onboard Filtering Algorithm","authors":"Yuan Sun;Huan Xie;Chunhui Wang;Qi Xu;Binbin Li;Changda Liu;Min Ji;Xiaohua Tong","doi":"10.1109/JSTARS.2025.3529744","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3529744","url":null,"abstract":"The Ice, Cloud, and land Elevation Satellite-2 (ICESat-2) is widely used in mapping and monitoring the changes of ice sheets and forest vegetation. However, the satellite receives all the photons returning at around 532 nm, including surface signal photons and atmospheric noise photons, which means that the acquisition of surface information and high-level products is limited by the large proportion of noise photons. Therefore, onboard filtering is required on the satellite to identify the position of the signal photons. In this article, we propose a simple and effective onboard filtering algorithm that does not require any prior terrain information. Based on 1 019 954 major frames of data, the processing time, data volume, and signal recognition accuracy were calculated, and the impacts of five influencing factors (time of day, land cover, solar elevation, surface slope, and beam strength) on the algorithm were evaluated. The results showed that the processing time was lower compared with existing algorithms, the average ratio of all the major frame ranges was 0.8234, and 98.69% of the areas that originally included the signal could also be identified. Subsequent evaluations found that, among the selected factors, solar elevation and surface slope have the greatest impact on the accuracy of the algorithm. 
The proposed no-prior-terrain onboard filtering algorithm represents an effective means for obtaining the telemetry range from ICESat-2 altimetry data, to address the challenges of onboard storage and satellite ground transmission.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"18 ","pages":"5120-5134"},"PeriodicalIF":4.7,"publicationDate":"2025-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10841948","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143422751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
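A histogram-based, no-prior-terrain filter of the kind described above can be sketched as follows; the bin size, background-noise estimate, and thresholding rule are illustrative assumptions rather than the paper's exact algorithm:

```python
def telemetry_range(photon_heights, bin_size=10.0, noise_factor=2.0):
    """Estimate the height range containing signal photons in one major
    frame, with no prior terrain knowledge: histogram photon heights and
    keep bins whose counts clearly exceed the background noise level.

    Returns (low, high) in the same units as the input, or None if no
    bin stands out from the background."""
    if not photon_heights:
        return None
    lo = min(photon_heights)
    bins = {}
    for h in photon_heights:
        idx = int((h - lo) // bin_size)
        bins[idx] = bins.get(idx, 0) + 1
    # Background noise estimate: the median bin count across the frame,
    # since noise photons are spread roughly uniformly in height
    counts = sorted(bins.values())
    noise = counts[len(counts) // 2]
    signal_bins = [b for b, c in bins.items() if c > noise_factor * noise]
    if not signal_bins:
        return None
    return (lo + min(signal_bins) * bin_size,
            lo + (max(signal_bins) + 1) * bin_size)


# Toy major frame: uniform background noise plus a dense surface return
noise = [i * 10 + 5 for i in range(50)]      # one noise photon per 10 m bin
signal = [100 + 0.5 * k for k in range(20)]  # dense cluster near 100-110 m
print(telemetry_range(noise + signal))       # (95.0, 115.0)
```

Keeping only the height bins that rise above the background, and downlinking that range instead of every photon, is the kind of data-volume reduction the onboard setting demands.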