S3LBI: Spectral–Spatial Segmentation-Based Local Bicubic Interpolation for Single Hyperspectral Image Super-Resolution
Pub Date: 2025-08-21 | DOI: 10.1109/LGRS.2025.3601230
Yubo Ma;Wei He;Siyu Cai;Qingke Zou
Single hyperspectral image (HSI) super-resolution (SR), which is limited by the lack of exterior information, has always been a challenging task. Substantial effort has been devoted to fully mining spectral information or adopting pretrained models to enhance spatial resolution. However, few SR approaches consider structural features from the perspective of multidimensional segmentation of the image. Therefore, a novel spectral–spatial segmentation-based local bicubic interpolation (S3LBI) is proposed to perform segmented, blockwise interpolation according to the characteristics of HSI. Specifically, the bands of an HSI are clustered into several spectral segments. Then, superpixel segmentation is carried out in each spectral segment. After that, bicubic interpolation is conducted separately on each spectral–spatial segment. Experiments demonstrate the superiority of S3LBI over the compared HSI SR approaches.
{"title":"S3LBI: Spectral–Spatial Segmentation-Based Local Bicubic Interpolation for Single Hyperspectral Image Super-Resolution","authors":"Yubo Ma;Wei He;Siyu Cai;Qingke Zou","doi":"10.1109/LGRS.2025.3601230","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3601230","url":null,"abstract":"Single hyperspectral image (HSI) super-resolution (SR), which is limited by the lack of exterior information, has always been a challenging task. A lot of effort has gone into fully mining spectral information or adopting pretrained models to enhance spatial resolution. However, few SR approaches take into account structural features from the perspective of multidimensional segmentation of the image. Therefore, a novel spectral–spatial segmentation-based local bicubic interpolation (S3LBI) is proposed to implement segmented and blocked interpolation according to the characteristics of HSI. Specifically, the bands of an HSI are clustered into several spectral segments. Then, super-pixel segmentation is carried out in each spectral segment. After that, the bicubic interpolations are separately conducted on different spectral–spatial segments. Experiments demonstrate the superiority of our S3LBI over the compared HSI SR approaches.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144914174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimodal-Guided Transformer Architecture for Remote Sensing Salient Object Detection
Pub Date: 2025-08-21 | DOI: 10.1109/LGRS.2025.3601083
Bei Cheng;Zao Liu;Huxiao Tang;Qingwang Wang;Wenhao Chen;Tao Chen;Tao Shen
The latest remote sensing image saliency detectors rely primarily on RGB information alone. However, the spatial and geometric information embedded in depth images is robust to variations in lighting and color, and integrating depth information with RGB images can enhance the spatial structure of objects. In light of this, we propose a remote sensing image saliency detection model that fuses RGB and depth information, named the multimodal-guided transformer architecture (MGTA). Specifically, we first introduce the strongly correlated complementary fusion (SCCF) module to explore cross-modal consistency and similarity, maintaining consistency across different modalities while uncovering multidimensional common information. In addition, the global–local context information interaction (GLCII) module is designed to extract global semantic information and local detail information, effectively utilizing contextual information while reducing the number of parameters. Finally, a cascaded feature-guided decoder (CFGD) is employed to gradually fuse hierarchical decoding features, effectively integrating multilevel data and accurately locating target positions. Extensive experiments demonstrate that our proposed model outperforms 14 state-of-the-art methods. The code and results of our method are available at https://github.com/Zackisliuzao/MGTANet.
{"title":"Multimodal-Guided Transformer Architecture for Remote Sensing Salient Object Detection","authors":"Bei Cheng;Zao Liu;Huxiao Tang;Qingwang Wang;Wenhao Chen;Tao Chen;Tao Shen","doi":"10.1109/LGRS.2025.3601083","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3601083","url":null,"abstract":"The latest remote sensing image saliency detectors primarily rely on RGB information alone. However, spatial and geometric information embedded in depth images is robust to variations in lighting and color. Integrating depth information with RGB images can enhance the spatial structure of objects. In light of this, we innovatively propose a remote sensing image saliency detection model that fuses RGB and depth information, named the multimodal-guided transformer architecture (MGTA). Specifically, we first introduce the strongly correlated complementary fusion (SCCF) module to explore cross-modal consistency and similarity, maintaining consistency across different modalities while uncovering multidimensional common information. In addition, the global–local context information interaction (GLCII) module is designed to extract global semantic information and local detail information, effectively utilizing contextual information while reducing the number of parameters. Finally, a cascaded feature-guided decoder (CFGD) is employed to gradually fuse hierarchical decoding features, effectively integrating multilevel data and accurately locating target positions. Extensive experiments demonstrate that our proposed model outperforms 14 state-of-the-art methods. The code and results of our method are available at <uri>https://github.com/Zackisliuzao/MGTANet</uri>","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144998172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Channel Characterization Based on 3-D TransUnet-CBAM With Multiloss Function
Pub Date: 2025-08-21 | DOI: 10.1109/LGRS.2025.3601200
Binpeng Yan;Jiaqi Zhao;Mutian Li;Rui Pan
The channel system is intimately linked to the formation of oil and gas reservoirs. In petroliferous basins, channel deposits frequently serve as both storage spaces and fluid conduits, so the accurate identification of channels in 3-D seismic data is critical for reservoir prediction. Traditional seismic attribute-based methods can outline channel boundaries, but noise and stratigraphic complexity introduce discontinuities that reduce accuracy and require extensive manual correction. Deep learning-based methods outperform conventional methods in terms of efficiency and precision. However, the similar seismic signatures of channels and continuous karst caves in seismic profiles can still mislead existing models. To address this challenge, we propose an improved variant of the 3-D TransUnet model for 3-D seismic data recognition. The model incorporates channel and spatial attention mechanisms into the skip connections of the TransUnet architecture, effectively enhancing its feature representation capability and recognition accuracy. In addition, a multiloss function is introduced to improve the delineation and continuity of the channel while increasing the model's robustness against nonchannel interference features. Experiments on synthetic and field seismic data confirm superior boundary delineation, continuity, and noise resistance compared with baseline methods.
{"title":"Channel Characterization Based on 3-D TransUnet-CBAM With Multiloss Function","authors":"Binpeng Yan;Jiaqi Zhao;Mutian Li;Rui Pan","doi":"10.1109/LGRS.2025.3601200","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3601200","url":null,"abstract":"The channel system is intimately linked to the formation of oil and gas reservoirs. In petroliferous basins, channel deposits frequently serve as both storage spaces and fluid conduits. Consequently, the accurate identification of channels in 3-D seismic data is, therefore, critical for reservoir prediction. Traditional seismic attribute-based methods can outline channel boundaries, but noise and stratigraphic complexity introduce discontinuities that reduce accuracy and require extensive manual correction. Deep learning-based methods outperform conventional methods in terms of efficiency and precision. However, the similar seismic signatures of channels and continuous karst caves in seismic profiles can still mislead the existing models. To address this challenge, we proposed an improved variant of the 3-D TransUnet model for 3-D seismic data recognition. The model incorporates channel and spatial attention mechanisms into the skip connections of the TransUnet architecture, effectively enhancing its feature representation capability and recognition accuracy. In addition, a multiloss function is introduced to improve the delineation and continuity of the channel while increasing the model’s robustness against nonchannel interference features. Experiments on synthetic and field seismic data confirm superior boundary delineation, continuity, and noise resistance compared with baseline methods.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144914218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Removal of the Feedback Loop in CCSDS 123.0-B-2 During Hardware Implementation
Pub Date: 2025-08-21 | DOI: 10.1109/LGRS.2025.3601194
Liang Jia;Qi Wang;Lei Zhang;Chengpeng Song;Peng Zhang
The Consultative Committee for Space Data Systems (CCSDS) proposed the CCSDS 123.0-B-2 standard for compressing large volumes of data acquired by multispectral and hyperspectral sensors. However, data dependencies in the CCSDS 123.0-B-2 predictor lead to feedback loops during the weight update process. This poses challenges for a fully pipelined hardware implementation of the predictor and severely limits the achievable data throughput. It is therefore critical to improve throughput while keeping the degradation in compression performance within an acceptable range. This work demonstrates that by appropriately reducing the frequency of weight updates, the data dependencies in the predictor can be mitigated, thus shortening the critical path in hardware implementation and eliminating feedback loops. Experimental results show that under the band interleaved by line (BIL) data format, the proposed method achieves a throughput of 348.4 MSamples/s using only 3995 look-up tables (LUTs).
{"title":"Removal of the Feedback Loop in CCSDS 123.0-B-2 During Hardware Implementation","authors":"Liang Jia;Qi Wang;Lei Zhang;Chengpeng Song;Peng Zhang","doi":"10.1109/LGRS.2025.3601194","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3601194","url":null,"abstract":"The Consultative Committee for Space Data Systems (CCSDS) proposed the CCSDS 123.0-B-2 standard for compressing large volumes of data acquired by multispectral and hyperspectral sensors. However, data dependencies in the CCSDS 123.0-B-2 predictor lead to feedback loops during the weight update process. This poses challenges for fully pipelined hardware implementation of the predictor and severely limits the achievable data throughput. Therefore, it is critical to improve throughput while keeping the degradation in compression performance within an acceptable range. This work demonstrates that by appropriately reducing the frequency of weight updates, the data dependencies in the predictor can be mitigated, thus shortening the critical path in hardware implementation and eliminating feedback loops. Experimental results show that under the band interleaved by line (BIL) data format, the proposed method achieves a throughput of 348.4 MSamples/s using only 3995 look-up tables (LUTs).","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144934415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Polynomial Fitting Emitter Localization Method Based on Multisubaperture Phase Stitching
Pub Date: 2025-08-21 | DOI: 10.1109/LGRS.2025.3601138
Jiayu Sun;Hao Huan;Ran Tao;Yue Wang
In passive localization, the synthetic aperture positioning (SAP) method enables high-precision positioning under low signal-to-noise ratio (SNR) conditions. However, higher order phase errors induced by platform self-localization errors degrade image focusing and reduce localization accuracy. In this letter, a polynomial fitting approach is applied to the unwrapped phase to eliminate higher order error components: optimal prewhitening filters are designed using autoregressive (AR) models, and the fit is solved by iteratively reweighted least squares (IRLS). In addition, a multiple-subaperture phase stitching method is proposed to mitigate the phase's susceptibility to noise interference and the accumulation of errors during phase unwrapping. The effectiveness of the proposed method is validated through both simulations and UAV experiments. Results demonstrate that meter-level localization accuracy can be achieved for the emitter target.
{"title":"Polynomial Fitting Emitter Localization Method Based on Multisubaperture Phase Stitching","authors":"Jiayu Sun;Hao Huan;Ran Tao;Yue Wang","doi":"10.1109/LGRS.2025.3601138","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3601138","url":null,"abstract":"In passive localization, the synthetic aperture positioning (SAP) method enables high-precision positioning under low signal-to-noise ratio (SNR) conditions. However, higher order phase errors induced by platform self-localization errors degrade image focusing and reduce localization accuracy. In this letter, a polynomial fitting approach based on designing optimal prewhitening filters using autoregressive (AR) models and employing iteratively reweighted least squares (IRLS) is applied to the unwrapped phase to eliminate higher order error components. In addition, a multiple subaperture phase stitching method is proposed to mitigate phase susceptibility to noise interference and error accumulation during phase unwrapping. The effectiveness of the proposed method is validated through both simulations and UAV experiments. Results demonstrate that meter-level localization accuracy can be achieved for the emitter target.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145021344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhanced Slum Mapping Through U-Net CNN and Multimodal Remote Sensing Data: A Case Study of Makassar City
Pub Date: 2025-08-21 | DOI: 10.1109/LGRS.2025.3601167
Yohanes Fridolin Hestrio;Eduard Thomas Prakoso;Kiki Winda Veronica;Ika Siwi Supriyani;Destri Yanti Hutapea;Siti Desty Wahyuningsih;Nico Cendiana;Steward Augusto;Krisna Malik Sukarno;Olivia Maftukhaturrizqoh;Rubini Jusuf;Orbita Roswintiarti;Wisnu Jatmiko
Urban slums present critical challenges for sustainable development, particularly in rapidly urbanizing cities like Makassar, Indonesia. This study develops an automated slum mapping approach that integrates high-resolution SPOT-6/7 satellite imagery (1.5-m spatial resolution) with multimodal geospatial data using a U-Net convolutional neural network. Our methodology combines spectral and textural features from satellite imagery with nighttime light emissions, infrastructure proximity analysis, land use classifications, and socioeconomic indicators. The integrated approach achieves an overall accuracy of 97.1%–98.3% across both datasets. However, slum-specific classification remains challenging, with producer's accuracy of 55.8%–59.1% and user's accuracy of 22.9%–35.7%, yielding F1-scores of 0.33–0.43 for slum detection. Despite these limitations, the approach demonstrates significant enhancements over traditional census-based methods through automated processing, improved spatial resolution (1.5 m versus administrative units), and increased temporal frequency (annual versus decadal updates). The framework provides actionable insights for urban planning and social assistance targeting while establishing a foundation for the iterative improvement of automated slum monitoring systems.
{"title":"Enhanced Slum Mapping Through U-Net CNN and Multimodal Remote Sensing Data: A Case Study of Makassar City","authors":"Yohanes Fridolin Hestrio;Eduard Thomas Prakoso;Kiki Winda Veronica;Ika Siwi Supriyani;Destri Yanti Hutapea;Siti Desty Wahyuningsih;Nico Cendiana;Steward Augusto;Krisna Malik Sukarno;Olivia Maftukhaturrizqoh;Rubini Jusuf;Orbita Roswintiarti;Wisnu Jatmiko","doi":"10.1109/LGRS.2025.3601167","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3601167","url":null,"abstract":"Urban slums present critical challenges for sustainable development, particularly in rapidly urbanizing cities like Makassar, Indonesia. This study develops an automated slum mapping approach that integrates high-resolution SPOT-6/7 satellite imagery (1.5-m spatial resolution) with multimodal geospatial data using a U-Net convolutional neural network. Our methodology combines spectral and textural features from satellite imagery with nighttime light emissions, infrastructure proximity analysis, land use classifications, and socioeconomic indicators. The integrated approach achieves an overall accuracy of 97.1%–98.3% across both the datasets. However, slum-specific classification remains challenging with producer’s accuracy of 55.8%–59.1% and user’s accuracy of 22.9%–35.7%, yielding F1-scores of 0.33–0.43 for slum detection. Despite these limitations, the approach demonstrates significant enhancements over traditional census-based methods through automated processing, improved spatial resolution (1.5 m versus administrative units), and increased temporal frequency (annual versus decadal updates). The framework provides actionable insights for urban planning and social assistance targeting while establishing a foundation for automated slum monitoring system iterative improvement.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144990019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SpADANet: A Spatially Aware Domain Adaptation Network for Hurricane Damage Assessment
Pub Date: 2025-08-21 | DOI: 10.1109/LGRS.2025.3601507
Pratyush V. Talreja;Surya S. Durbha
Hurricanes cause significant damage to communities, necessitating rapid and accurate damage assessment to support timely disaster response. However, image-based deep learning models for hurricane-induced damage assessment face substantial challenges due to domain shifts across different hurricane events, and the restricted availability of labeled data for each disaster further complicates this task. In this study, we propose a novel domain-adaptive deep learning framework that mitigates the domain gap while requiring minimal labeled samples from the target domain. Our approach integrates a self-supervised learning (SSL) pretext task to enhance feature robustness and leverages a novel bilateral local Moran’s I (BLMI) module to improve spatial feature aggregation for damage localization. We evaluate our method using aerial datasets from Hurricanes Harvey, Matthew, and Michael. The experimental results demonstrate that our model achieves more than 5% improvement in damage classification accuracy over baseline methods. These findings highlight the potential of our approach for scalable and efficient hurricane damage assessment in real-world disaster scenarios.
CSA-RSIC: Cross-Modal Semantic Alignment for Remote Sensing Image Captioning
Pub Date: 2025-08-21 | DOI: 10.1109/LGRS.2025.3601114
Kangda Cheng;Jinlong Liu;Rui Mao;Zhilu Wu;Erik Cambria
Remote sensing image captioning (RSIC) is an important task in environmental monitoring and disaster assessment. However, existing methods are constrained by redundant feature interference, insufficient multiscale feature integration, and cross-modal semantic gaps, leading to limited performance in scenarios requiring fine-grained descriptions and semantic integrity, such as disaster assessment and emergency response. In this letter, we propose a cross-modal semantic alignment model for RSIC (CSA-RSIC) that addresses these challenges with three innovations. First, we design an adaptive feature selection module (AFSM) that generates channel weights through dual pooling and dynamically weights the most informative features at each scale to improve caption accuracy. Second, we propose a cross-scale feature aggregation module (CFAM) that constructs a hierarchical feature pyramid by aligning multiscale resolutions and performs attention-guided fusion with enhanced weighting via the AFSM, ensuring the effective integration of fine-grained and global semantic information. Finally, a novel loss function that combines contrastive learning and consistency loss is proposed to enhance the semantic alignment between visual and textual features. Experiments on three datasets show that CSA-RSIC outperforms strong baselines, demonstrating its effectiveness in enhancing both semantic completeness and accuracy.
{"title":"CSA-RSIC: Cross-Modal Semantic Alignment for Remote Sensing Image Captioning","authors":"Kangda Cheng;Jinlong Liu;Rui Mao;Zhilu Wu;Erik Cambria","doi":"10.1109/LGRS.2025.3601114","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3601114","url":null,"abstract":"Remote sensing image captioning (RSIC) is an important task in environmental monitoring and disaster assessment. However, existing methods are constrained by redundant feature interference, insufficient multiscale feature integration, and cross-modal semantic gaps, leading to limited performance in scenarios requiring fine-grained descriptions and semantic integrity, such as disaster assessment and emergency response. In this letter, we propose a cross-modal semantic alignment model for RSIC (CSA-RSIC), addressing these challenges with three innovations. First, we designed an adaptive feature selection module (AFSM) that generates channel weights through dual pooling. The AFSM dynamically weights the most informative features at each scale to improve caption accuracy. Second, we propose a cross-scale feature aggregation module (CFAM) that constructs a hierarchical feature pyramid by aligning multiscale resolutions and performs attention-guided fusion with enhanced weighting via AFSM, ensuring the effective integration of fine-grained and global semantic information. Finally, a novel loss function that combines contrastive learning and consistency loss is proposed to enhance the semantic alignment between visual and textual features. Experiments on three datasets show the advancement of CSA-RSIC over strong baselines, indicating its effectiveness in enhancing both semantic completeness and accuracy.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144998080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sea-Undistort: A Dataset for Through-Water Image Restoration in High-Resolution Airborne Bathymetric Mapping
Pub Date: 2025-08-21 | DOI: 10.1109/LGRS.2025.3601239
Maximilian Kromer;Panagiotis Agrafiotis;Begüm Demir
Accurate image-based bathymetric mapping in shallow waters remains challenging due to the complex optical distortions, such as wave-induced patterns, scattering, and sunglint, introduced by the dynamic water surface, the water column properties, and solar illumination. In this work, we introduce Sea-Undistort, a comprehensive synthetic dataset of 1200 paired 512 × 512 through-water scenes rendered in Blender. Each pair comprises a distortion-free and a distorted view, featuring realistic water effects, such as sun glint, waves, and scattering over diverse seabeds. Accompanied by per-image metadata, such as camera parameters, sun position, and average depth, Sea-Undistort enables supervised training that is otherwise infeasible in real environments. We use Sea-Undistort to benchmark two state-of-the-art image restoration methods alongside an enhanced lightweight diffusion-based framework with an early fusion sun-glint mask. When applied to real aerial data, the enhanced diffusion model delivers more complete digital surface models (DSMs) of the seabed, especially in deeper areas, reduces bathymetric errors, suppresses glint and scattering, and crisply restores fine seabed details. Dataset, weights, and code are publicly available at https://www.magicbathy.eu/Sea-Undistort.html.
{"title":"Sea-Undistort: A Dataset for Through-Water Image Restoration in High-Resolution Airborne Bathymetric Mapping","authors":"Maximilian Kromer;Panagiotis Agrafiotis;Begüm Demir","doi":"10.1109/LGRS.2025.3601239","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3601239","url":null,"abstract":"Accurate image-based bathymetric mapping in shallow waters remains challenging due to the complex optical distortions, such as wave-induced patterns, scattering, and sunglint, introduced by the dynamic water surface, the water column properties, and solar illumination. In this work, we introduce Sea-Undistort, a comprehensive synthetic dataset of 1200 paired <inline-formula> <tex-math>$512times 512$ </tex-math></inline-formula> through-water scenes rendered in Blender. Each pair comprises a distortion-free and a distorted view, featuring realistic water effects, such as sun glint, waves, and scattering over diverse seabeds. Accompanied by per-image metadata, such as camera parameters, sun position, and average depth, Sea-Undistort enables supervised training that is otherwise infeasible in real environments. We use Sea-Undistort to benchmark two state-of-the-art image restoration methods alongside an enhanced lightweight diffusion-based framework with an early fusion sun-glint mask. When applied to real aerial data, the enhanced diffusion model delivers more complete digital surface models (DSMs) of the seabed, especially in deeper areas, reduces bathymetric errors, suppresses glint and scattering, and crisply restores fine seabed details. Dataset, weights, and code are publicly available at <uri>https://www.magicbathy.eu/Sea-Undistort.html</uri>.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11132387","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144914212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Land Surface Emissivity Retrieval From Landsat 9 Data in Combination With Land Cover Data and Spectral Library
Pub Date: 2025-08-21 | DOI: 10.1109/LGRS.2025.3601391
Qi Zhang;Yonggang Qian;Kun Li;Qiyao Li;Jianmin Wang;Dacheng Li
Land surface emissivity (LSE) is crucial for retrieving land surface temperature (LST) from Landsat 9 TIRS-2 thermal infrared (TIR) data. However, the officially provided single-band LSE product (band 10) is insufficient for the split-window (SW) algorithm, which requires dual-band emissivity inputs. This letter proposes a land cover and channel transformed-LSE (LCCT-LSE) method to estimate band 11 LSE and enable LST retrieval using the SW algorithm on Google Earth Engine. Cross-validation with MOD21 LSE products showed that the LCCT-LSE method achieved a mean absolute error (MAE) of 0.004 and a root mean square error (RMSE) of 0.005, outperforming the classification-based method, the NDVI threshold method, and the vegetation cover-based method (VCM). In situ validation showed that the SW-retrieved LST attains an MAE/RMSE of 1.27/2.13 K, with consistent accuracy across diverse land covers (water: 0.86 K, soil: 1.58 K, desert: 1.71 K, sand: 1.80 K, and vegetation: 0.87 K). A comparison with the official Landsat 9 LST product indicated that the bias of the retrieved LST is within 1 K for all land cover classes (cropland, forest, grassland, shrubland, water, barren, and impervious) in Beijing. These results demonstrate that the LCCT-LSE method is capable of estimating the LSE in Landsat 9 band 11 reliably and accurately. This study provides new insight for LST retrieval from Landsat 9 data.
{"title":"Land Surface Emissivity Retrieval From Landsat 9 Data in Combination With Land Cover Data and Spectral Library","authors":"Qi Zhang;Yonggang Qian;Kun Li;Qiyao Li;Jianmin Wang;Dacheng Li","doi":"10.1109/LGRS.2025.3601391","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3601391","url":null,"abstract":"Land surface emissivity (LSE) is crucial for retrieving land surface temperature (LST) from Landsat 9 TIRS-2 thermal infrared (TIR) data. However, the single-band LSE product (band 10) provided officially is insufficient for the split-window (SW) algorithm requiring dual-band emissivity inputs. This letter proposes a land cover and channel transformed-LSE (LCCT-LSE) method to estimate band 11 LSE and enables LST retrieval using the SW algorithm on Google Earth Engine. Cross-validation with MOD21 LSE products showed that the LCCT-LSE method achieved a mean absolute error (MAE) of 0.004 and a root mean square error (RMSE) of 0.005, outperforming the classification-based method, NDVI threshold method, and vegetation cover vegetation cover-based method (VCM) methods. In situ validation showed SW-retrieved LST attains MAE/RMSE of 1.27/2.13 K, with consistent accuracy across diverse land covers (water: 0.86 K, soil: 1.58 K, desert: 1.71 K, sand: 1.80 K, and vegetation: 0.87 K). A comparison with the official Landsat 9 LST product indicated that the bias of retrieved LST is within 1 K for all land cover classes (cropland, forest, grassland, shrubland, water, barren, and impervious) in Beijing. These results demonstrated that the LCCT-LSE method is capable of estimating the LSE in Landsat 9 band 11 with a reliable and accurate result. This study provides a new insight for LST retrieval from Landsat 9 data.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145011321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}