Pub Date: 2026-01-06 | DOI: 10.1016/j.jag.2025.105044
Shuang Liu, Zeyu Yu, Ying Liu, Zhong Zhang, Chaojun Shi, Baihua Xiao
Recently, Transformer structures have been applied to ground-based remote sensing data to detect clouds. However, vanilla self-attention, the core component of the Transformer, uses all tokens when modeling long-range dependencies, which draws in tokens with little semantic correlation. In this paper, we propose a novel Transformer network named Content-Aware Fusion Transformer (CAFTrans) for the ground-based remote sensing cloud detection task, which effectively selects relevant tokens according to the content of each cloud sample. To this end, we propose Content-Aware Selection Attention (CASA) in the Transformer encoder: we first construct a token-to-token similarity matrix with a learnable weight matrix and then dynamically select the tokens with high semantic relevance for each query according to this similarity matrix. Meanwhile, we introduce the Multi-Scale Fusion Mechanism (MSFM), which is built upon CASA and designed to capture long-range dependencies across multiple feature scales. To facilitate model training and evaluation, we present the Large-Scale Occluded Cloud Detection Dataset (LOCDD), the first ground-based remote sensing dataset to consider three categories: clouds, sky, and occlusions. Comprehensive quantitative results and qualitative visualizations on the LOCDD dataset demonstrate the robust performance of the proposed CAFTrans model. The source code is freely available at: https://github.com/shuangliutjnu/CAFTrans.
Title: CAFTrans: Content-Aware Fusion Transformer for ground-based remote sensing cloud detection
Journal: International Journal of Applied Earth Observation and Geoinformation (ITC Journal), Vol. 146, Article 105044
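As a rough illustration of the token-selection idea described in the abstract, one can score token pairs through a learnable weight matrix and attend only over each query's top-k matches. This is a hypothetical numpy sketch, not the authors' CASA implementation; all names are illustrative:

```python
import numpy as np

def content_aware_attention(X, W, k):
    """Hypothetical sketch of content-aware token selection (not the paper's
    CASA code): score token pairs through a learnable weight matrix W, keep
    the k most similar tokens per query, and attend over that subset only."""
    S = X @ W @ X.T                       # token-to-token similarity matrix
    out = np.zeros_like(X)
    for i in range(len(X)):
        idx = np.argsort(S[i])[-k:]       # top-k most relevant tokens for query i
        w = np.exp(S[i, idx] - S[i, idx].max())
        w /= w.sum()                      # softmax over the selected tokens only
        out[i] = w @ X[idx]
    return out
```

With k equal to the token count this reduces to ordinary softmax attention; smaller k excludes weakly correlated tokens, which is the selection effect the abstract describes.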
Pub Date: 2026-01-06 | DOI: 10.1016/j.jag.2025.105077
Shuang Zhu, Chuang Song, Chen Yu, Zhenhong Li, Yi Chen, Zhenjiang Liu, Kui Liu, Guangqian Zou, Jianbing Peng
Landslides represent a major type of earthquake-triggered geohazard, with their dynamic evolution influenced by the complex interaction of postseismic environmental factors. The 2017 Mw 6.4 Nyingchi earthquake triggered numerous landslides, yet research has largely focused on coseismic mapping, and there remains a lack of cataloging and characterization of postseismic landslides accelerated by the earthquake (i.e., Earthquake Accelerated Landslides, EALs). These landslides remain a serious hazard to surrounding communities, with impacts that persist well beyond the earthquake. This study employs Sentinel-1 time-series analysis to reveal the spatial distribution and postseismic deformation mechanisms of EALs. A total of 299 EALs were identified and cataloged, which are primarily distributed in high mountainous regions west of the epicenter and show a distinct linear clustering along active faults. By comparing near-field and far-field EALs, we found that far-field landslides may experience greater acceleration despite weaker ground shaking. Time-series analysis indicated a rise in EAL deformation velocity from 1.56 cm/yr (preseismic) to 5.08 cm/yr (postseismic), reflecting a strong accelerating effect of the earthquake. In addition to the deformation acceleration triggered by the mainshock, postseismic landslide deformation also exhibited seasonality correlated to precipitation and land surface temperature, as well as aftershock-induced variations. Exponential modeling further indicates a decreasing deformation trend of EALs, with stabilization occurring approximately 6.8 years after the mainshock. This research systematically examined landslide evolution after the Nyingchi earthquake, providing a basis for post-earthquake landslide hazard assessment, with particular emphasis on the increased far-field landslide hazards.
Title: Increased far-field landslide hazards due to postseismic acceleration by the 2017 Mw 6.4 Nyingchi earthquake
Journal: International Journal of Applied Earth Observation and Geoinformation (ITC Journal), Vol. 146, Article 105077
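The exponential relaxation the abstract invokes can be sketched as below. This is an assumed single-exponential form, not the authors' fitted model, and the time constant in the usage note is chosen purely for illustration:

```python
import math

def postseismic_velocity(t, v_pre, v_post, tau):
    """Velocity (cm/yr) t years after the mainshock: it jumps from the
    preseismic rate to the postseismic rate at t = 0, then relaxes back
    exponentially with time constant tau (years). Illustrative sketch only."""
    return v_pre + (v_post - v_pre) * math.exp(-t / tau)

def stabilization_time(tau, tol=0.1):
    """Years until the excess velocity decays to a fraction `tol` of its
    initial postseismic amplitude: solve exp(-t / tau) = tol for t."""
    return tau * math.log(1.0 / tol)
```

With the abstract's rates (1.56 and 5.08 cm/yr) and an assumed tau of about 2.95 yr, decay to 10% of the initial excess takes roughly 6.8 years, the stabilization horizon the abstract reports; both the tau value and the 10% threshold are assumptions here, not values from the paper.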
Pub Date: 2026-01-06 | DOI: 10.1016/j.jag.2025.105072
Yanfang Sun, Guosheng Wu, Yongze Song, Haiyang Liu, Lin Wang, Zehua Zhang, Jiao Hu
Spatial association and spatial interaction are fundamental to understanding geographical phenomena and regional development disparities, with broad applicability across disciplines. Existing spatial heterogeneity analyses face significant challenges in capturing pattern interactions and local variability. This study develops a local pattern interaction (LPI) model that integrates the local complexity patterns (geocomplexity) of spatial data, the interaction of patterns, and their locally varied power of determinants (PD). LPI is applied to assess the PD of local variables and pattern interactions on the spatial distribution of urbanization using statistical data, remote sensing imagery, and open geospatial data. The results show that LPI effectively identifies the local PD of interactions involving the geocomplexity patterns of urbanization-related explanatory variables. Model performance is evaluated by comparison with the optimal-parameters geographical detector (OPGD), a widely used spatial heterogeneity-based PD identification model. The validation shows that LPI provides advantages over OPGD by capturing spatially varying interaction patterns and local effects, whereas OPGD assesses only global interaction effects. For example, the LPI-derived PD for the interaction between total retail sales and the geocomplexity pattern of tertiary-industry output averages 0.610 [0.336, 0.783], indicating critical spatial variation in both local PD values and their significance, while the OPGD-derived PD yields a single global estimate of 0.537 (p < 0.01). This research advances theoretical understanding of spatial association and interaction, while providing an innovative analytical tool and decision-support capability for regional development, urban planning, and resource allocation.
Title: Local effects of pattern interactions in driving urbanization
Journal: International Journal of Applied Earth Observation and Geoinformation (ITC Journal), Vol. 146, Article 105072
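In the geographical detector framework that OPGD extends, the "power of determinant" is the q-statistic: one minus the ratio of within-stratum variance to total variance. A minimal sketch of that global statistic follows, for orientation only; it is not the LPI or OPGD implementation:

```python
import numpy as np

def q_statistic(y, strata):
    """Factor-detector q-statistic (power of determinant): 1 minus the ratio
    of pooled within-stratum variance to total variance. q = 1 means the
    strata fully explain y; q = 0 means they explain nothing."""
    y, strata = np.asarray(y, float), np.asarray(strata)
    n, var = len(y), y.var()              # population variance over all samples
    within = sum(len(y[strata == h]) * y[strata == h].var()
                 for h in np.unique(strata))
    return 1.0 - within / (n * var)
</n>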
Pub Date: 2026-01-06 | DOI: 10.1016/j.jag.2025.105038
Kai Deng, Xiangyun Hu, Yibing Xiong, Aokun Liang, Jiong Xu
Semantic image synthesis (SIS) is essential for remote sensing, particularly for generating high-quality training data when annotated datasets are scarce. While existing SIS methods have advanced pixel-wise mappings between semantic maps and images, they often overlook spatial priors, such as relationships between geographic objects (e.g., road-building adjacency), leading to structural inconsistencies in synthesized images. To address this, we propose the graph-prior diffusion transformer (GDiT) for semantically controllable remote sensing image synthesis. We first convert semantic maps into semantic graphs, encoding geographic objects as nodes with structured spatial interactions. To capture spatial and semantic relationships, we propose the Geometric-Semantic Aware Module (GSAM), which integrates CLIP-extracted semantics and geometric attributes for a more context-aware representation. Furthermore, we design the Graph Diffusion Transformer (GDiT) Block, which employs graph-to-image cross-attention to refine spatial structures, ensuring topological coherence and semantic fidelity in synthesized images. Experiments on land-cover and land-use datasets show that GDiT achieves competitive performance: by incorporating text prompts, it enables multilevel control at the global, object, and pixel levels, generating high-fidelity images with only 38.9% of the parameters of GeoSynth and significantly improving efficiency and accuracy. The code and dataset will be released at https://github.com/whudk/GDiT.
Title: GDiT: A graph-prior-guided diffusion transformer for semantic-controllable remote sensing image synthesis
Journal: International Journal of Applied Earth Observation and Geoinformation (ITC Journal), Vol. 146, Article 105038
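A toy sketch of the kind of spatial prior the paper encodes as a graph, assuming a 2-D integer label map and 4-connectivity; this is a hypothetical helper for intuition, not the GDiT code:

```python
import numpy as np

def class_adjacency(sem_map):
    """From a 2-D semantic label map, collect the set of class pairs that
    touch under 4-connectivity (e.g. road-building adjacency), by comparing
    each pixel with its right and bottom neighbors."""
    pairs = set()
    a = np.asarray(sem_map)
    for s1, s2 in ((a[:, :-1], a[:, 1:]), (a[:-1, :], a[1:, :])):
        diff = s1 != s2                   # neighboring pixels with different labels
        pairs |= {tuple(sorted(p))
                  for p in zip(s1[diff].tolist(), s2[diff].tolist())}
    return pairs
```

GDiT works at the level of object instances rather than raw class pairs, but the same neighbor-comparison idea underlies building an adjacency structure from a semantic map.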
Pub Date: 2026-01-06 | DOI: 10.1016/j.jag.2025.105046
Zhe Wang, Carmen Galaz García, Benjamin S. Halpern
Monitoring Earth’s dynamic systems benefits from imagery with high spatial and temporal resolution, a combination rarely available from a single sensor. This study addresses the novel challenge of cross-platform super-resolution (SR), aiming to enhance high-frequency satellite data with high-resolution aerial data. We employed a diffusion-based deep learning model, Super-Resolution via Iterative Refinement (SR3), to upscale 3-m Planet satellite imagery to the 60-cm resolution of National Agriculture Imagery Program (NAIP) images, a fivefold enhancement. The findings reveal that the “domain gap” between aerial and satellite data is a significant obstacle. While the model performed robustly on single-source data (peak signal-to-noise ratio, PSNR, of 27.28 dB), its cross-platform performance was substantially lower (best PSNR of 16.85 dB). Interestingly, models trained from scratch consistently outperformed fine-tuned models, suggesting negative transfer due to the differences between the aerial and satellite data sources. Furthermore, environmental metrics like the Normalized Difference Vegetation Index (NDVI) proved to be more effective performance indicators than standard computer vision metrics such as PSNR and the structural similarity index measure (SSIM), showing better preservation of the spectral information critical for vegetation analysis. This work demonstrates both the potential and the distinct challenges of using diffusion models for super-resolution across different remote sensing platforms. Our findings underscore the importance of tailored approaches in super-resolution and provide insights into leveraging state-of-the-art deep learning techniques for ecological monitoring and resource management.
Title: Cross-platform super-resolution: A diffusion model approach for enhancing satellite imagery with aerial data
Journal: International Journal of Applied Earth Observation and Geoinformation (ITC Journal), Vol. 146, Article 105046
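The two kinds of metric contrasted in the abstract have standard definitions, sketched below; this is not the study's evaluation code:

```python
import numpy as np

def psnr(ref, test, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE).
    The fidelity metric quoted in the abstract (27.28 dB vs 16.85 dB)."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(test, float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).
    The spectral metric the study found more informative than PSNR/SSIM."""
    nir, red = np.asarray(nir, float), np.asarray(red, float)
    return (nir - red) / (nir + red + eps)
```

The study's point is that a super-resolved image can score modestly on PSNR yet still preserve NDVI well, so band-ratio metrics can matter more than pixel-wise error for downstream vegetation analysis.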
Pub Date: 2026-01-05 | DOI: 10.1016/j.jag.2025.105082
Xiaoyu Liu, Wu Zhu, Yuxin Zhou, Jiewei Zhan, Zhanxi Wei, Jing Wu, Haixing Shang, Chao Du
Following the September 20, 2019 instability event, the Jungong landslide, a large-scale red-bed landslide in the upper Yellow River Basin, has exhibited persistent creep, necessitating systematic kinematic analysis to constrain its deformation drivers. In this context, we adopted a multidisciplinary approach integrating interferometric synthetic aperture radar (InSAR), unmanned aerial vehicle (UAV) surveys, optical satellite remote sensing, and high-density electrical resistivity tomography (HD-ERT) to investigate its kinematic evolution. First, interferometric processing of SAR imagery from the ALOS/PALSAR-1, ALOS/PALSAR-2, and Sentinel-1 systems (March 2007 to August 2024) revealed continuous creep, with a maximum deformation velocity of −129 mm/yr in the descending Sentinel-1 data. Based on morphological and deformation characteristics, the slope was divided into four secondary zones. Through digital image correlation (DIC) of optical images, horizontal displacements exceeding 20 m induced by the instability were detected at the front edge of Zone I. The three-dimensional (3D) deformation field was then inverted by combining multi-orbit InSAR observations with a topography-constrained model, revealing significant spatial heterogeneity in the displacement field. The maximum velocities in the eastward, northward, and vertical directions were −107, 53, and −71 mm/yr, respectively. Additionally, the internal structure along two profiles was imaged using HD-ERT. Finally, a method combining Singular Spectrum Analysis (SSA) and the wavelet transform was proposed to quantitatively analyze the temporal relationship between periodic displacements and rainfall. Different zones exhibited varying degrees of correlation with rainfall, with a time lag of approximately 45 days in Zone I. This multidisciplinary approach enhances our understanding of the kinematic behavior of the Jungong landslide, providing a critical reference for future hazard assessment.
Title: Multi-Platform geodetic synergy of InSAR, UAV, optical, and HD-ERT constrains kinematic evolution of the Jungong landslide (Yellow River Basin)
Journal: International Journal of Applied Earth Observation and Geoinformation (ITC Journal), Vol. 146, Article 105082
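The rainfall-displacement lag idea can be illustrated with plain cross-correlation; this is a simplified stand-in for, not a reproduction of, the SSA-plus-wavelet method the abstract describes:

```python
import numpy as np

def estimate_lag(rainfall, displacement):
    """Toy lag estimator: the offset, in samples, at which the
    cross-correlation of the two zero-mean series peaks. A positive result
    means displacement lags rainfall (e.g. ~45 days in Zone I, per the
    abstract, for the authors' more elaborate method)."""
    r = np.asarray(rainfall, float) - np.mean(rainfall)
    d = np.asarray(displacement, float) - np.mean(displacement)
    xcorr = np.correlate(d, r, mode='full')
    return int(np.argmax(xcorr) - (len(r) - 1))
```

SSA denoises and isolates the periodic component before the lag is measured, and the wavelet transform localizes it in time; raw cross-correlation skips both refinements.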
Pub Date: 2026-01-05 | DOI: 10.1016/j.jag.2025.105066
Jiawei Wu, Zijian Liu, Qixiang Tong, Zhipeng Zhu, Hui He, Xinghui Wu, Haihua Xing
Automatic extraction of coastlines from remote sensing images is of great practical importance for coastal risk assessment, ecological and environmental protection, and marine economic development. However, the highly dynamic nature of coastlines and the complex, diverse characteristics of land–sea boundaries make precise coastline extraction a challenging task. Although traditional deep learning methods have demonstrated good performance in this respect, they still suffer from high computational costs and fail to fully exploit multiscale features. In this paper, to address these problems, we propose a novel and efficient land–sea segmentation model for remote sensing imagery based on the classical U-shaped network structure, named DA-MiTUNet. On the one hand, we introduce the convolutional block attention module into the Mix Transformer (MiT), forming a dual-attention encoder in conjunction with an efficient self-attention mechanism. This integration ensures comprehensive extraction of global context and local information, enabling more precise delineation of complex land–sea boundary features. On the other hand, we propose an adaptive feature fusion module to further promote the effective fusion of features across hierarchical levels, achieving more refined land–sea boundary segmentation. Experimental results on the Gaofen-1 Hainan Coastline Dataset (GF–HNCD) and the Benchmark Sea–Land Dataset (BSD) demonstrate that the proposed DA-MiTUNet model outperforms comparative models in terms of both average F1 score and mean Intersection over Union, while achieving excellent segmentation results at relatively low computational complexity, reflecting the potential of our model for dynamic coastal monitoring during extreme sea-level events.
Title: DA-MiTUNet: A Mix Transformer with dual attention embedding in UNet for land–sea segmentation of remote sensing images
Journal: International Journal of Applied Earth Observation and Geoinformation (ITC Journal), Vol. 146, Article 105066
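The mean Intersection over Union score that the comparison reports is the standard metric sketched below, computed from a confusion matrix; this is not the authors' evaluation script:

```python
import numpy as np

def mean_iou(y_true, y_pred, n_classes):
    """Mean IoU over classes: per class, intersection (diagonal of the
    confusion matrix) divided by union (row sum + column sum - diagonal),
    then averaged. Classes absent from both maps contribute 0 here."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(np.ravel(y_true), np.ravel(y_pred)):
        cm[t, p] += 1
    inter = np.diag(cm).astype(float)
    union = cm.sum(0) + cm.sum(1) - np.diag(cm)
    return float(np.mean(inter / np.maximum(union, 1)))
```

For binary land–sea segmentation n_classes is 2, and mIoU penalizes boundary errors on both the land and sea sides, which is why it is preferred over plain pixel accuracy for coastline work.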
Pub Date : 2025-12-19DOI: 10.1016/j.jag.2025.105019
Xueli Guo , Zhichao Wen , Jianzhu Huai , Jian Kuang , Yizhou Xue , Xuanxuan Zhang , Bisheng Yang , You Li
In the era of embodied intelligence, wearable datasets for pedestrian navigation are essential. However, publicly available multisensor datasets tailored to such scenarios remain scarce. Traditional sensors such as RGB cameras and LiDAR often struggle to capture the fast and irregular dynamics of human motion. To address this gap, we introduce a large-scale pedestrian-wearable dataset primarily recorded using event cameras. The dataset includes an event camera, RGB cameras, LiDAR sensors, a tactical-grade IMU, and a GNSS receiver, covering a wide range of indoor and outdoor environments with diverse motion types and illumination conditions. High-precision ground truth is obtained using motion capture systems indoors and GNSS/IMU integration with bidirectional smoothing outdoors. The dataset is structured into 23 subsets categorized by motion dynamics and lighting, supporting the development and evaluation of robust localization and SLAM algorithms. Benchmarking with state-of-the-art frameworks reveals notable performance degradation under highly dynamic or low-light conditions, highlighting the dataset’s value for advancing pedestrian navigation and event-based perception. The dataset and tools are publicly available at: https://github.com/xueli-guo/WECMD.git.
{"title":"WECMD: A multisensor dataset for wearable event cameras in the age of embodied intelligence","authors":"Xueli Guo , Zhichao Wen , Jianzhu Huai , Jian Kuang , Yizhou Xue , Xuanxuan Zhang , Bisheng Yang , You Li","doi":"10.1016/j.jag.2025.105019","DOIUrl":"10.1016/j.jag.2025.105019","url":null,"abstract":"<div><div>In the era of embodied intelligence, wearable datasets for pedestrian navigation are essential. However, publicly available multisensor datasets tailored to such scenarios remain scarce. Traditional sensors such as RGB cameras and LiDAR often struggle to capture the fast and irregular dynamics of human motion. To address this gap, we introduce a large-scale pedestrian-wearable dataset primarily recorded using event cameras. The dataset includes an event camera, RGB cameras, LiDAR sensors, a tactical-grade IMU, and a GNSS receiver, covering a wide range of indoor and outdoor environments with diverse motion types and illumination conditions. High-precision ground truth is obtained using motion capture systems indoors and GNSS/IMU integration with bidirectional smoothing outdoors. The dataset is structured into 23 subsets categorized by motion dynamics and lighting, supporting the development and evaluation of robust localization and SLAM algorithms. Benchmarking with state-of-the-art frameworks reveals notable performance degradation under highly dynamic or low-light conditions, highlighting the dataset’s value for advancing pedestrian navigation and event-based perception. 
The dataset and tools are publicly available at: <span><span>https://github.com/xueli-guo/WECMD.git</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"146 ","pages":"Article 105019"},"PeriodicalIF":8.6,"publicationDate":"2025-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145791194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Green tides in the Yellow Sea are recurrent hazardous algal blooms whose optical monitoring is often hindered by cloud cover, while existing SAR approaches remain sensitive to sea state and look-alike targets and frequently require per-scene tuning or curated labels, limiting transferability and temporal consistency. To address this, we develop a label-free, fully automatic Sentinel-1 workflow that operationalizes three empirical signatures of green tides: spatial anomaly pre-location using local standard deviation on VV, edge-guided intensity separation via edge-balanced Otsu, and temporal anomaly screening using Z-scores with adaptive thresholding; an automatic object-level filter then removes non-algal marine targets. Implemented on Google Earth Engine at 10 m resolution, the pipeline delivers rapid processing without manual parameters. Validation shows high mapping accuracy: with a global stratified sample set, F1 equals 0.96 in 2019 and 0.97 in 2021; with a local edge validation set, F1 equals 0.94; in an all-pixel assessment over more than 2.5 billion pixels against a baseline, overall F1 equals 0.91. Qualitative comparisons likewise show fewer omissions of low-contrast filaments and fewer perforations within mats than GA-Net and UDNet. An ablation analysis clarifies the role of each module: the spatial pre-locator supplies contiguous candidates; the edge-guided intensity module sharpens boundaries and limits leakage; the temporal module suppresses transient bright seawater and consolidates persistent mats. Used jointly, the three constraints provide complementary information that yields the most stable cross-year performance and a favorable balance between precision and recall. Overall, the framework offers a simple, scalable, and operational pathway for fine-scale, all-weather monitoring and consistent multi-year assessment of green tides in the Yellow Sea.
{"title":"A fully automatic and label-free Sentinel-1 SAR framework for green-tide mapping","authors":"Pengfei Tang , Peijun Du , Shanchuan Guo , Lu Qie , Wei Zhang , Peng Zhang , Mathias Réus , Jocelyn Chanussot","doi":"10.1016/j.jag.2025.105036","DOIUrl":"10.1016/j.jag.2025.105036","url":null,"abstract":"<div><div>Green tides in the Yellow Sea are recurrent hazardous algal blooms whose optical monitoring is often hindered by cloud cover, while existing SAR approaches remain sensitive to sea state and look-alike targets and frequently require per-scene tuning or curated labels, limiting transferability and temporal consistency. To address this, we develop a label-free, fully automatic Sentinel-1 workflow that operationalizes three empirical signatures of green tides: spatial anomaly pre-location using local standard deviation on VV, edge-guided intensity separation via edge-balanced Otsu, and temporal anomaly screening using Z-scores with adaptive thresholding; an automatic object-level filter then removes non-algal marine targets. Implemented on Google Earth Engine at 10 m resolution, the pipeline delivers rapid processing without manual parameters. Validation shows high mapping accuracy: with a global stratified sample set, F1 equals 0.96 in 2019 and 0.97 in 2021; with a local edge validation set, F1 equals 0.94; in an all-pixel assessment over more than 2.5 billion pixels against a baseline, overall F1 equals 0.91. Qualitative comparisons likewise show fewer omissions of low-contrast filaments and fewer perforations within mats than GA-Net and UDNet. An ablation analysis clarifies the role of each module: the spatial pre-locator supplies contiguous candidates; the edge-guided intensity module sharpens boundaries and limits leakage; the temporal module suppresses transient bright seawater and consolidates persistent mats. 
Used jointly, the three constraints provide complementary information that yields the most stable cross-year performance and a favorable balance between precision and recall. Overall, the framework offers a simple, scalable, and operational pathway for fine-scale, all-weather monitoring and consistent multi-year assessment of green tides in the Yellow Sea.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"146 ","pages":"Article 105036"},"PeriodicalIF":8.6,"publicationDate":"2025-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145791192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
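Two of the workflow's building blocks — intensity separation via Otsu thresholding and temporal anomaly screening via per-pixel Z-scores — are standard operations. A minimal NumPy sketch of each, with synthetic data (this shows plain Otsu and a fixed Z-score, not the paper's edge-balanced variant or adaptive thresholds):

```python
import numpy as np

def otsu_threshold(values, nbins=256):
    """Classic Otsu: pick the threshold that maximizes between-class variance."""
    hist, edges = np.histogram(values, bins=nbins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)               # class-0 (below threshold) probability
    mu = np.cumsum(p * centers)     # cumulative mean
    mu_t = mu[-1]
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    sigma_b = np.zeros(nbins)
    sigma_b[valid] = (mu_t * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[np.argmax(sigma_b)]

def temporal_zscore(stack):
    """Per-pixel Z-score of the latest scene against its own time series."""
    mean = stack.mean(axis=0)
    std = stack.std(axis=0) + 1e-6
    return (stack[-1] - mean) / std

rng = np.random.default_rng(1)
# Bimodal toy backscatter: dark sea near 0.2, a smaller bright algae mode near 0.8.
vv = np.concatenate([rng.normal(0.2, 0.05, 5000), rng.normal(0.8, 0.05, 500)])
t = otsu_threshold(vv)

# A ten-scene stack with one transient bright pixel in the latest scene.
stack = rng.normal(0.2, 0.05, (10, 4, 4))
stack[-1, 0, 0] = 0.9
z = temporal_zscore(stack)
print(round(float(t), 2), round(float(z[0, 0]), 1))
```

The Otsu threshold lands between the two backscatter modes, and the transient bright pixel stands out with a high Z-score — the kind of signal the temporal module screens before object-level filtering.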
Pub Date : 2025-12-18DOI: 10.1016/j.jag.2025.105042
Kaiyang Qiu , Qingbin Zhang , Yingzhong Xie , Mingjie Shi , Chengyun Wang , Tong Dong , Jun Ma , Panxing He
Arid ecosystems occupy about two-fifths of the global land surface, and fluctuations in their productivity play a pivotal role in global carbon sequestration and ecosystem service provision. However, the global-scale effect of desertification expansion on the annual maximum photosynthetic peak has not yet been systematically quantified. In this study, 30-m high-resolution desert cover data (GLCLUC) and multi-source remote-sensing photosynthetic indicators were integrated, using a space-for-time substitution framework to establish a global desertification scenario classification system. We quantitatively evaluated the influence of diverse desert expansion and contraction scenarios on the ecosystem photosynthetic peak (GPPmax). Results indicate that the average GPPmax in high-intensity expansion regions (HIEs) is 8.23 g C m−2 8d−1, whereas medium- to low-intensity expansion regions (MIEs) show a value of 8.95 g C m−2 8d−1. By contrast, medium- to low-intensity contraction regions (MIRs) and high-intensity contraction regions (HIRs) demonstrate markedly higher GPPmax values of 10.64 g C m−2 8d−1 and 17.64 g C m−2 8d−1, respectively. Regarding the photosynthetic peak difference (ΔGPPmax), expansion scenarios (HIEs, MIEs) significantly decrease ecosystem photosynthetic potential, with average ΔGPPmax reductions of 1.19–3.95 g C m−2 8d−1 relative to contraction scenarios (HIRs, MIRs). The most pronounced losses occur in South America, North America, and Eurasia, with South America exhibiting reductions exceeding 6 g C m−2 8d−1. Additionally, ecosystems with initially higher photosynthetic potential experience greater GPPmax declines under intense desert expansion. This study provides the first global-scale evidence revealing how different desertification pathways modify ecosystem photosynthetic peaks and their regional disparities, offering critical scientific support for ecological restoration, carbon sequestration strategies, and land management across arid landscapes.
{"title":"Desertification expansion significantly suppresses photosynthetic peak capacity of arid ecosystems at the global scale","authors":"Kaiyang Qiu , Qingbin Zhang , Yingzhong Xie , Mingjie Shi , Chengyun Wang , Tong Dong , Jun Ma , Panxing He","doi":"10.1016/j.jag.2025.105042","DOIUrl":"10.1016/j.jag.2025.105042","url":null,"abstract":"<div><div>Arid ecosystems occupy about two-fifths of the global land surface, and fluctuations in their productivity play a pivotal role in global carbon sequestration and ecosystem service provision. However, the global-scale effect of desertification expansion on the annual maximum photosynthetic peak has not yet been systematically quantified. In this study, 30-m high-resolution desert cover data (GLCLUC) and multi-source remote-sensing photosynthetic indicators were integrated, using a space-for-time substitution framework to establish a global desertification scenario classification system. We quantitatively evaluated the influence of diverse desert expansion and contraction scenarios on the ecosystem photosynthetic peak (GPP<sub>max</sub>). Results indicate that the average GPP<sub>max</sub> in high-intensity expansion regions (HIEs) is 8.23 g C m<sup>−2</sup> 8d<sup>−1</sup>, whereas medium- to low-intensity expansion regions (MIEs) show a value of 8.95 g C m<sup>−2</sup> 8d<sup>−1</sup>. By contrast, medium- to low-intensity contraction regions (MIRs) and high-intensity contraction regions (HIRs) demonstrate markedly higher GPP<sub>max</sub> values of 10.64 g C m<sup>−2</sup> 8d<sup>−1</sup> and 17.64 g C m<sup>−2</sup> 8d<sup>−1</sup>, respectively. Regarding the photosynthetic peak difference (ΔGPP<sub>max</sub>), expansion scenarios (HIEs, MIEs) significantly decrease ecosystem photosynthetic potential, with average ΔGPP<sub>max</sub> reductions of 1.19–3.95 g C m<sup>−2</sup> 8d<sup>−1</sup> relative to contraction scenarios (HIRs, MIRs). 
The most pronounced losses occur in South America, North America, and Eurasia, with South America exhibiting reductions exceeding 6 g C m<sup>−2</sup> 8d<sup>−1</sup>. Additionally, ecosystems with initially higher photosynthetic potential experience greater GPP<sub>max</sub> declines under intense desert expansion. This study provides the first global-scale evidence revealing how different desertification pathways modify ecosystem photosynthetic peaks and their regional disparities, offering critical scientific support for ecological restoration, carbon sequestration strategies, and land management across arid landscapes.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"146 ","pages":"Article 105042"},"PeriodicalIF":8.6,"publicationDate":"2025-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145791193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
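GPPmax as used above is the annual maximum over the 8-day GPP composites, and ΔGPPmax contrasts scenario means. A minimal NumPy sketch with synthetic seasonal curves (the scenario split and the 2.0 offset are illustrative values, not the study's data):

```python
import numpy as np

def gpp_max(series):
    """Annual photosynthetic peak: maximum over the 46 eight-day composites
    of one year (units: g C m^-2 8d^-1)."""
    return np.max(series, axis=-1)

rng = np.random.default_rng(2)
n_pixels, n_composites = 100, 46
# Synthetic seasonal GPP: a sine-shaped growing season plus observation noise.
doy = np.linspace(0, np.pi, n_composites)
gpp = 8.0 * np.sin(doy)[None, :] + rng.normal(0, 0.3, (n_pixels, n_composites))
peaks = gpp_max(gpp)

# Scenario contrast: mean peak in "contraction" vs "expansion" pixels
# (labels and the +2.0 offset are made up for illustration).
expansion = peaks[:50]
contraction = peaks[50:] + 2.0
delta_gppmax = contraction.mean() - expansion.mean()
print(peaks.shape, round(float(delta_gppmax), 1))
```

The same reduction applies per pixel and per year on real 8-day composites; the scenario classes (HIEs, MIEs, MIRs, HIRs) then group pixels before averaging the peaks.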