Deep learning reveals hotspots of global oceanic oxygen changes from 2003 to 2020
Pub Date: 2025-02-01 | DOI: 10.1016/j.jag.2025.104363
Dongliang Ma, Fang Zhao, Likai Zhu, Xiaofei Li, Jine Wei, Xi Chen, Lijun Hou, Ye Li, Min Liu
The decrease in global oceanic dissolved oxygen (DO) has exerted a profound impact on marine ecosystems and biogeochemical processes. However, our understanding of DO distribution and its global change patterns remains hindered by sparse measurements and coarse-resolution simulations. Here we present Oxyformer, a deep learning method that accurately learns DO-related information and estimates high-resolution global DO concentration. The results derived by Oxyformer indicate an accelerated decline in global oceanic DO content of approximately 1045 ± 665 Tmol decade⁻¹ from 2003 to 2020. The observed trends vary considerably across regions and depths, with new hotspots of recent DO change including the Equatorial Indian Ocean, the South Pacific Ocean, the North Atlantic Ocean, and the western coast of California. This modeling approach provides a powerful tool to track changes in global DO content and to improve understanding of its influence on ocean ecosystems and biogeochemical processes.
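As a hedged illustration of how a decadal trend of this kind can be computed from a globally integrated DO time series (this is not the Oxyformer pipeline, and the monthly series below is synthetic), an ordinary least-squares fit in NumPy converts monthly values into a rate per decade:

```python
import numpy as np

# Hypothetical monthly series of globally integrated DO content (Tmol), 2003-2020.
months = np.arange(2003.0, 2021.0, 1.0 / 12.0)          # decimal years
rng = np.random.default_rng(0)
do_content = 227_000 - 104.5 * (months - 2003.0) + rng.normal(0, 80, months.size)

# Ordinary least-squares trend in Tmol per year, scaled to Tmol per decade.
slope_per_year, intercept = np.polyfit(months, do_content, deg=1)
trend_per_decade = slope_per_year * 10.0

# Crude 1-sigma uncertainty from the regression residuals (a full study would
# instead propagate mapping and sampling errors through the trend estimate).
residuals = do_content - (slope_per_year * months + intercept)
se_slope = np.sqrt(np.sum(residuals**2) / (months.size - 2) /
                   np.sum((months - months.mean())**2))
print(f"trend: {trend_per_decade:.0f} ± {se_slope * 10:.0f} Tmol per decade")
```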
{"title":"Deep learning reveals hotspots of global oceanic oxygen changes from 2003 to 2020","authors":"Dongliang Ma , Fang Zhao , Likai Zhu , Xiaofei Li , Jine Wei , Xi Chen , Lijun Hou , Ye Li , Min Liu","doi":"10.1016/j.jag.2025.104363","DOIUrl":"10.1016/j.jag.2025.104363","url":null,"abstract":"<div><div>The decrease in global oceanic dissolved oxygen (DO) has exerted a profound impact on marine ecosystems and biogeochemical processes. However, our comprehension of DO distribution and its global change patterns remains hindered by sparse measurements and coarse-resolution simulations. Here we presented Oxyformer, a deep learning method that accurately learns DO-related information and estimates high-resolution global DO concentration. The results derived by Oxyformer demonstrate an accelerated decline in global oceanic DO content, estimated at approximately 1045 ± 665 Tmol decade<sup>−1</sup> from 2003 to 2020. The observed trends exhibit considerable variability across different regions and depths, with some new hotspots of recent DO change including the Equatorial Indian Ocean, the South Pacific Ocean, the North Atlantic Ocean, and the Western Coast of California. The unprecedented modeling approach provides a powerful tool to track changes in global DO contents and to facilitate the understanding of their influences on ocean ecosystems and biogeochemical processes.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"136 ","pages":"Article 104363"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142990324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper introduces IceEB, an ensemble-based method designed to automate the mapping of river ice types from radar imagery. Its goal is to merge the outputs of three classifiers (IceMAP-R, RIACT, and IceBC) through ensemble estimation, producing a high-performing, fully automated river ice-type map that is applicable under all meteorological conditions. We first develop a meta-classifier and a confidence estimation index, then validate the method against ground-truth datasets, and finally compare the performance of IceEB with that of the original classifiers. The anticipated outcome was a map with superior results compared to the individual classifiers. Validation and comparison of IceEB employed six RADARSAT-2 HH-HV C-band images selected from historical datasets of Quebec and Alberta rivers (Canada). IceEB integrates RADARSAT-2 satellite imagery, a digital elevation model, and a river mask, which undergo preprocessing before the three initial classifiers are run. The meta-classifier then performs ensemble-based classification, yielding a legend comprising water, sheet ice, and rubble ice. This approach facilitates broad participation in validation data collection, differentiation between ice covers and ice jams, and minimization of assumptions regarding ice formation. We conclude that IceEB successfully combines existing radar remote sensing ice-classification models to create accurate river ice-type maps. IceEB’s ensemble-based approach outperforms the individual classifiers, achieving overall accuracy >91 % for each class. Shortcomings of the original classifiers are effectively offset through their parallel use, resulting in marked improvements in automation and generalizability across diverse Canadian meteorological conditions.
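A minimal sketch of the ensemble idea, assuming three per-pixel class maps (0 = water, 1 = sheet ice, 2 = rubble ice) produced by hypothetical base classifiers; IceEB trains a meta-classifier and a dedicated confidence index, so the majority vote and agreement ratio below are simplified stand-ins:

```python
import numpy as np

WATER, SHEET_ICE, RUBBLE_ICE = 0, 1, 2
N_CLASSES = 3

def ensemble_ice_map(maps):
    """Fuse per-pixel class maps from several base classifiers.

    maps: array of shape (n_classifiers, rows, cols) with integer class labels.
    Returns the majority-vote class map and a per-pixel confidence index
    (fraction of classifiers agreeing with the winning class).
    """
    maps = np.asarray(maps)
    n_classifiers = maps.shape[0]
    # Per-pixel vote counts for each class.
    votes = np.stack([(maps == c).sum(axis=0) for c in range(N_CLASSES)])
    fused = votes.argmax(axis=0)
    confidence = votes.max(axis=0) / n_classifiers
    return fused, confidence

# Toy 2x3 scene classified by three hypothetical base classifiers.
icemap_r = np.array([[0, 1, 1], [2, 2, 1]])
riact    = np.array([[0, 1, 2], [2, 2, 1]])
icebc    = np.array([[0, 1, 1], [2, 1, 1]])
fused, conf = ensemble_ice_map([icemap_r, riact, icebc])
print(fused)   # majority class per pixel
print(conf)    # 1.0 where all three agree, ~0.67 where two agree
```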
{"title":"IceEB: An ensemble-based method to map river ice type from radar images","authors":"Plante Lévesque Valérie, Chokmani Karem, Gauthier Yves, Bernier Monique","doi":"10.1016/j.jag.2024.104317","DOIUrl":"10.1016/j.jag.2024.104317","url":null,"abstract":"<div><div>This paper introduces IceEB, i.e., an innovative ensemble-based method that is designed to automate mapping of river ice types using radar imagery. Its goal is the merger of outcomes from three classifiers (IceMAP-R, RIACT, and IceBC) through ensemble-estimation, resulting in a highly performant and fully automated river ice-type map, which is applicable under all meteorological conditions. The first step of our research is the development of a <em>meta</em>-classifier and a confidence estimation index, then we validate our method using ground-truth datasets and finally compare the performance between IceEB and the original classifiers. The anticipated outcome was a map exhibiting superior results compared to individual classifiers. Validation and comparison of IceEB employed six RADARSAT-2 HH-HV C-band images that were selected from historical datasets of Quebec and Alberta rivers (Canada). IceEB integrates RADARSAT-2 satellite imagery, a digital elevation model, and a river mask, undergoing preprocessing tasks before activating the three initial classifiers. The <em>meta</em>-classifier then performs ensemble-based classification, yielding a legend comprised of water, sheet ice and rubble ice. This approach facilitates broad participation in validation data collection, differentiation between ice covers and ice jams, and minimization of assumptions regarding ice formation. We conclude that IceEB successfully combines existing radar remote sensing ice- classification models to create accurate river ice-type maps. IceEB’s ensemble-based approach outperforms individual classifiers, achieving overall accuracy >91 % for each class. Shortcomings of the original classifiers are effectively offset through parallel use, resulting in marked improvements in automation and generalizability across diverse Canadian meteorological conditions.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"136 ","pages":"Article 104317"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143049659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Assessment and validation of Meteosat SEVIRI fire radiative power (FRP) retrievals over Kruger National Park
Pub Date: 2025-02-01 | DOI: 10.1016/j.jag.2025.104375
Gareth Roberts, Martin J. Wooster, Tercia Strydom
Satellite burned area, active fire, and fire radiative power (FRP) data are key to quantifying fire activity and constitute one of the 54 essential climate variables (ECVs), so validating these data is important to ensure their consistency. This study investigates some of the factors that influence FRP retrieval, using Meteosat Spinning Enhanced Visible and InfraRed Imager (SEVIRI) data. Analysis of the influence of a fire’s location within a SEVIRI pixel on FRP was carried out using fire simulations, which indicate that FRP varies by up to 14 % at nadir for a single sensor and by up to 55 % when intercomparing simulated FRP from different SEVIRI sensors. Intercomparison between actual MET-11 and MET-08 FRP data on a per-pixel basis reveals a high degree of scatter (81.9 MW), strong correlation (R = 0.72), low bias (∼1 MW), and an average percentage difference of 15.7 %. Variability is reduced when the data are aggregated to fire ‘clusters’, which improves the correlation (R = 0.96) and reduces the average percentage difference (4.2 %). Validation of MET-08 and MET-11 FRP retrievals against FRP from helicopter-mounted longwave infrared (LWIR) and midwave infrared (MWIR) thermal cameras was carried out over five prescribed burns. The results reveal good agreement between the SEVIRI and thermal camera FRP, although the SEVIRI FRP is typically overestimated relative to that from the LWIR camera. This study illustrates some of the challenges of validating satellite FRP, which should be accounted for when defining uncertainty thresholds for product requirements and when developing FRP validation protocols.
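The agreement statistics quoted above (bias, scatter, correlation, percentage difference) can be reproduced for any matched pair of FRP samples with a few lines of NumPy; the arrays below are synthetic placeholders rather than SEVIRI retrievals, and the fixed-size 'clusters' only imitate the aggregation step:

```python
import numpy as np

def frp_comparison_stats(frp_a, frp_b):
    """Basic agreement statistics between two matched FRP samples (MW)."""
    frp_a, frp_b = np.asarray(frp_a, float), np.asarray(frp_b, float)
    diff = frp_a - frp_b
    bias = diff.mean()
    scatter = diff.std(ddof=1)              # standard deviation of the differences
    r = np.corrcoef(frp_a, frp_b)[0, 1]     # Pearson correlation
    # Mean absolute difference expressed as a percentage of the pair mean.
    pct_diff = np.mean(np.abs(diff) / ((frp_a + frp_b) / 2)) * 100.0
    return bias, scatter, r, pct_diff

rng = np.random.default_rng(1)
met11 = rng.gamma(shape=2.0, scale=150.0, size=500)      # synthetic fire-pixel FRP
met08 = met11 + rng.normal(0.0, 60.0, size=500)          # a second, noisier retrieval

print("per-pixel:", frp_comparison_stats(met11, met08))

# Aggregating pixels into fire 'clusters' (here: fixed groups of 10 pixels)
# reduces the random per-pixel scatter, as described in the study.
met11_cl = met11.reshape(50, 10).sum(axis=1)
met08_cl = met08.reshape(50, 10).sum(axis=1)
print("clusters: ", frp_comparison_stats(met11_cl, met08_cl))
```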
{"title":"Assessment and validation of Meteosat SEVIRI fire radiative power (FRP) retrievals over Kruger National Park","authors":"Gareth Roberts , Martin. J. Wooster , Tercia Strydom","doi":"10.1016/j.jag.2025.104375","DOIUrl":"10.1016/j.jag.2025.104375","url":null,"abstract":"<div><div>Satellite burned area, active fire and fire radiative power (FRP), are key to quantifying fire activity and are one of 54 essential climate variables (ECV) and it is important to validate these data to ensure their consistency. This study investigates some of the factors that influence FRP retrieval and uses Meteosat Spinning Enhanced Visible and InfraRed Imager (SEVIRI) data to do so. Analysis of the influence of a fire’s location within a SEVIRI pixel on FRP was carried out using fire simulations which indicate that FRP varies by up to 14 % at nadir for a single sensor and by up to 55 % when intercomparing simulated FRP from different SEVIRI sensors. Intercomparison between actual MET-11 and MET-08 FRP data on a per-pixel basis reveals a high degree of scatter (81.9 MW), strong correlation (R = 0.72), low bias (∼1 MW) and an average percentage difference of 15.7 %. Variability is reduced when aggregated to fire ‘clusters’ which improves the correlation (R = 0.96) and reduces the average percentage difference (4.2 %). Validation of MET-08 and MET-11 FRP retrievals using FRP from helicopter mounted longwave infrared (LWIR) and midwave infrared (MWIR) thermal cameras is carried out over five prescribed burns. The results reveal good agreement between the SEVIRI and thermal camera FRP although the SEVIRI FRP is typically overestimated compared to that from the LWIR camera. This study illustrates some of the challenges validating satellite FRP which should be accounted for when defining uncertainty thresholds for product requirements and in developing FRP validation protocols.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"136 ","pages":"Article 104375"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143050027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Synergistic mapping of urban tree canopy height using ICESat-2 data and GF-2 imagery
Pub Date: 2025-02-01 | DOI: 10.1016/j.jag.2024.104348
Xiaodi Xu, Ya Zhang, Peng Fu, Chaoya Dang, Bowen Cai, Qingwei Zhuang, Zhenfeng Shao, Deren Li, Qing Ding
Mapping urban top of canopy height (UTCH) is essential for quantifying urban vegetation carbon storage and developing effective vegetation management strategies. However, the scarcity and uneven distribution of urban measurement samples pose significant challenges to accurately estimating UTCH at large scales in complex urban environments. To address this issue, this study used ICESat-2 photon spot height data as reference samples, in conjunction with high-resolution GF-2 remote sensing data, to estimate UTCH. To achieve UTCH mapping at a resolution of 4 m, a synergistic model integrating GF-2 data with ICESat-2 grid-based canopy heights was constructed using the Random Forest technique. The model’s performance was evaluated using 111 urban tree canopy height samples collected across different urban areas. The experimental results demonstrated a moderate correlation between estimated and measured canopy heights, with a coefficient of determination (R) = 0.53, root mean square error (RMSE) = 2.9 m, and mean absolute error (MAE) = 2.04 m. Texture information, the red band, and MNDVI are key indicators for determining UTCH, with contribution percentages of 25.29 %, 13.7 %, and 25.75 %, respectively. The UTCH model created by fusing remote sensing spectral data with satellite-based lidar data can therefore estimate UTCH accurately and offers a practical solution for predicting UTCH at regional and even global scales.
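A minimal scikit-learn sketch of the fusion idea, assuming a table of GF-2-derived predictors (texture, red band, MNDVI) with ICESat-2 canopy heights as training targets; the data and hyperparameters are illustrative, not those of the study:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000
# Hypothetical per-pixel predictors derived from GF-2 imagery.
X = np.column_stack([
    rng.normal(size=n),          # texture metric
    rng.normal(size=n),          # red band reflectance
    rng.normal(size=n),          # MNDVI
])
# Hypothetical ICESat-2 canopy heights (m) acting as training labels.
y = 8 + 2.5 * X[:, 0] + 1.0 * X[:, 1] + 2.5 * X[:, 2] + rng.normal(0, 2, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)

rmse = mean_squared_error(y_te, pred) ** 0.5
mae = mean_absolute_error(y_te, pred)
print(f"RMSE = {rmse:.2f} m, MAE = {mae:.2f} m")
print("relative importances (texture, red, MNDVI):", model.feature_importances_)
```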
{"title":"Synergistic mapping of urban tree canopy height using ICESat-2 data and GF-2 imagery","authors":"Xiaodi Xu , Ya Zhang , Peng Fu , Chaoya Dang , Bowen Cai , Qingwei Zhuang , Zhenfeng Shao , Deren Li , Qing Ding","doi":"10.1016/j.jag.2024.104348","DOIUrl":"10.1016/j.jag.2024.104348","url":null,"abstract":"<div><div>Mapping urban top of canopy height (UTCH) is essential for quantifying urban vegetation carbon storage and developing effective vegetation management strategies. However, the scarcity and uneven distribution of urban measurement samples pose significant challenges to accurately estimating UTCH on a large scale in complex urban environments. To address this issue, this study utilized ICESat-2 photon spot height data as reference samples, in conjunction with high-resolution GF-2 remote sensing data, to estimate UTCH. To achieve UTCH mapping at a resolution of 4 m, a synergistic model integrating data from the GF-2 and ICESat-2 grid-based canopy height was constructed using the Random Forest technique. The model’s performance was evaluated using 111 urban tree canopy height samples collected across different urban areas. The experimental results demonstrated a moderate correlation between estimated and actual canopy heights, with a coefficient of determination (<em>R</em>) = 0.53, root mean square error (<em>RMSE</em>) = 2.9 m, and mean absolute error (<em>MAE</em>) = 2.04 m. Texture information, the red band, and MNDVI are key indicators for determining UTCH, with contribution percentages of 25.29 %, 13.7 %, and 25.75 %, respectively. As a result, the UTCH model created by fusing remote sensing spectral data with satellite-based lidar data can accurately estimate UTCH and offer a practical solution for predicting UTCH on a regional or even global scale.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"136 ","pages":"Article 104348"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143083301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Geospatial large language model trained with a simulated environment for generating tool-use chains autonomously
Pub Date: 2025-02-01 | DOI: 10.1016/j.jag.2024.104312
Yifan Zhang, Jingxuan Li, Zhiyun Wang, Zhengting He, Qingfeng Guan, Jianfeng Lin, Wenhao Yu
Solving geospatial tasks generally requires multiple geospatial tools and steps, i.e., tool-use chains. Automating the geospatial task-solving process can effectively enhance the efficiency of GIS users. Traditionally, researchers have designed rule-based systems to solve similar geospatial tasks autonomously, an approach that is inflexible and difficult to adapt to different tasks. With the development of Large Language Models (LLMs), research suggests that LLMs have the potential for intelligent task solving through their tool-use ability, meaning that LLMs can invoke externally provided tools for specific tasks. However, most studies rely on closed-source commercial LLMs such as ChatGPT and GPT-4, whose limited API accessibility restricts deployment on local private devices. Researchers in the general domain have proposed instruction tuning to improve the tool-use ability of open-source LLMs. However, the nature of tool-use chains for geospatial tasks, which involve multiple data input and output processes, makes it challenging to collect effective instruction tuning data. To solve these challenges, we propose a framework for training a Geospatial large language model to generate Tool-use Chains autonomously (GTChain). Specifically, we design a seed task-guided self-instruct strategy to generate a geospatial tool-use instruction tuning dataset within a simulated environment, encompassing diverse geospatial task production and corresponding tool-use chain generation. Subsequently, an open-source general-domain LLM, LLaMA-2-7B, is fine-tuned on the collected instruction data to understand geospatial tasks and learn how to generate geospatial tool-use chains. Finally, we collect an evaluation dataset to serve as a benchmark for assessing the geospatial tool-use ability of LLMs. Experimental results on the evaluation dataset demonstrate that the fine-tuned GTChain can effectively solve geospatial tasks using the provided tools, achieving 32.5% and 27.5% higher accuracy in the percentage of correctly solved tasks than GPT-4 and Gemini 1.5 Pro, respectively.
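To make the notion of a tool-use-chain instruction record concrete, a hypothetical training example might pair a geospatial task with the ordered tool calls that solve it; the tool names and field layout below are assumptions for illustration, not the schema used by GTChain:

```python
import json

# One hypothetical instruction-tuning record: a natural-language geospatial task
# plus the ordered chain of tool calls (with explicit data inputs/outputs) that
# a fine-tuned model is trained to generate.
record = {
    "instruction": "Find all schools within 2 km of the flood-prone zones in city.gpkg "
                   "and report their count.",
    "tools_available": ["load_vector", "buffer", "spatial_join", "count_features"],
    "tool_use_chain": [
        {"tool": "load_vector",    "args": {"path": "city.gpkg", "layer": "flood_zones"},
         "output": "zones"},
        {"tool": "buffer",         "args": {"input": "zones", "distance_m": 2000},
         "output": "zones_2km"},
        {"tool": "load_vector",    "args": {"path": "city.gpkg", "layer": "schools"},
         "output": "schools"},
        {"tool": "spatial_join",   "args": {"left": "schools", "right": "zones_2km",
                                            "predicate": "within"},
         "output": "schools_at_risk"},
        {"tool": "count_features", "args": {"input": "schools_at_risk"},
         "output": "answer"},
    ],
}
print(json.dumps(record, indent=2))
```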
{"title":"Geospatial large language model trained with a simulated environment for generating tool-use chains autonomously","authors":"Yifan Zhang , Jingxuan Li , Zhiyun Wang , Zhengting He , Qingfeng Guan , Jianfeng Lin , Wenhao Yu","doi":"10.1016/j.jag.2024.104312","DOIUrl":"10.1016/j.jag.2024.104312","url":null,"abstract":"<div><div>Solving geospatial tasks generally requires multiple geospatial tools and steps, i.e., tool-use chains. Automating the geospatial task solving process can effectively enhance the efficiency of GIS users. Traditionally, researchers tend to design rule-based systems to autonomously solve similar geospatial tasks, which is inflexible and difficult to adapt to different tasks. With the development of Large Language Models (LLMs), some research suggests that LLMs have the potential for intelligent task solving with their tool-use ability, which means LLMs can invoke externally provided tools for specific tasks. However, most studies rely on closed-source commercial LLMs like ChatGPT and GPT-4, whose limited API accessibility restricts their deployment on local private devices. Some researchers in the general domain proposed using instruction tuning to improve the tool-use ability of open-source LLMs. However, the requirement of tool-use chains to solve geospatial tasks, including multiple data input and output processes, poses challenges for collecting effective instruction tuning data. To solve these challenges, we propose a framework for training a Geospatial large language model to generate Tool-use Chains autonomously (GTChain). Specifically, we design a seed task-guided self-instruct strategy to generate a geospatial tool-use instruction tuning dataset within a simulated environment, encompassing diverse geospatial task production and corresponding tool-use chain generation. Subsequently, an open-source general-domain LLM, LLaMA-2-7B, is fine-tuned on the collected instruction data to understand geospatial tasks and learn how to generate geospatial tool-use chains. Finally, we also collect an evaluation dataset to serve as a benchmark for assessing the geospatial tool-use ability of LLMs. Experimental results on the evaluation dataset demonstrate that the fine-tuned GTChain can effectively solve geospatial tasks using the provided tools, achieving 32.5% and 27.5% higher accuracy in the percentage of correctly solved tasks compared to GPT-4 and Gemini 1.5 Pro, respectively.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"136 ","pages":"Article 104312"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143445340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Glacial lake mapping using remote sensing Geo-Foundation Model
Pub Date: 2025-02-01 | DOI: 10.1016/j.jag.2025.104371
Di Jiang, Shiyi Li, Irena Hajnsek, Muhammad Adnan Siddique, Wen Hong, Yirong Wu
Glacial lakes are vital indicators of climate change, offering insights into glacier dynamics, mass balance, and sea-level rise. However, accurate mapping remains challenging owing to the difficulty of detecting small lakes, shadow interference, and complex terrain conditions. This study introduces the U-ViT model, a novel deep learning framework leveraging the IBM-NASA Prithvi Geo-Foundation Model (GFM) to address these issues. U-ViT employs a U-shaped encoder–decoder architecture featuring enhanced multi-channel data fusion and global–local feature extraction. It integrates an Enhanced Squeeze-Excitation block for flexible fine-tuning across various input dimensions and combines Inverted Bottleneck Blocks to improve local feature representation. The model was trained on two datasets: a Sentinel-1&2 fusion dataset from North Pakistan (NPK) and a Gaofen-3 SAR dataset from West Greenland (WGL). Experimental results highlight the U-ViT model’s effectiveness, achieving an F1 score of 0.894 on the NPK dataset and significantly outperforming traditional CNN-based models, whose scores remained below 0.8. Compared with public datasets, it excelled in detecting small lakes, segmenting boundaries precisely, and handling cloud-shadowed features. Notably, U-ViT maintained robust performance with a 50% reduction in training data, underscoring its potential for efficient learning in data-scarce tasks. However, its performance on the WGL dataset did not surpass that of DeepLabV3+, revealing limitations stemming from differences between the pre-training and input data modalities. The code supporting this study is available online. This research sets the stage for advancing large-scale glacial lake mapping through the application of GFMs.
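As a rough sketch of one named building block, a standard squeeze-and-excitation layer in PyTorch is shown below; the paper's Enhanced Squeeze-Excitation block and its integration into the Prithvi-based U-ViT are more elaborate, so this is a generic illustration only:

```python
import torch
import torch.nn as nn

class SqueezeExcitation(nn.Module):
    """Standard SE block: channel-wise global pooling followed by a gating MLP."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.pool = nn.AdaptiveAvgPool2d(1)            # "squeeze": global context per channel
        self.fc = nn.Sequential(                       # "excitation": per-channel gates
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        gates = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * gates                               # reweight feature channels

# Toy multi-channel feature map, e.g. fused Sentinel-1/2 features.
features = torch.randn(2, 64, 32, 32)
print(SqueezeExcitation(64)(features).shape)           # torch.Size([2, 64, 32, 32])
```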
{"title":"Glacial lake mapping using remote sensing Geo-Foundation Model","authors":"Di Jiang , Shiyi Li , Irena Hajnsek , Muhammad Adnan Siddique , Wen Hong , Yirong Wu","doi":"10.1016/j.jag.2025.104371","DOIUrl":"10.1016/j.jag.2025.104371","url":null,"abstract":"<div><div>Glacial lakes are vital indicators of climate change, offering insights into glacier dynamics, mass balance, and sea-level rise. However, accurate mapping remains challenging due to the detection of small lakes, shadow interference, and complex terrain conditions. This study introduces the U-ViT model, a novel deep learning framework leveraging the IBM-NASA Prithvi Geo-Foundation Model (GFM) to address these issues. U-ViT employs a U-shaped encoder–decoder architecture featuring enhanced multi-channel data fusion and global-local feature extraction. It integrates an Enhanced Squeeze-Excitation block for flexible fine-tuning across various input dimensions and combines Inverted Bottleneck Blocks to improve local feature representation. The model was trained on two datasets: a Sentinel-1&2 fusion dataset from North Pakistan (NPK) and a Gaofen-3 SAR dataset from West Greenland (WGL). Experimental results highlight the U-ViT model’s effectiveness, achieving an F1 score of 0.894 on the NPK dataset, significantly outperforming traditional CNN-based models with scores below 0.8. It excelled in detecting small lakes, segmenting boundaries precisely, and handling cloud-shadowed features compared to public datasets. Notably, the U-ViT demonstrated robust performance with a 50% reduction in training data, underscoring its potential for efficient learning in data-scarce tasks. However, its performance on the WGL dataset did not surpass that of DeepLabV3+, revealing limitations stemming from differences between pre-training and input data modalities. The code supporting this study is available online. This research sets the stage for advancing large-scale glacial lake mapping through the application of GFMs.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"136 ","pages":"Article 104371"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143445341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RTCNet: A novel real-time triple branch network for pavement crack semantic segmentation
Pub Date: 2025-02-01 | DOI: 10.1016/j.jag.2024.104347
Bin Liu, Jian Kang, Haiyan Guan, Xiaodong Zhi, Yongtao Yu, Lingfei Ma, Daifeng Peng, Linlin Xu, Dongchuan Wang
Although real-time semantic segmentation of pavement cracks is crucial for road evaluation and maintenance decision-making, it remains challenging owing to the low operational efficiency and over-segmentation of existing methods. To address these challenges, we propose RTCNet, a real-time triple-branch crack semantic segmentation network that combines Transformers and CNNs and operates on digital camera images. The three branches comprise a detail branch for capturing local detail features, a context branch for extracting global contextual information, and a boundary branch for obtaining crack boundary information. First, to further enhance crack features, we design a Detail Enhance Transformer (DET) module for enlarging global receptive fields and a Multiscale Aggregation (MSA) module for multiscale learning in the context branch. Second, a Boundary Refinement (BR) module with embedded Sobel operators is designed in the boundary branch to refine the crack boundaries. Last, a Detail-Context Fusion (DCF) module is designed to efficiently aggregate the intermediate features extracted from the different branches. Comprehensive quantitative and visual comparisons on four datasets show that RTCNet outperforms the comparative models in both efficiency and effectiveness, achieving the highest F1-score, mIoU, and frames per second (FPS) of 90.56%, 90.25%, and 87.34 on the DeepCrack537 dataset, respectively. We also contribute an extensive pavement crack dataset consisting of 464 manually annotated digital images, publicly accessible at https://github.com/NJSkate/BeijingHighway-dataset.
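A small sketch of the Sobel-based boundary idea behind the boundary branch, assuming a binary crack mask as input; RTCNet embeds the operators inside a learned module, whereas here they are applied directly with SciPy:

```python
import numpy as np
from scipy import ndimage

def crack_boundary_map(mask: np.ndarray) -> np.ndarray:
    """Approximate crack boundaries as the Sobel gradient magnitude of a binary mask."""
    mask = mask.astype(float)
    gx = ndimage.sobel(mask, axis=1)      # horizontal gradient
    gy = ndimage.sobel(mask, axis=0)      # vertical gradient
    magnitude = np.hypot(gx, gy)
    return (magnitude > 0).astype(np.uint8)

# Toy 6x6 mask with a vertical crack two pixels wide.
mask = np.zeros((6, 6), dtype=np.uint8)
mask[:, 2:4] = 1
print(crack_boundary_map(mask))
```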
{"title":"RTCNet: A novel real-time triple branch network for pavement crack semantic segmentation","authors":"Bin Liu , Jian Kang , Haiyan Guan , Xiaodong Zhi , Yongtao Yu , Lingfei Ma , Daifeng Peng , Linlin Xu , Dongchuan Wang","doi":"10.1016/j.jag.2024.104347","DOIUrl":"10.1016/j.jag.2024.104347","url":null,"abstract":"<div><div>Although real-time semantic segmentation of pavement cracks is crucial for road evaluation and maintenance decision-making, it is a challenging task due to low operational efficiency and over-segmentation of existing methods. To address these challenges, in this paper, incorporating Transformers and CNNs, we propose a real-time triple-branch crack semantic segmentation network (RTCNet) using digital camera images. The three branches include a detail branch for capturing local detail features, a context branch for extracting global contextual information, and a boundary branch for obtaining crack boundary information. First, to further enhance crack features, we design a Detail Enhance Transformer (DET) module for enlarging global receptive fields and a Multiscale Aggregation (MSA) module for multiscale learning in the context branch. Second, a Boundary Refinement (BR) module with Sobel operators embedded in the boundary branch is designed to refine the crack boundaries. Last, a Detail-Context Fusion (DCF) module is designed to aggregate the intermediate features extracted from the different branches efficiently Comprehensive quantitative and visual comparisons on four datasets showed that the proposed RTCNet outperforms the comparative models in terms of efficiency and effectiveness with the highest F<sub>1</sub>-score, mIoU, and Frames Per Second (FPS) of 90.56%, 90.25%, and 87.34 in DeepCrack537 dataset, respectively. We also contribute an extensive dataset of pavement cracks, consisting of 464 manually annotated digital images, which is publicly accessible at <span><span>https://github.com/NJSkate/BeijingHighway-dataset</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"136 ","pages":"Article 104347"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142902113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unsupervised deep depth completion with heterogeneous LiDAR and RGB-D camera depth information
Pub Date: 2025-02-01 | DOI: 10.1016/j.jag.2024.104327
Guohua Gou, Han Li, Xuanhao Wang, Hao Zhang, Wei Yang, Haigang Sui
In this work, we present a depth-only completion method designed to enhance perception in light-deprived environments. We achieve this through LidarDepthNet, a novel end-to-end unsupervised learning framework that fuses heterogeneous depth information captured by two distinct depth sensors: LiDAR and RGB-D cameras. This represents the first unsupervised LiDAR-depth fusion framework for depth completion, demonstrating scalability to diverse real-world subterranean and enclosed environments. To facilitate unsupervised learning, we leverage relative rigid motion transfer (RRMT) to synthesize co-visible depth maps from temporally adjacent frames. This allows us to construct a temporal depth consistency loss, constraining the fused depth to adhere to a realistic metric scale. Furthermore, we introduce measurement confidence into the heterogeneous depth fusion model, further refining the fused depth and promoting synergistic complementation between the two depth modalities. Extensive evaluation on both real-world and synthetic datasets, notably a newly proposed LiDAR-depth fusion dataset, LidarDepthSet, demonstrates the significant advantages of our method over existing state-of-the-art approaches.
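To illustrate confidence-weighted fusion of two heterogeneous depth sources in its simplest form (the paper learns both the fusion and the confidences end-to-end, so this NumPy version is only a schematic analogue):

```python
import numpy as np

def fuse_depths(depth_lidar, depth_rgbd, conf_lidar, conf_rgbd, eps=1e-6):
    """Confidence-weighted fusion of two depth maps (metres).

    Pixels with zero depth are treated as missing and receive zero confidence,
    so the other sensor fills them in.
    """
    conf_lidar = np.where(depth_lidar > 0, conf_lidar, 0.0)
    conf_rgbd = np.where(depth_rgbd > 0, conf_rgbd, 0.0)
    total = conf_lidar + conf_rgbd + eps
    return (conf_lidar * depth_lidar + conf_rgbd * depth_rgbd) / total

# Toy example: sparse but accurate LiDAR depth vs. dense but noisier RGB-D depth.
depth_lidar = np.array([[0.0, 2.00], [3.50, 0.0]])
depth_rgbd  = np.array([[1.2, 2.30], [3.80, 4.1]])
conf_lidar  = np.full_like(depth_lidar, 0.9)
conf_rgbd   = np.full_like(depth_rgbd, 0.3)
print(fuse_depths(depth_lidar, depth_rgbd, conf_lidar, conf_rgbd))
```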
{"title":"Unsupervised deep depth completion with heterogeneous LiDAR and RGB-D camera depth information","authors":"Guohua Gou , Han Li , Xuanhao Wang , Hao Zhang , Wei Yang , Haigang Sui","doi":"10.1016/j.jag.2024.104327","DOIUrl":"10.1016/j.jag.2024.104327","url":null,"abstract":"<div><div>In this work, a depth-only completion method designed to enhance perception in light-deprived environments. We achieve this through LidarDepthNet, a novel end-to-end unsupervised learning framework that fuses heterogeneous depth information captured by two distinct depth sensors: LiDAR and RGB-D cameras. This represents the first unsupervised LiDAR-depth fusion framework for depth completion, demonstrating scalability to diverse real-world subterranean and enclosed environments. To facilitate unsupervised learning, we leverage relative rigid motion transfer (RRMT) to synthesize co-visible depth maps from temporally adjacent frames. This allows us to construct a temporal depth consistency loss, constraining the fused depth to adhere to realistic metric scale. Furthermore, we introduce measurement confidence into the heterogeneous depth fusion model, further refining the fused depth and promoting synergistic complementation between the two depth modalities. Extensive evaluation on both real-world and synthetic datasets, notably a newly proposed LiDAR-depth fusion dataset, LidarDepthSet, demonstrates the significant advantages of our method compared to existing state-of-the-art approaches.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"136 ","pages":"Article 104327"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142874811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Quantifying indoor navigation map information considering the dynamic map elements for scale adaptation
Pub Date: 2025-02-01 | DOI: 10.1016/j.jag.2024.104323
Jingyi Zhou, Jie Shen, Cheng Fu, Robert Weibel, Zhiyong Zhou
Indoor maps are indispensable for visualizing users’ real-time locations and guided routes so that they can find their destinations efficiently in large and complex buildings. Map design in existing mobile indoor navigation systems mostly considers either the user locations or the route segments, but seldom adapts the base map scale. Owing to uneven densities of spatial elements, the complexity of routes, and the diverse spatial distribution of navigation decision points, the base map information of indoor navigation maps varies greatly. Hence, an inappropriate amount of map information at certain locations and along certain routes is unavoidable. Additionally, existing multi-scale representations of indoor maps are limited to certain fixed scales and are not adapted to locations within the building, so users have to adjust the map scale frequently through repeated interactions with the navigation system. In this study, we propose a method that considers the dynamic elements of indoor maps to quantify map information for scale adaptation. The indoor navigation map information calculation includes both the geometry information and the spatial distribution information of static base map elements (area elements, POIs) and dynamic route elements (segments, decision points). The total map information is quantified by setting weights for the two types of elements. An empirical study on indoor navigation map selection was conducted, and the results show that the map information quantified with the proposed method reflects a user-desired map better than the traditionally used scales.
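The study's exact information measure is not reproduced here; as a hedged stand-in, the sketch below scores a candidate map scale by combining a Shannon-entropy term for static base-map elements with a logarithmic term for dynamic route elements, mirroring the weighted two-component formulation described above:

```python
import math
from collections import Counter

def shannon_entropy(counts):
    """Entropy (bits) of a categorical distribution given raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def map_information(static_elements, n_segments, n_decision_points,
                    w_static=0.5, w_dynamic=0.5):
    """Toy map-information score for one candidate scale.

    static_elements: category labels of visible base-map elements (areas, POIs).
    n_segments / n_decision_points: visible dynamic route elements.
    The weights are illustrative, not the values calibrated in the study.
    """
    static_info = shannon_entropy(Counter(static_elements).values())
    dynamic_info = math.log2(1 + n_segments) + math.log2(1 + n_decision_points)
    return w_static * static_info + w_dynamic * dynamic_info

# Two hypothetical scales for the same route: a cluttered one and a sparser one.
coarse = map_information(["room"] * 40 + ["poi"] * 25 + ["corridor"] * 10, 12, 6)
fine   = map_information(["room"] * 8 + ["poi"] * 5 + ["corridor"] * 2, 4, 2)
print(f"coarse scale: {coarse:.2f} bits, fine scale: {fine:.2f} bits")
```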
{"title":"Quantifying indoor navigation map information considering the dynamic map elements for scale adaptation","authors":"Jingyi Zhou , Jie Shen , Cheng Fu , Robert Weibel , Zhiyong Zhou","doi":"10.1016/j.jag.2024.104323","DOIUrl":"10.1016/j.jag.2024.104323","url":null,"abstract":"<div><div>The indoor map is an indispensable component to visualize human users’ real-time locations and guided routes to find their destinations in large and complex buildings efficiently. The map design in existing mobile indoor navigation systems mostly considers either the user locations or the route segments but seldom considers the adaptation of the base map scale. Due to uneven densities of spatial elements, the complexity of routes, and the diversity of spatial distribution of navigation decision points, the base map information of indoor navigation maps varies greatly. Hence, it is inevitable to cause an inappropriate amount of map information at different locations and routes. Additionally, existing multi-scale representations of indoor maps are limited to certain scales but not adapted to building locations. Users have to adjust the map scales frequently through multiple interactions with the navigation system. In this study, we propose a method that considers the dynamic elements of indoor maps to quantify the map information for scale adaptation. The indoor navigation map information calculation includes both geometry information and spatial distribution information of static base map elements (area elements, POIs) and dynamic route elements (segments, decision points). The total map information is quantified by setting the weights of the two types of elements. An empirical study on indoor navigation map selection was conducted. Results show that the quantified map information using the proposed method can reflect a user-desired map better than the traditionally used scales.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"136 ","pages":"Article 104323"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142874863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Incorporating environmental data to refine the classification and understanding of the mechanisms behind encroachment of a woody species in the Southern Great Plains (USA)
Pub Date: 2025-02-01 | DOI: 10.1016/j.jag.2025.104362
Justin Dawsey, Nancy E. McIntyre
Curtailing encroachment depends on effectively identifying where problematic species occur. However, traditional classification methods struggle to distinguish spectrally similar species. Newer techniques that incorporate environmental variables (edaphic, climatic, and topographic characteristics) into classification can refine predictions and help identify important factors associated with species occurrence. We developed a workflow to improve the classification of honey mesquite (Neltuma [=Prosopis] glandulosa) in the Southern Great Plains (USA), examining 70 environmental variables to determine which were most associated with mesquite presence. We used Google Earth Engine to run X-means clustering on high-resolution aerial imagery from 50 replicate 78-km² areas in New Mexico and Texas. We then refined our classification using XGBoost to generate accuracy assessment points for each area and confirm the locations of mesquite clusters. Our method improved classification accuracy from 36 % to 83 %. We also performed an ex-situ ground-truthed validation study and achieved 74 % accuracy. Including environmental data increased the accuracy of mesquite classification and allowed us to estimate the influence of each variable in determining whether a given point was classified as mesquite. Shallow, alkaline soils with low water-storage capacity, high electrical conductance, and low cation exchange capacity were associated with mesquite presence; these areas tended to be associated with flat, low-elevation drainages in regions that experience wide annual temperature ranges. These methods provide an easily reproducible and scalable way to assist with image classification of rangeland shrubs from remotely sensed imagery, which may prove useful in managing further encroachment by problematic species such as honey mesquite.
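A compact sketch of the XGBoost refinement stage, assuming a table of environmental predictors with binary mesquite/non-mesquite labels; the synthetic data, variable choices, and hyperparameters below are placeholders rather than the workflow's actual Earth Engine exports:

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(7)
n = 3000
# Hypothetical environmental predictors per accuracy-assessment point.
soil_ph    = rng.normal(7.6, 0.6, n)        # alkalinity
soil_depth = rng.normal(60, 25, n)          # cm
elevation  = rng.normal(900, 150, n)        # m
X = np.column_stack([soil_ph, soil_depth, elevation])
# Synthetic rule: mesquite favoured on shallow, alkaline, low-elevation sites.
y = ((soil_ph > 7.5) & (soil_depth < 60) & (elevation < 950)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1,
                    eval_metric="logloss").fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
print("feature importances (pH, depth, elevation):", clf.feature_importances_)
```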
{"title":"Incorporating environmental data to refine the classification and understanding of the mechanisms behind encroachment of a woody species in the Southern Great Plains (USA)","authors":"Justin Dawsey, Nancy E. McIntyre","doi":"10.1016/j.jag.2025.104362","DOIUrl":"10.1016/j.jag.2025.104362","url":null,"abstract":"<div><div>Curtailing encroachment is dependent on effectively identifying where problematic species occur. However, traditional classification methods struggle to distinguish spectrally similar species. New techniques that incorporate environmental variables (edaphic, climatic, and topographic characteristics) into classification can refine predictions and help identify important factors associated with species occurrence. We developed a workflow to improve classification of honey mesquite (<em>Neltuma</em> [=<em>Prosopis</em>] <em>glandulosa</em>) in the Southern Great Plains (USA), examining 70 environmental variables to determine which were most associated with mesquite presence. We used Google Earth Engine to run X-means clustering on high-resolution aerial imagery from 50 replicate 78-km<sup>2</sup> areas in New Mexico and Texas. We then refined our classification using XGBoost to generate accuracy assessment points for each area to confirm locations of mesquite clusters. Our method improved classification accuracy from 36 % to 83 %. We performed an ex-situ ground-truthed validation study and achieved 74 % accuracy. Inclusion of environmental data increased the accuracy of mesquite classification and allowed us to estimate the influence of each variable in determining whether a given point was classified as mesquite. Shallow, alkaline soils with low water-storage capacity, high electrical conductance, and low cation exchange capacity were associated with mesquite presence; these areas tended to be associated with flat, low-elevation drainages in regions that experience wide annual temperature ranges. These methods provide an easily reproducible and scalable way to assist with image classification of rangeland shrubs from remotely sensed imagery, which may prove useful in managing the further encroachment of problematic species like honey mesquite.</div></div>","PeriodicalId":73423,"journal":{"name":"International journal of applied earth observation and geoinformation : ITC journal","volume":"136 ","pages":"Article 104362"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142975201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}