Pub Date: 2026-01-01 | Epub Date: 2026-01-22 | DOI: 10.1016/j.ophoto.2026.100116
Faruk Keskin, Fesih Keskin, Gültekin Işık
Accurate hyperspectral image (HSI) classification under scarce labels and class imbalance requires models that couple long-range spectral reasoning with irregular local spatial context. We present GAST, a Graph-Augmented spectral–spatial Transformer with Adaptive Gated Fusion for Small-Sample Hyperspectral Image Classification. GAST pairs a lightweight spectral Transformer with a GATv2-based spatial branch on an 8-neighbor pixel graph, and fuses them via a center-conditioned, channel-wise gating mechanism that uses the center-pixel representation to modulate all tokens in the patch. Unlike conventional static fusion strategies (e.g., concatenation or summation) that assign fixed importance to modalities regardless of image content, this adaptive fusion dynamically modulates the spectral and spatial streams at the pixel level, allowing the model to prioritize spatial texture for complex urban structures while shifting focus to spectral signatures for subtle vegetation classes. Training is further stabilized by an imbalance-aware objective that switches between weighted cross-entropy and focal loss according to a measured class ratio, and by a two-stage Bayesian hyperparameter search that aligns capacity with scene statistics. Across eight public benchmarks under a 5%-label protocol, GAST consistently matches or surpasses recent hybrid graph-Transformer architectures while remaining compact and fast at inference. Ablation studies confirm the complementary roles of both branches and the benefit of gated fusion. The resulting architecture offers a strong accuracy–efficiency trade-off and reliable performance across seeds, making it a practical solution for low-data HSI applications. The code is publicly available at https://github.com/fesihkeskin/GAST.
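The abstract does not spell out the gating equations. As a rough illustration, a center-conditioned, channel-wise gate could take the following form; the sigmoid projection (`W`, `b`) and the convex-combination update are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(spectral, spatial, W, b, center_idx):
    """Fuse spectral and spatial token streams with a channel-wise gate
    conditioned on the center-pixel representation.

    spectral, spatial: (n_tokens, d) token features from the two branches.
    W, b: (d, d) and (d,) parameters of a hypothetical gate projection.
    center_idx: index of the center-pixel token within the patch.
    """
    center = spectral[center_idx]        # (d,) center-pixel representation
    gate = sigmoid(W @ center + b)       # (d,) channel-wise gate in (0, 1)
    # The same gate modulates every token in the patch: a per-channel
    # convex combination of the two streams.
    return gate * spectral + (1.0 - gate) * spatial
```

Because the gate lies strictly in (0, 1), each fused channel stays between the corresponding spectral and spatial values, so neither stream can be fully discarded.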
Published as: "GAST: A graph-augmented spectral–spatial transformer with adaptive gated fusion for small-sample hyperspectral image classification", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 19, Article 100116.
Pub Date: 2026-01-01 | Epub Date: 2025-12-09 | DOI: 10.1016/j.ophoto.2025.100114
Nicolas Barbier, Pierre Ploton, Hadrien Tulet, Gaëlle Viennois, Hugo Leblanc, Benoît Burban, Maxime Réjou-Méchain, Philippe Verley, James Ball, Denis Feurer, Grégoire Vincent
Light drones provide a cheap and effective tool for monitoring forest canopies, especially in tropical and equatorial contexts where infrastructure and resources are limited. In these regions, good-quality optical satellite images are rare, yet the stakes are highest for characterizing forest function, dynamics, diversity, and phenology, and more generally the vegetation–climate interplay.
We describe a complete photogrammetric processing chain that seeks to optimize the spatial and spectral coherence between repeat image mosaics at centimetric resolution. Our aim is to enable individual tree-level monitoring over tens to hundreds of hectares with consumer-grade equipment (i.e., a quadcopter with a stabilized RGB camera and standard GNSS positioning).
We demonstrate the increase in spatial accuracy achieved using the Time-SIFT and Arosics algorithms, which, individually and synergistically, reduce global and local spatial misalignment between mosaics from several meters to a few centimeters. Time-SIFT increases the robustness of initial image alignment and 3D reconstruction, and hence reduces occasional distortions and data gaps. Agisoft's color and white-balance corrections, combined with vegetation indices, provide a meaningful quantitative signal despite considerable changes in acquisition conditions.
In particular, indices that are less sensitive to illumination changes, such as the green chromatic coordinate (GCC), revealed a seasonal signal over four years of monitoring in the evergreen moist forest at Paracou, French Guiana. The signal was decorrelated from obvious geometric effects (sun elevation) and provided information on the vegetative stage at the tree, species, and stand levels.
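The green chromatic coordinate mentioned above is a standard illumination-robust index, GCC = G / (R + G + B); a minimal per-pixel implementation:

```python
import numpy as np

def green_chromatic_coordinate(rgb):
    """GCC = G / (R + G + B), computed per pixel.

    rgb: float array of shape (..., 3) with bands ordered R, G, B.
    Returns an array with the same leading shape, values in [0, 1].
    """
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=-1)
    # Guard against division by zero on fully dark pixels.
    return np.divide(rgb[..., 1], total,
                     out=np.zeros_like(total), where=total > 0)
```

Because GCC is a ratio of bands, multiplicative brightness changes (a brighter or dimmer scene overall) largely cancel out, which is why it tracks phenology more reliably than raw green reflectance.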
Published as: "Monitoring tropical forests with light drones: ensuring spatial and temporal consistency in stereophotogrammetric products", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 19, Article 100114.
Pub Date: 2025-12-01 | Epub Date: 2025-10-31 | DOI: 10.1016/j.ophoto.2025.100106
Ian A. Ocholla, Janne Heiskanen, Faith Karanja, Mark Boitt, Petri Pellikka
Over the past four decades, rising demand for livestock products in Africa has led to increased stocking rates, resulting in overgrazing and land degradation. As the population is projected to rise, the need for sustainable livestock management is more urgent than ever, yet efforts are hindered by the lack of accurate, up-to-date livestock counts. Recent advances in remote sensing and deep learning have made it possible to count livestock from space. However, the extent to which models trained on aerial imagery can enhance livestock detection in satellite images and across diverse landscapes remains poorly understood. This study assessed the transferability of YOLO, Faster R-CNN, U-Net, and ResNet models for livestock detection across three contrasting landscapes: Choke bushland (Pleiades Neo), Kapiti savanna (WorldView-3), and LUMO open grassland (WorldView-3), using satellite imagery with 0.3 m and 0.4 m spatial resolution. Additionally, we applied multi-stage transfer learning to evaluate the effectiveness of models trained on aerial imagery (0.1 m) in improving livestock detection in satellite imagery. Results indicate that YOLOv5 consistently outperformed the other models, achieving F1 scores of 0.55, 0.67, and 0.85 in Choke, Kapiti, and LUMO, respectively, demonstrating robustness across varying land cover types and sensors. Although segmentation models performed moderately on 0.3 m imagery (F1 scores of 0.51 and 0.40 for Choke and LUMO), their performance dropped significantly on the coarser-resolution (0.4 m) Kapiti imagery (F1 score of 0.14). In addition, multi-stage transfer learning improved the segmentation models' recall by 9.8 % at the heterogeneous bushland site. Our results highlight that integrating multi-source imagery and deep learning can support large-scale livestock monitoring, which is crucial for implementing sustainable rangeland management.
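For reference, the F1 scores reported above are the harmonic mean of precision and recall; a sketch computing the metric from raw detection counts:

```python
def f1_score(tp, fp, fn):
    """F1 from detection counts: true positives, false positives,
    false negatives. Returns 0.0 when the score is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    # Harmonic mean penalizes an imbalance between precision and recall.
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the smaller of the two terms, a detector cannot reach a high F1 by trading recall away for precision (or vice versa).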
Published as: "Towards monitoring livestock using satellite imagery: Transferability of object detection and segmentation models in Kenyan rangelands", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 18, Article 100106.
Pub Date: 2025-12-01 | Epub Date: 2025-11-19 | DOI: 10.1016/j.ophoto.2025.100108
Umut Gunes Sefercik, Ilyas Aydin, Mertcan Nazar
Producing high-quality building digital twins (DT) remains a challenging task. In this study, a methodology is proposed to obtain a precise, georeferenced 3D building model with high geometric and spectral quality, one of the essential components of high-quality DT production, through the fusion of UAV and terrestrial photogrammetric data. To evaluate the performance of the proposed methodology, a complex building with glass facades, entrance porches, outdoor stairs, and architectural coverings was chosen. We present the techniques used to overcome the challenges of multi-source image orientation, spectral enhancement, and precise building model production. Distinct from existing studies, photos from different sources were not merged into a single image pool before photogrammetric processing; instead, geometric and spectral calibrations of the aerial and terrestrial photos were completed separately before data fusion. In this manner, individual dense point clouds were generated using structure from motion (SfM) and denoised by filtering in Bentley ContextCapture. Precise 3D building model production involved first merging the georeferenced point clouds, followed by 3D model generation from the fused cloud. The building model achieved a geometric accuracy (RMSE) of ≤ ±2 cm through the fusion of UAV and terrestrial photogrammetric dense point clouds with accuracies of ±1.87 cm and ±1.17 cm, respectively. In addition, an indoor model was generated by capturing 360° panoramic photos of the building, and a complete virtual tour was created by merging the indoor and outdoor data in the Unity game engine.
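RMSE figures like those quoted above are typically computed against independently surveyed check points; a generic sketch of that computation (not the authors' exact procedure) is:

```python
import numpy as np

def checkpoint_rmse(measured, reference):
    """RMSE between model coordinates and surveyed check points.

    measured, reference: (n, 3) arrays of X, Y, Z coordinates in metres.
    Returns (per_axis, total): per-axis RMSE as a (3,) array, and the
    total 3D RMSE over the Euclidean point errors.
    """
    d = np.asarray(measured, float) - np.asarray(reference, float)
    per_axis = np.sqrt((d ** 2).mean(axis=0))       # RMSE_X, RMSE_Y, RMSE_Z
    total = np.sqrt((d ** 2).sum(axis=1).mean())    # 3D RMSE
    return per_axis, total
```

Reporting per-axis values alongside the 3D total is useful because vertical error in photogrammetric models often differs markedly from planimetric error.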
Published as: "Generation of precise 3D building models for digital twin projects using multi-source data fusion and integration into virtual tours", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 18, Article 100108.
Pub Date: 2025-12-01 | Epub Date: 2025-11-29 | DOI: 10.1016/j.ophoto.2025.100111
Mathieu F. Bilodeau, Travis J. Esau, Mason T. MacDonald, Aitazaz A. Farooque
Circlegrammetry is a new drone photogrammetry technique that utilizes circular flight paths. This approach promises higher efficiency for 3D modelling than traditional grid-based methods. This study evaluates its performance in a Christmas tree (Balsam fir) field, a complex agricultural environment characterized by intricate vegetation geometry. Experiments were conducted in a 2-ha orchard in Truro, Nova Scotia, using a DJI Matrice 300 RTK equipped with a high-resolution optical camera. Three Circlegrammetry missions with varying overlaps (25 and 50 %) and flight heights (40 and 60 m) were compared against standard oblique and smart oblique drone missions flown at a flight height of 60 m. Mission assessments focused on flight efficiency, processing performance, and reconstruction accuracy. The point density of the tree canopy, derived from dense point clouds, was also evaluated across the survey methods. Results demonstrated that Circlegrammetry significantly reduced flight times and the number of images required, particularly at lower overlap configurations. For example, Circlegrammetry with a 25 % overlap completed missions in about half the time required by smart oblique methods and in approximately one-third the duration of standard oblique missions. Processing efficiency similarly favoured Circlegrammetry (25 % overlap), with notable reductions in processing times. In terms of reconstruction quality, Circlegrammetry produced spatially accurate models with ground-control RMSE values ranging from 1.38 to 1.53 cm. These results were comparable to those of traditional oblique methods, despite not utilizing nadir imagery. However, Circlegrammetry showed limitations in capturing lower canopy details, with a larger average point spacing than the other methods. For example, Circle 25 % performed the worst, with an average point spacing of 15.79 mm for the lower canopy.
In contrast, the standard oblique approach performed best, with an average point spacing of 11.89 mm. This suggests constraints inherent to the inward-facing camera orientation and the higher oblique-angle flight paths of Circlegrammetry missions. Overall, Circlegrammetry emerges as a promising method for precision agriculture applications, striking a balance between flight efficiency and reconstruction detail. Circlegrammetry with a 50 % overlap proved a comparable alternative to the smart oblique acquisition method. Future research should focus on optimizing overlap percentages and flight configurations to further improve lower canopy coverage and to generalize these findings across diverse agricultural contexts.
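To make the circular acquisition geometry concrete, here is a minimal sketch of waypoint generation for a single inward-facing orbit. This is a simplification for illustration: real mission planners also handle gimbal pitch, image overlap from the camera footprint, and multiple rings at different radii and heights.

```python
import math

def circular_waypoints(cx, cy, radius, altitude, n_photos):
    """Camera positions and inward-facing yaw for one circular orbit.

    cx, cy: circle centre in local map coordinates (m).
    radius, altitude: orbit radius and flight height (m).
    n_photos: number of evenly spaced camera stations on the circle.
    Returns a list of (x, y, z, yaw_deg) tuples, where yaw points the
    camera at the centre (0 deg = +x axis, counter-clockwise positive).
    """
    waypoints = []
    for i in range(n_photos):
        theta = 2 * math.pi * i / n_photos
        x = cx + radius * math.cos(theta)
        y = cy + radius * math.sin(theta)
        # Facing the centre means pointing opposite the radial direction.
        yaw = math.degrees(theta + math.pi) % 360
        waypoints.append((x, y, altitude, yaw))
    return waypoints
```

The inward-facing yaw is what gives circular missions their strongly convergent, oblique viewing geometry, and also why nadir detail and the lowest canopy layers can be under-sampled relative to grid missions.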
Published as: "Circlegrammetry for drone imaging: Evaluating a novel technique for mission planning and 3D mapping", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 18, Article 100111.
Pub Date: 2025-12-01 | Epub Date: 2025-10-21 | DOI: 10.1016/j.ophoto.2025.100104
Jakobus Möhring, Teja Kattenborn, Miguel D. Mahecha, Yan Cheng, Mirela Beloiu Schwenke, Myriam Cloutier, Martin Denter, Julian Frey, Matthias Gassilloud, Anna Göritz, Jan Hempel, Stéphanie Horion, Tommaso Jucker, Samuli Junttila, Pratima Khatri-Chhetri, Kirill Korznikov, Stefan Kruse, Etienne Laliberté, Michael Maroschek, Paul Neumeier, Clemens Mosig
With tree mortality rates rising across many regions of the world, efficient methods to map dead trees are becoming increasingly important to monitor forest dieback, assess ecological impacts, and guide management strategies. Deep learning-based pattern recognition combined with the high spatial detail of aerial images from drones or airplanes provides an avenue for mapping dead tree crowns or partial canopy dieback, collectively referred to as standing deadwood. However, current methods for mapping standing deadwood are limited to specific biomes or image resolutions. Here, we present a transformer-based semantic segmentation model that generalizes across forest biomes and a wide range of image resolutions (1–28 cm) for mapping both dead tree crowns and partial canopy dieback. Our approach combines a SegFormer-based transformer architecture for image feature extraction with a Focal Tversky Loss to mitigate class imbalance. We used a globally distributed, crowd-sourced dataset of 434 high-resolution aerial images and manual delineations of standing deadwood of widely varying quality. The orthophotos span all major forest biomes and cover 10,778 hectares. To further mitigate imbalances across biomes, resolutions, deadwood occurrence, and image sources, we developed a four-dimensional sampling scheme that ensures balanced representation during training. The models were trained and evaluated on heterogeneous crowd-sourced data, which, as expected, lowers the F1-scores. A visual inspection of independent data nevertheless indicates precise segmentations. Our analysis revealed resolution-dependent performance variations across biomes, suggesting a relationship between optimal mapping resolution and biome-specific characteristics. We make both our model and a machine-learning-ready dataset publicly available on deadtrees.earth to support future research in tree mortality mapping.
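The Focal Tversky Loss mentioned above can be sketched for the binary case as follows; the `alpha`, `beta`, and `gamma` values are common defaults from the loss-function literature, not necessarily this paper's settings:

```python
import numpy as np

def focal_tversky_loss(probs, targets, alpha=0.7, beta=0.3, gamma=0.75,
                       eps=1e-7):
    """Focal Tversky loss for binary segmentation.

    probs: predicted foreground probabilities; targets: {0, 1} labels,
    same shape. alpha weights false negatives, beta false positives;
    alpha > beta favours recall, which helps on rare classes such as
    standing deadwood. gamma < 1 focuses training on hard examples.
    """
    p = np.asarray(probs, float).ravel()
    t = np.asarray(targets, float).ravel()
    tp = (p * t).sum()                   # soft true positives
    fn = ((1 - p) * t).sum()             # soft false negatives
    fp = (p * (1 - t)).sum()             # soft false positives
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return (1.0 - tversky) ** gamma
```

Unlike pixel-wise cross-entropy, the Tversky index is computed over the whole mask, so a rare foreground class still contributes a meaningful gradient even when background pixels dominate.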
Published as: "Global, multi-scale standing deadwood segmentation in centimeter-scale aerial images", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 18, Article 100104.
Pub Date: 2025-12-01 | Epub Date: 2025-10-29 | DOI: 10.1016/j.ophoto.2025.100105
Insha Batool, Arshad Ashraf, Muhammad Fahim Khokhar
Karakoram glaciers exhibit steady mass or expansion in the central and western Karakoram, contrasting with the retreat observed in the eastern Karakoram, and respond differently to climatic conditions than glaciers elsewhere in the world, a phenomenon termed the Karakoram Anomaly. The absence of long-term ground-based monitoring of climatic variables and glacier observations, in addition to the region's complex terrain, remote location, and harsh climate, poses a serious challenge to finding a precise explanation for this anomalous glacier behavior and its response to ongoing climate variability. This study compares a high-resolution (10 m) geodetic glacier dataset spanning 1991 to 2022 with climate variables to assess glacier mass balance across different elevations in the Hunza and Shigar basins and to examine its relationship with climatic drivers. We observe that glaciers maintain a stable mass balance regardless of elevation. Above 4500 m above sea level, glaciers exhibit surges under the unique climate warming of the twenty-first century with a slight reduction in snowfall, a phenomenon we refer to as the Karakoram Climate Response Anomaly (KCRA). We find that the unique mountainous terrain and a predominantly north-facing aspect are the main causes of glacier stability despite prevailing warming signatures in the Karakoram range. However, future projections based on CMIP6 ensemble scenarios indicate a challenging future for glacier sustainability, with rising temperatures and declining precipitation, particularly in the western Karakoram. These findings underscore the critical need for continuous field observations of glaciers and climate to better understand and predict glacier responses to evolving climate conditions in the Karakoram.
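A geodetic mass balance of the kind used above converts DEM-differenced surface elevation change into water equivalent via an assumed volume-to-mass density; a minimal sketch (the 850 kg m^-3 conversion density is a widely used assumption in the geodetic literature, not necessarily this study's value):

```python
def geodetic_mass_balance(dh_m, years, rho_conv=850.0, rho_water=1000.0):
    """Specific mass balance from DEM differencing.

    dh_m: glacier-wide mean surface elevation change (m) between two DEMs.
    years: time span between the two acquisitions.
    rho_conv: assumed volume-to-mass conversion density (kg m^-3).
    Returns metres of water equivalent per year (m w.e. a^-1);
    negative values indicate mass loss.
    """
    return dh_m * (rho_conv / rho_water) / years
```

The choice of conversion density is itself a source of uncertainty, since firn and ice contribute differently to the observed elevation change.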
{"title":"Anomalous glaciers response to climate variability in the Karakoram region","authors":"Insha Batool , Arshad Ashraf , Muhammad Fahim Khokhar","doi":"10.1016/j.ophoto.2025.100105","DOIUrl":"10.1016/j.ophoto.2025.100105","url":null,"abstract":"<div><div>Karakoram glaciers exhibit steady mass or expansion in the central and western Karakoram, contrasting with the retreat observed in the eastern Karakoram and respond differently to climatic conditions compared to glaciers in the world<strong>—</strong>a phenomenon termed as the Karakoram Anomaly. The absence of long-term ground-based monitoring of climatic variables and glaciers observations in addition to the region's complex terrain, remote location, and harsh climate pose a serious challenge to find a precise explanation for anomalous glaciers behavior and their response to ongoing climate variability. This study compares a high-resolution (10 m) geodetic glaciers data set from 1991 to 2022 with climate variables to assess the glaciers mass balance condition across different elevations in the Hunza and Shigar basins and to examine their relationship with climatic drivers. We observe that glaciers maintain a stable mass balance regardless of elevation. Above 4500 m above sea level, glaciers exhibit surges under the unique climate warming of the twenty first century with slight reduction in snowfall—a phenomenon we refer to as the Karakoram Climate Response Anomaly (KCRA). We find that the unique mountainous land and a predominantly north-facing aspect are the main cause of glaciers stability despite prevailing warming climate signatures in the Karakoram range. However, future projections based on CMIP6 ensemble scenarios indicate a challenging future for glaciers sustainability, with rising temperatures and declining precipitation, particularly in the western Karakoram. 
These findings underscore the critical need for continuous field observations of glaciers and climate conditions to better understand and predict glacier responses to evolving climate conditions in Karakoram.</div></div>","PeriodicalId":100730,"journal":{"name":"ISPRS Open Journal of Photogrammetry and Remote Sensing","volume":"18 ","pages":"Article 100105"},"PeriodicalIF":0.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145466832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ground filtering algorithms (GFs) are widely used in point cloud processing to generate digital terrain models. Existing GFs typically rely on rule-based or machine-learning approaches to separate ground and non-ground points within an airborne point cloud. However, they often struggle to accurately extract ground points in scenarios containing mountains and heterogeneous buildings. To enhance the accuracy and robustness of ground filtering for airborne point clouds, we propose a data-driven morphological filtering algorithm (DMF). DMF begins by identifying near-ground voxel centroids after voxelizing the input point cloud. Next, a digital elevation model is constructed from the elevation information of these near-ground voxel centroids. A composite morphological filter is then designed to identify ground and non-ground patches within the digital elevation model before labeling their inner near-ground voxel centroids as GF-support nodes. The composite morphological filter is used to recognize non-ground areas with incomplete edge structures depicted in the input point cloud and to correct misclassified areas. Finally, a bidirectional k-dimensional tree search engine is built between the GF-support nodes and the input point cloud to separate ground and non-ground points. Experimental results show that DMF achieves an average F-score greater than 0.88, demonstrating robustness in generating digital terrain models across various test scenarios. Furthermore, the intermediate outputs of DMF enable instance segmentation of artificial objects in airborne point clouds. The code for DMF will be shared on GitHub (https://github.com/wbx1727031/DMF).
{"title":"A data-driven morphological filtering algorithm for digital terrain model generation from airborne LiDAR data","authors":"Bingxiao Wu , Xingxing Zhou , Junhong Zhao , Wuming Zhang , Guang Zheng","doi":"10.1016/j.ophoto.2025.100102","DOIUrl":"10.1016/j.ophoto.2025.100102","url":null,"abstract":"<div><div>Ground filtering algorithms (GFs) are widely used in point cloud processing to generate digital terrain models. Existing GFs typically rely on rule-based or machine learning approaches to separate ground and non-ground points within an airborne point cloud. However, they often struggle to accurately extract ground points in scenarios containing mountains and heterogeneous buildings. To enhance the accuracy and robustness of ground filtering for airborne point clouds, we propose a data-driven morphological filtering algorithm (DMF). DMF begins by identifying near-ground voxel centroids after voxelizing the input point clouds. Next, a digital elevation model is constructed based on the elevation information of these near-ground voxel centroids. A composite morphological filter is then designed to identify ground and non-ground patches within the digital elevation model before labeling their inner near-ground voxel centroids as GF-support nodes. The composite morphological filter is used to recognize non-ground areas with incomplete edge structures depicted in the input point cloud and to correct misclassified areas. Finally, a bidirectional <em>k</em>-dimensional tree search engine is built between the GF-support nodes and the input point cloud to separate ground and non-ground points. Experimental results show that DMF achieves ground filtering accuracy with an average F-score greater than 0.88, demonstrating robustness in generating digital terrain models across various test scenarios. Furthermore, the intermediate outputs of DMF enable instance segmentation of artificial objects in airborne point clouds. 
The code for DMF will be shared on GitHub (<span><span>https://github.com/wbx1727031/DMF</span></span>).</div></div>","PeriodicalId":100730,"journal":{"name":"ISPRS Open Journal of Photogrammetry and Remote Sensing","volume":"18 ","pages":"Article 100102"},"PeriodicalIF":0.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145190142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
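The composite morphological filter at the core of DMF builds on a classical idea: a grey-scale opening with a window wider than typical objects suppresses buildings and vegetation while preserving terrain, so cells far above the opened surface can be flagged as non-ground. A minimal NumPy-only sketch of that idea follows; the window size, height threshold, and synthetic scene are illustrative assumptions, not the authors' DMF parameters.

```python
import numpy as np

def _window_filter(grid, window, func):
    """Apply func (np.min or np.max) over a sliding window, with edge padding."""
    pad = window // 2
    padded = np.pad(grid, pad, mode="edge")
    view = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    return func(view, axis=(-2, -1))

def morphological_ground_mask(dem, window=5, height_thresh=2.0):
    """Label cells as ground where the DEM sits close to its morphological
    opening; features narrower than `window` (buildings, trees) are removed
    by the opening and therefore flagged as non-ground."""
    eroded = _window_filter(dem, window, np.min)
    opened = _window_filter(eroded, window, np.max)  # opening = erosion then dilation
    return (dem - opened) < height_thresh

# Synthetic scene: a gentle slope with a 10 m-high 4x4-cell "building".
xx = np.tile(np.arange(20, dtype=float), (20, 1))
dem = 0.1 * xx
dem[8:12, 8:12] += 10.0
mask = morphological_ground_mask(dem)
```

Because no 5x5 window fits entirely inside the 4x4 building, the erosion removes it completely, so the opened surface tracks the slope and the building cells fail the height test.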
Pub Date : 2025-12-01Epub Date: 2025-11-13DOI: 10.1016/j.ophoto.2025.100107
Tobias Fichtmueller, Alexander Witt, Christoph Holst
Visual Simultaneous Localization and Mapping (VSLAM) provides a reliable option for the precise vehicle localization required for planning and executing autonomous driving maneuvers, especially in areas where traditional GNSS-based systems fail. Our current objective is therefore to transmit the generated 3D points (non-semantic landmarks) to the data backend and store them in a map layer for use as future localization support. However, the limited bandwidth between the vehicle and the data backend requires filtering the landmarks before transmission.
This paper introduces RoFlex, a robust and flexible approach for filtering non-semantic landmarks within the calculation front-end of a VSLAM system. Given the bandwidth restrictions in vehicle-to-data-backend communication, RoFlex selects landmarks beneficial to long-term localization based on their stability, accuracy, and recognizability. In contrast to existing approaches that rely on training data, RoFlex computes an individual score for each landmark using seven distinct attributes to assess their suitability as localization support. The methodology was qualitatively evaluated on several datasets and identified stable, accurate, and recognizable landmarks across different environments and conditions. In addition, we conducted a quantitative evaluation based on three experiments (recognizability, stability, and localization accuracy), demonstrating that RoFlex retains around 90% recognizability and preserves localization performance even when only 50% of the landmarks are used. For this reason, the work represents an effective contribution to long-term localization within the automotive domain. Moreover, the modular design of RoFlex serves as a foundation for further research on filtering non-semantic landmarks.
{"title":"RoFlex: Robust and flexible filtering of non-semantic landmarks for automotive applications","authors":"Tobias Fichtmueller, Alexander Witt, Christoph Holst","doi":"10.1016/j.ophoto.2025.100107","DOIUrl":"10.1016/j.ophoto.2025.100107","url":null,"abstract":"<div><div>Visual Simultaneous Localization and Mapping (VSLAM) provides a reliable option for the precise vehicle localization required for planning and executing autonomous driving maneuvers, especially in areas where traditional GNSS-based systems fail. Therefore, our current objective is to transmit the generated 3D points (non-semantic landmarks) to the data backend to store them in a map-layer for application as future localization support. However, the limited bandwidth between the vehicle and the data backend requires filtering the landmarks before transmission.</div><div>This paper introduces RoFlex, a robust and flexible approach for filtering non-semantic landmarks within the calculation front-end of a VSLAM system. Given the bandwidth restrictions in vehicle-to-data-backend communication, RoFlex selects landmarks beneficial to long-term localization based on their stability, accuracy, and recognizability. In contrast to existing approaches that rely on training data, RoFlex computes an individual score for each landmark using seven distinct attributes to assess their suitability as localization support. The methodology was qualitatively evaluated on several datasets and identified stable, accurate, and recognizable landmarks across different environments and conditions. In addition, we conducted a quantitative evaluation based on three experiments (recognizability, stability, and localization accuracy), demonstrating that RoFlex retains around 90% recognizability and preserves localization performance even when only 50% of the landmarks are used. For this reason, the work represents an effective contribution to long-term localization within the automotive domain. 
Moreover, the modular design of RoFlex serves as a foundation for further research on filtering non-semantic landmarks.</div></div>","PeriodicalId":100730,"journal":{"name":"ISPRS Open Journal of Photogrammetry and Remote Sensing","volume":"18 ","pages":"Article 100107"},"PeriodicalIF":0.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145579513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
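RoFlex scores each landmark from seven attributes without any training data, then transmits only the best-scoring subset under the bandwidth budget. The selection step can be sketched as a weighted linear score followed by a top-k cut; the three attribute names and weights below are hypothetical placeholders, not the paper's seven attributes.

```python
import numpy as np

# Hypothetical attributes and weights; the paper uses seven distinct
# attributes whose exact definitions are not reproduced here.
WEIGHTS = {
    "observations": 0.4,    # how often the landmark was re-observed
    "view_spread": 0.3,     # angular spread of the observing cameras
    "reproj_error": -0.3,   # mean reprojection error (lower is better)
}

def score(landmark: dict) -> float:
    """Linear attribute score; higher means more useful for localization."""
    return sum(w * landmark[k] for k, w in WEIGHTS.items())

def select(landmarks: list, keep_ratio: float = 0.5) -> list:
    """Return indices of the best-scoring fraction of landmarks,
    emulating a bandwidth budget on vehicle-to-backend transmission."""
    scores = np.array([score(lm) for lm in landmarks])
    k = max(1, int(round(len(landmarks) * keep_ratio)))
    return sorted(np.argsort(scores)[::-1][:k].tolist())

lms = [
    {"observations": 9, "view_spread": 0.8, "reproj_error": 0.5},  # stable
    {"observations": 2, "view_spread": 0.1, "reproj_error": 3.0},  # weak
    {"observations": 7, "view_spread": 0.6, "reproj_error": 0.9},  # good
    {"observations": 1, "view_spread": 0.2, "reproj_error": 4.0},  # weak
]
kept = select(lms, keep_ratio=0.5)
```

A 50% budget keeps the two well-observed, low-error landmarks, mirroring the paper's finding that localization survives aggressive filtering when the retained landmarks are chosen by quality rather than at random.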
Pub Date : 2025-08-01Epub Date: 2025-05-23DOI: 10.1016/j.ophoto.2025.100090
Bradley J. Koskowich , Michael J. Starek , Scott A. King
Cross-view geolocalization (CVGL) is the general problem of establishing correspondence between terrestrial and nadir-oriented imagery. Classical keypoint-matching methods struggle with the extreme pose transitions between cameras in a CVGL configuration, whereas deep neural networks demonstrate superb capacity in this area. Traditional photogrammetry methods such as structure-from-motion (SfM) or simultaneous localization and mapping (SLAM) can technically accomplish CVGL, but they require a sufficiently dense collection of camera views to recover camera pose. This research proposes an alternative CVGL solution: a series of algorithmic operations that fully automate the calculation of target camera pose via a less common photogrammetry method known as monoplotting, also called single-camera resectioning. Monoplotting requires only three inputs: a target terrestrial camera image, a nadir-oriented image, and an underlying digital surface model. 2D-3D point correspondences are derived from these inputs to optimize the target terrestrial camera pose. The proposed method applies affine keypointing, pixel color quantization, and keypoint-neighbor triangulation to codify explicit relationships that augment keypoint matching in a CVGL context. These matching results yield better initial 2D-3D point correlations from monoplotting image pairs, resulting in lower error for single-camera resectioning. To gauge its effectiveness, the proposed methodology is applied to urban, suburban, and natural-environment datasets. It demonstrates an average 42x improvement in feature matching between CVGL image pairs and improves on inconsistent baseline methods by reducing translation errors by 50%-75%.
{"title":"The potential & limitations of monoplotting in cross-view geo-localization conditions","authors":"Bradley J. Koskowich , Michael J. Starek , Scott A. King","doi":"10.1016/j.ophoto.2025.100090","DOIUrl":"10.1016/j.ophoto.2025.100090","url":null,"abstract":"<div><div>Cross-view geolocalization (CVGL) describes the general problem of determining a correlation between terrestrial and nadir oriented imagery. Classical keypoint matching methods find the extreme pose transitions between cameras present in a CVGL configuration challenging to operate in, while deep neural networks demonstrate superb capacity in this area. Traditional photogrammetry methods like structure-from-motion (SfM) or simultaneous localization and mapping (SLAM) can technically accomplish CVGL, but require a sufficiently dense collection of camera views in order to recover camera pose. This research proposes an alternative CVGL solution, a series of algorithmic operations which can completely automate the calculation of target camera pose via a less common photogrammetry method known as monoplotting, also called single camera resectioning. Monoplotting only requires three inputs, which are a target terrestrial camera image, a nadir-oriented image, and an underlying digital surface model. 2D-3D point correspondences are derived from the inputs to optimize for the target terrestrial camera pose. The proposed method applies affine keypointing, pixel color quantization, and keypoint neighbor triangulation to codify explicit relationships used to augment keypoint matching operations done in a CVGL context. These matching results are used to achieve better initial 2D-3D point correlations from monoplotting image pairs, resulting in lower error for single camera resectioning. To gauge the effectiveness of the proposed method, this proposed methodology is applied to urban, suburban, and natural environment datasets. 
This proposed methodology demonstrates an average 42x improvement in feature matching between CVGL image pairs, which improves on inconsistent baseline methodology by reducing translation errors between 50%–75%.</div></div>","PeriodicalId":100730,"journal":{"name":"ISPRS Open Journal of Photogrammetry and Remote Sensing","volume":"17 ","pages":"Article 100090"},"PeriodicalIF":0.0,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144523812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
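Single-camera resectioning, the final step the monoplotting correspondences feed into, solves for a camera's projection matrix from 2D-3D point pairs. A minimal NumPy sketch of the textbook Direct Linear Transform (a standard building block, not the authors' full pipeline; the camera and points below are synthetic):

```python
import numpy as np

def dlt_resection(pts3d, pts2d):
    """Estimate the 3x4 projection matrix P (up to scale) from >= 6
    non-coplanar 2D-3D correspondences via the Direct Linear Transform."""
    rows = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        hp = [X, Y, Z, 1.0]
        # Each correspondence contributes two linear constraints on P.
        rows.append(hp + [0.0] * 4 + [-u * c for c in hp])
        rows.append([0.0] * 4 + hp + [-v * c for c in hp])
    # Least-squares null vector of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    return vt[-1].reshape(3, 4)

def project(P, pts3d):
    """Project 3D points through P and dehomogenize to pixel coordinates."""
    h = P @ np.vstack([np.asarray(pts3d, float).T, np.ones(len(pts3d))])
    return (h[:2] / h[2]).T

# Synthetic camera: simple intrinsics, identity rotation, 10 m standoff.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [10.0]])])
P_true = K @ Rt
pts3d = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
                  [1, 1, 1], [2, 1, 0], [1, 2, 3]], dtype=float)
pts2d = project(P_true, pts3d)
P_est = dlt_resection(pts3d, pts2d)
err = np.abs(project(P_est, pts3d) - pts2d).max()
```

With exact correspondences the recovered matrix reprojects the points essentially perfectly; the quality of the initial 2D-3D correlations, which the paper's matching stage improves, is what governs this error on real data.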