Pub Date: 2026-01-01 | Epub Date: 2026-01-22 | DOI: 10.1016/j.ophoto.2026.100116
Faruk Keskin, Fesih Keskin, Gültekin Işık
Accurate hyperspectral image (HSI) classification under scarce labels and class imbalance requires models that couple long-range spectral reasoning with irregular local spatial context. We present GAST, a Graph-Augmented spectral–spatial Transformer with Adaptive Gated Fusion for Small-Sample Hyperspectral Image Classification. GAST pairs a lightweight spectral Transformer with a GATv2-based spatial branch on an 8-neighbor pixel graph, and fuses them via a center-conditioned, channel-wise gating mechanism that uses the center-pixel representation to modulate all tokens in the patch. Unlike conventional static fusion strategies (e.g., concatenation or summation) that assign fixed importance to modalities regardless of image content, this adaptive fusion dynamically modulates the spectral and spatial streams at the pixel level, allowing the model to prioritize spatial texture for complex urban structures while shifting focus to spectral signatures for subtle vegetation classes. Training is further stabilized by an imbalance-aware objective that switches between weighted cross-entropy and focal loss according to a measured class ratio, and by a two-stage Bayesian hyperparameter search that aligns capacity with scene statistics. Across eight public benchmarks under a 5%-label protocol, GAST consistently matches or surpasses recent hybrid graph-Transformer architectures while remaining compact and fast at inference. Ablation studies confirm the complementary roles of both branches and the benefit of gated fusion. The resulting architecture offers a strong accuracy–efficiency trade-off and reliable performance across seeds, making it a practical solution for low-data HSI applications. The code is publicly available at https://github.com/fesihkeskin/GAST.
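The abstract does not spell out the gating equations. As a rough illustration, a center-conditioned, channel-wise gate could take the following form; the sigmoid projection (`W`, `b`) and the convex-combination update are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(spectral, spatial, W, b, center_idx):
    """Fuse spectral and spatial token streams with a channel-wise gate
    conditioned on the center-pixel representation.

    spectral, spatial: (n_tokens, d) token features from the two branches.
    W, b: (d, d) and (d,) parameters of a hypothetical gate projection.
    center_idx: index of the center-pixel token within the patch.
    """
    center = spectral[center_idx]        # (d,) center-pixel representation
    gate = sigmoid(W @ center + b)       # (d,) channel-wise gate in (0, 1)
    # The same gate modulates every token in the patch: a per-channel
    # convex combination of the two streams.
    return gate * spectral + (1.0 - gate) * spatial
```

Because the gate lies strictly in (0, 1), each fused channel stays between the corresponding spectral and spatial values, so neither stream can be fully discarded.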
Published as: "GAST: A graph-augmented spectral–spatial transformer with adaptive gated fusion for small-sample hyperspectral image classification", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 19, Article 100116.
Pub Date: 2026-01-01 | Epub Date: 2025-12-09 | DOI: 10.1016/j.ophoto.2025.100114
Nicolas Barbier, Pierre Ploton, Hadrien Tulet, Gaëlle Viennois, Hugo Leblanc, Benoît Burban, Maxime Réjou-Méchain, Philippe Verley, James Ball, Denis Feurer, Grégoire Vincent
Light drones provide a cheap and effective tool for monitoring forest canopies, especially in tropical and equatorial contexts where infrastructure and resources are limited. In these regions, good-quality optical satellite images are rare, yet the stakes are highest for characterizing forest function, dynamics, diversity, and phenology, and more generally the vegetation–climate interplay.
We describe a complete photogrammetric processing chain that seeks to optimize the spatial and spectral coherence between repeat image mosaics at centimetric resolution. Our aim is to enable individual tree-level monitoring over tens to hundreds of hectares with consumer-grade equipment (i.e., a quadcopter with a stabilized RGB camera and standard GNSS positioning).
We demonstrate the increase in spatial accuracy achieved using the Time-SIFT and Arosics algorithms, which, individually and synergistically, reduce global and local spatial misalignment between mosaics from several meters to a few centimeters. Time-SIFT increases the robustness of initial image alignment and 3D reconstruction, and hence reduces occasional distortions and data gaps. Agisoft's color and white-balance corrections, combined with vegetation indices, provide a meaningful quantitative signal despite considerable changes in acquisition conditions.
In particular, indices that are less sensitive to illumination changes, such as the green chromatic coordinate (GCC), revealed a seasonal signal over four years of monitoring in the evergreen moist forest at Paracou, French Guiana. The signal was decorrelated from obvious geometric effects (sun elevation) and provided information on the vegetative stage at the tree, species, and stand levels.
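The green chromatic coordinate mentioned above is a standard illumination-robust index, GCC = G / (R + G + B); a minimal per-pixel implementation:

```python
import numpy as np

def green_chromatic_coordinate(rgb):
    """GCC = G / (R + G + B), computed per pixel.

    rgb: float array of shape (..., 3) with bands ordered R, G, B.
    Returns an array with the same leading shape, values in [0, 1].
    """
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=-1)
    # Guard against division by zero on fully dark pixels.
    return np.divide(rgb[..., 1], total,
                     out=np.zeros_like(total), where=total > 0)
```

Because GCC is a ratio of bands, multiplicative brightness changes (a brighter or dimmer scene overall) largely cancel out, which is why it tracks phenology more reliably than raw green reflectance.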
Published as: "Monitoring tropical forests with light drones: ensuring spatial and temporal consistency in stereophotogrammetric products", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 19, Article 100114.
Pub Date: 2025-12-01 | Epub Date: 2025-10-31 | DOI: 10.1016/j.ophoto.2025.100106
Ian A. Ocholla, Janne Heiskanen, Faith Karanja, Mark Boitt, Petri Pellikka
Over the past four decades, rising demand for livestock products in Africa has led to increased stocking rates, resulting in overgrazing and land degradation. As the population is projected to rise, the need for sustainable livestock management is more urgent than ever, yet efforts are hindered by the lack of accurate, up-to-date livestock counts. Recent advances in remote sensing and deep learning have made it possible to count livestock from space. However, the extent to which models trained on aerial imagery can enhance livestock detection in satellite images and across diverse landscapes remains poorly understood. This study assessed the transferability of YOLO, Faster R-CNN, U-Net, and ResNet models for livestock detection across three contrasting landscapes: Choke bushland (Pleiades Neo), Kapiti savanna (WorldView-3), and LUMO open grassland (WorldView-3), using satellite imagery with 0.3 m and 0.4 m spatial resolution. Additionally, we applied multi-stage transfer learning to evaluate the effectiveness of models trained on aerial imagery (0.1 m) in improving livestock detection in satellite imagery. Results indicate that YOLOv5 consistently outperformed the other models, achieving F1 scores of 0.55, 0.67, and 0.85 in Choke, Kapiti, and LUMO, respectively, demonstrating robustness across varying land cover types and sensors. Although segmentation models performed moderately on 0.3 m imagery (F1 scores of 0.51 and 0.40 for Choke and LUMO), their performance dropped significantly on the coarser-resolution (0.4 m) Kapiti imagery (F1 score of 0.14). In addition, multi-stage transfer learning improved the segmentation models' recall by 9.8 % at the heterogeneous bushland site. Our results highlight that integrating multi-source imagery and deep learning can support large-scale livestock monitoring, which is crucial for implementing sustainable rangeland management.
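For reference, the F1 scores reported above are the harmonic mean of precision and recall; a sketch computing the metric from raw detection counts:

```python
def f1_score(tp, fp, fn):
    """F1 from detection counts: true positives, false positives,
    false negatives. Returns 0.0 when the score is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    # Harmonic mean penalizes an imbalance between precision and recall.
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the smaller of the two terms, a detector cannot reach a high F1 by trading recall away for precision (or vice versa).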
Published as: "Towards monitoring livestock using satellite imagery: Transferability of object detection and segmentation models in Kenyan rangelands", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 18, Article 100106.
Pub Date: 2025-12-01 | Epub Date: 2025-11-19 | DOI: 10.1016/j.ophoto.2025.100108
Umut Gunes Sefercik, Ilyas Aydin, Mertcan Nazar
Producing high-quality building digital twins (DT) remains a challenging task. In this study, a methodology is proposed to obtain a precise, georeferenced 3D building model with high geometric and spectral quality, one of the essential components of high-quality DT production, through the fusion of UAV and terrestrial photogrammetric data. To evaluate the performance of the proposed methodology, a complex building with glass facades, entrance porches, outdoor stairs, and architectural coverings was chosen. We present the techniques used to overcome the challenges of multi-source image orientation, spectral enhancement, and precise building model production. Distinct from existing studies, photos from different sources were not merged into a single image pool before photogrammetric processing; instead, geometric and spectral calibrations of the aerial and terrestrial photos were completed separately before data fusion. In this manner, individual dense point clouds were generated using structure from motion (SfM) and denoised by filtering in Bentley ContextCapture. Precise 3D building model production involved first merging the georeferenced point clouds, followed by 3D model generation from the fused cloud. The building model achieved a geometric accuracy (RMSE) of ≤ ±2 cm through the fusion of UAV and terrestrial photogrammetric dense point clouds with accuracies of ±1.87 cm and ±1.17 cm, respectively. In addition, an indoor model was generated by capturing 360° panoramic photos of the building, and a complete virtual tour was created by merging the indoor and outdoor data in the Unity game engine.
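RMSE figures like those quoted above are typically computed against independently surveyed check points; a generic sketch of that computation (not the authors' exact procedure) is:

```python
import numpy as np

def checkpoint_rmse(measured, reference):
    """RMSE between model coordinates and surveyed check points.

    measured, reference: (n, 3) arrays of X, Y, Z coordinates in metres.
    Returns (per_axis, total): per-axis RMSE as a (3,) array, and the
    total 3D RMSE over the Euclidean point errors.
    """
    d = np.asarray(measured, float) - np.asarray(reference, float)
    per_axis = np.sqrt((d ** 2).mean(axis=0))       # RMSE_X, RMSE_Y, RMSE_Z
    total = np.sqrt((d ** 2).sum(axis=1).mean())    # 3D RMSE
    return per_axis, total
```

Reporting per-axis values alongside the 3D total is useful because vertical error in photogrammetric models often differs markedly from planimetric error.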
Published as: "Generation of precise 3D building models for digital twin projects using multi-source data fusion and integration into virtual tours", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 18, Article 100108.
Pub Date: 2025-12-01 | Epub Date: 2025-11-29 | DOI: 10.1016/j.ophoto.2025.100111
Mathieu F. Bilodeau, Travis J. Esau, Mason T. MacDonald, Aitazaz A. Farooque
Circlegrammetry is a new drone photogrammetry technique that utilizes circular flight paths. This approach promises higher efficiency for 3D modelling than traditional grid-based methods. This study evaluates its performance in a Christmas tree (Balsam fir) field, a complex agricultural environment characterized by intricate vegetation geometry. Experiments were conducted in a 2-ha orchard in Truro, Nova Scotia, using a DJI Matrice 300 RTK equipped with a high-resolution optical camera. Three Circlegrammetry missions with varying overlaps (25 and 50 %) and flight heights (40 and 60 m) were compared against standard oblique and smart oblique drone missions flown at a flight height of 60 m. Mission assessments focused on flight efficiency, processing performance, and reconstruction accuracy. The point density of the tree canopy, derived from dense point clouds, was also evaluated across the survey methods. Results demonstrated that Circlegrammetry significantly reduced flight times and the number of images required, particularly at lower overlap configurations. For example, Circlegrammetry with a 25 % overlap completed missions in about half the time required by smart oblique methods and in approximately one-third the duration of standard oblique missions. Processing efficiency similarly favoured Circlegrammetry (25 % overlap), with notable reductions in processing times. In terms of reconstruction quality, Circlegrammetry produced spatially accurate models with ground-control RMSE values ranging from 1.38 to 1.53 cm. These results were comparable to those of traditional oblique methods, despite not utilizing nadir imagery. However, Circlegrammetry showed limitations in capturing lower canopy details, with a larger average point spacing than the other methods. For example, Circle 25 % performed the worst, with an average point spacing of 15.79 mm for the lower canopy.
In contrast, the standard oblique approach performed best, with an average point spacing of 11.89 mm. This suggests constraints inherent to the inward-facing camera orientation and the higher oblique-angle flight paths of Circlegrammetry missions. Overall, Circlegrammetry emerges as a promising method for precision agriculture applications, striking a balance between flight efficiency and reconstruction detail. Circlegrammetry with a 50 % overlap proved a comparable alternative to the smart oblique acquisition method. Future research should focus on optimizing overlap percentages and flight configurations to further improve lower canopy coverage and to generalize these findings across diverse agricultural contexts.
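To make the circular acquisition geometry concrete, here is a minimal sketch of waypoint generation for a single inward-facing orbit. This is a simplification for illustration: real mission planners also handle gimbal pitch, image overlap from the camera footprint, and multiple rings at different radii and heights.

```python
import math

def circular_waypoints(cx, cy, radius, altitude, n_photos):
    """Camera positions and inward-facing yaw for one circular orbit.

    cx, cy: circle centre in local map coordinates (m).
    radius, altitude: orbit radius and flight height (m).
    n_photos: number of evenly spaced camera stations on the circle.
    Returns a list of (x, y, z, yaw_deg) tuples, where yaw points the
    camera at the centre (0 deg = +x axis, counter-clockwise positive).
    """
    waypoints = []
    for i in range(n_photos):
        theta = 2 * math.pi * i / n_photos
        x = cx + radius * math.cos(theta)
        y = cy + radius * math.sin(theta)
        # Facing the centre means pointing opposite the radial direction.
        yaw = math.degrees(theta + math.pi) % 360
        waypoints.append((x, y, altitude, yaw))
    return waypoints
```

The inward-facing yaw is what gives circular missions their strongly convergent, oblique viewing geometry, and also why nadir detail and the lowest canopy layers can be under-sampled relative to grid missions.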
Published as: "Circlegrammetry for drone imaging: Evaluating a novel technique for mission planning and 3D mapping", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 18, Article 100111.
Pub Date: 2025-12-01 | Epub Date: 2025-10-21 | DOI: 10.1016/j.ophoto.2025.100104
Jakobus Möhring, Teja Kattenborn, Miguel D. Mahecha, Yan Cheng, Mirela Beloiu Schwenke, Myriam Cloutier, Martin Denter, Julian Frey, Matthias Gassilloud, Anna Göritz, Jan Hempel, Stéphanie Horion, Tommaso Jucker, Samuli Junttila, Pratima Khatri-Chhetri, Kirill Korznikov, Stefan Kruse, Etienne Laliberté, Michael Maroschek, Paul Neumeier, Clemens Mosig
With tree mortality rates rising across many regions of the world, efficient methods to map dead trees are becoming increasingly important to monitor forest dieback, assess ecological impacts, and guide management strategies. Deep learning-based pattern recognition combined with the high spatial detail of aerial images from drones or airplanes provides an avenue for mapping dead tree crowns or partial canopy dieback, collectively referred to as standing deadwood. However, current methods for mapping standing deadwood are limited to specific biomes or image resolutions. Here, we present a transformer-based semantic segmentation model that generalizes across forest biomes and a wide range of image resolutions (1–28 cm) for mapping both dead tree crowns and partial canopy dieback. Our approach combines a SegFormer-based transformer architecture for image feature extraction with a Focal Tversky Loss to mitigate class imbalance. We used a globally distributed, crowd-sourced dataset of 434 high-resolution aerial images and manual delineations of standing deadwood of widely varying quality. The orthophotos span all major forest biomes and cover 10,778 hectares. To further mitigate imbalances across biomes, resolutions, deadwood occurrence, and image sources, we developed a four-dimensional sampling scheme that ensures balanced representation during training. The models were trained and evaluated on heterogeneous crowd-sourced data, which, as expected, lowers the F1-scores. A visual inspection of independent data nevertheless indicates precise segmentations. Our analysis revealed resolution-dependent performance variations across biomes, suggesting a relationship between optimal mapping resolution and biome-specific characteristics. We make both our model and a machine-learning-ready dataset publicly available on deadtrees.earth to support future research in tree mortality mapping.
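The Focal Tversky Loss mentioned above can be sketched for the binary case as follows; the `alpha`, `beta`, and `gamma` values are common defaults from the loss-function literature, not necessarily this paper's settings:

```python
import numpy as np

def focal_tversky_loss(probs, targets, alpha=0.7, beta=0.3, gamma=0.75,
                       eps=1e-7):
    """Focal Tversky loss for binary segmentation.

    probs: predicted foreground probabilities; targets: {0, 1} labels,
    same shape. alpha weights false negatives, beta false positives;
    alpha > beta favours recall, which helps on rare classes such as
    standing deadwood. gamma < 1 focuses training on hard examples.
    """
    p = np.asarray(probs, float).ravel()
    t = np.asarray(targets, float).ravel()
    tp = (p * t).sum()                   # soft true positives
    fn = ((1 - p) * t).sum()             # soft false negatives
    fp = (p * (1 - t)).sum()             # soft false positives
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return (1.0 - tversky) ** gamma
```

Unlike pixel-wise cross-entropy, the Tversky index is computed over the whole mask, so a rare foreground class still contributes a meaningful gradient even when background pixels dominate.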
Published as: "Global, multi-scale standing deadwood segmentation in centimeter-scale aerial images", ISPRS Open Journal of Photogrammetry and Remote Sensing, vol. 18, Article 100104.
Pub Date: 2025-12-01 | Epub Date: 2025-10-29 | DOI: 10.1016/j.ophoto.2025.100105
Insha Batool, Arshad Ashraf, Muhammad Fahim Khokhar
Karakoram glaciers exhibit steady mass or expansion in the central and western Karakoram, contrasting with the retreat observed in the eastern Karakoram, and respond differently to climatic conditions than glaciers elsewhere in the world, a phenomenon termed the Karakoram Anomaly. The absence of long-term ground-based monitoring of climatic variables and glacier observations, in addition to the region's complex terrain, remote location, and harsh climate, poses a serious challenge to finding a precise explanation for this anomalous glacier behavior and its response to ongoing climate variability. This study compares a high-resolution (10 m) geodetic glacier dataset spanning 1991 to 2022 with climate variables to assess glacier mass balance across different elevations in the Hunza and Shigar basins and to examine its relationship with climatic drivers. We observe that glaciers maintain a stable mass balance regardless of elevation. Above 4500 m above sea level, glaciers exhibit surges under the unique climate warming of the twenty-first century with a slight reduction in snowfall, a phenomenon we refer to as the Karakoram Climate Response Anomaly (KCRA). We find that the unique mountainous terrain and a predominantly north-facing aspect are the main causes of glacier stability despite prevailing warming signatures in the Karakoram range. However, future projections based on CMIP6 ensemble scenarios indicate a challenging future for glacier sustainability, with rising temperatures and declining precipitation, particularly in the western Karakoram. These findings underscore the critical need for continuous field observations of glaciers and climate to better understand and predict glacier responses to evolving climate conditions in the Karakoram.
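A geodetic mass balance of the kind used above converts DEM-differenced surface elevation change into water equivalent via an assumed volume-to-mass density; a minimal sketch (the 850 kg m^-3 conversion density is a widely used assumption in the geodetic literature, not necessarily this study's value):

```python
def geodetic_mass_balance(dh_m, years, rho_conv=850.0, rho_water=1000.0):
    """Specific mass balance from DEM differencing.

    dh_m: glacier-wide mean surface elevation change (m) between two DEMs.
    years: time span between the two acquisitions.
    rho_conv: assumed volume-to-mass conversion density (kg m^-3).
    Returns metres of water equivalent per year (m w.e. a^-1);
    negative values indicate mass loss.
    """
    return dh_m * (rho_conv / rho_water) / years
```

The choice of conversion density is itself a source of uncertainty, since firn and ice contribute differently to the observed elevation change.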
{"title":"Anomalous glaciers response to climate variability in the Karakoram region","authors":"Insha Batool , Arshad Ashraf , Muhammad Fahim Khokhar","doi":"10.1016/j.ophoto.2025.100105","DOIUrl":"10.1016/j.ophoto.2025.100105","url":null,"abstract":"<div><div>Karakoram glaciers exhibit steady mass or expansion in the central and western Karakoram, contrasting with the retreat observed in the eastern Karakoram and respond differently to climatic conditions compared to glaciers in the world<strong>—</strong>a phenomenon termed as the Karakoram Anomaly. The absence of long-term ground-based monitoring of climatic variables and glaciers observations in addition to the region's complex terrain, remote location, and harsh climate pose a serious challenge to find a precise explanation for anomalous glaciers behavior and their response to ongoing climate variability. This study compares a high-resolution (10 m) geodetic glaciers data set from 1991 to 2022 with climate variables to assess the glaciers mass balance condition across different elevations in the Hunza and Shigar basins and to examine their relationship with climatic drivers. We observe that glaciers maintain a stable mass balance regardless of elevation. Above 4500 m above sea level, glaciers exhibit surges under the unique climate warming of the twenty first century with slight reduction in snowfall—a phenomenon we refer to as the Karakoram Climate Response Anomaly (KCRA). We find that the unique mountainous land and a predominantly north-facing aspect are the main cause of glaciers stability despite prevailing warming climate signatures in the Karakoram range. However, future projections based on CMIP6 ensemble scenarios indicate a challenging future for glaciers sustainability, with rising temperatures and declining precipitation, particularly in the western Karakoram. 
These findings underscore the critical need for continuous field observations of glaciers and climate conditions to better understand and predict glacier responses to evolving climate conditions in Karakoram.</div></div>","PeriodicalId":100730,"journal":{"name":"ISPRS Open Journal of Photogrammetry and Remote Sensing","volume":"18 ","pages":"Article 100105"},"PeriodicalIF":0.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145466832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ground filtering algorithms (GFs) are widely used in point cloud processing to generate digital terrain models. Existing GFs typically rely on rule-based or machine-learning approaches to separate ground and non-ground points within an airborne point cloud. However, they often struggle to accurately extract ground points in scenarios containing mountains and heterogeneous buildings. To enhance the accuracy and robustness of ground filtering for airborne point clouds, we propose a data-driven morphological filtering algorithm (DMF). DMF begins by identifying near-ground voxel centroids after voxelizing the input point cloud. Next, a digital elevation model is constructed from the elevation information of these near-ground voxel centroids. A composite morphological filter is then designed to identify ground and non-ground patches within the digital elevation model before labeling their inner near-ground voxel centroids as GF-support nodes. The composite morphological filter is used to recognize non-ground areas with incomplete edge structures depicted in the input point cloud and to correct misclassified areas. Finally, a bidirectional k-dimensional tree search engine is built between the GF-support nodes and the input point cloud to separate ground and non-ground points. Experimental results show that DMF achieves an average F-score greater than 0.88, demonstrating robustness in generating digital terrain models across various test scenarios. Furthermore, the intermediate outputs of DMF enable instance segmentation of artificial objects in airborne point clouds. The code for DMF will be shared on GitHub (https://github.com/wbx1727031/DMF).
{"title":"A data-driven morphological filtering algorithm for digital terrain model generation from airborne LiDAR data","authors":"Bingxiao Wu , Xingxing Zhou , Junhong Zhao , Wuming Zhang , Guang Zheng","doi":"10.1016/j.ophoto.2025.100102","DOIUrl":"10.1016/j.ophoto.2025.100102","url":null,"abstract":"<div><div>Ground filtering algorithms (GFs) are widely used in point cloud processing to generate digital terrain models. Existing GFs typically rely on rule-based or machine learning approaches to separate ground and non-ground points within an airborne point cloud. However, they often struggle to accurately extract ground points in scenarios containing mountains and heterogeneous buildings. To enhance the accuracy and robustness of ground filtering for airborne point clouds, we propose a data-driven morphological filtering algorithm (DMF). DMF begins by identifying near-ground voxel centroids after voxelizing the input point clouds. Next, a digital elevation model is constructed based on the elevation information of these near-ground voxel centroids. A composite morphological filter is then designed to identify ground and non-ground patches within the digital elevation model before labeling their inner near-ground voxel centroids as GF-support nodes. The composite morphological filter is used to recognize non-ground areas with incomplete edge structures depicted in the input point cloud and to correct misclassified areas. Finally, a bidirectional <em>k</em>-dimensional tree search engine is built between the GF-support nodes and the input point cloud to separate ground and non-ground points. Experimental results show that DMF achieves ground filtering accuracy with an average F-score greater than 0.88, demonstrating robustness in generating digital terrain models across various test scenarios. Furthermore, the intermediate outputs of DMF enable instance segmentation of artificial objects in airborne point clouds. 
The code for DMF will be shared on GitHub (<span><span>https://github.com/wbx1727031/DMF</span></span>).</div></div>","PeriodicalId":100730,"journal":{"name":"ISPRS Open Journal of Photogrammetry and Remote Sensing","volume":"18 ","pages":"Article 100102"},"PeriodicalIF":0.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145190142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
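The composite morphological filter at the core of DMF builds on a classical idea: a grey-scale opening with a window wider than typical objects suppresses buildings and vegetation while preserving terrain, so cells far above the opened surface can be flagged as non-ground. A minimal NumPy-only sketch of that idea follows; the window size, height threshold, and synthetic scene are illustrative assumptions, not the authors' DMF parameters.

```python
import numpy as np

def _window_filter(grid, window, func):
    """Apply func (np.min or np.max) over a sliding window, with edge padding."""
    pad = window // 2
    padded = np.pad(grid, pad, mode="edge")
    view = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    return func(view, axis=(-2, -1))

def morphological_ground_mask(dem, window=5, height_thresh=2.0):
    """Label cells as ground where the DEM sits close to its morphological
    opening; features narrower than `window` (buildings, trees) are removed
    by the opening and therefore flagged as non-ground."""
    eroded = _window_filter(dem, window, np.min)
    opened = _window_filter(eroded, window, np.max)  # opening = erosion then dilation
    return (dem - opened) < height_thresh

# Synthetic scene: a gentle slope with a 10 m-high 4x4-cell "building".
xx = np.tile(np.arange(20, dtype=float), (20, 1))
dem = 0.1 * xx
dem[8:12, 8:12] += 10.0
mask = morphological_ground_mask(dem)
```

Because no 5x5 window fits entirely inside the 4x4 building, the erosion removes it completely, so the opened surface tracks the slope and the building cells fail the height test.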
Pub Date : 2025-12-01Epub Date: 2025-11-13DOI: 10.1016/j.ophoto.2025.100107
Tobias Fichtmueller, Alexander Witt, Christoph Holst
Visual Simultaneous Localization and Mapping (VSLAM) provides a reliable option for the precise vehicle localization required for planning and executing autonomous driving maneuvers, especially in areas where traditional GNSS-based systems fail. Our current objective is therefore to transmit the generated 3D points (non-semantic landmarks) to the data backend and store them in a map layer for use as future localization support. However, the limited bandwidth between the vehicle and the data backend requires filtering the landmarks before transmission.
This paper introduces RoFlex, a robust and flexible approach for filtering non-semantic landmarks within the calculation front-end of a VSLAM system. Given the bandwidth restrictions in vehicle-to-data-backend communication, RoFlex selects landmarks beneficial to long-term localization based on their stability, accuracy, and recognizability. In contrast to existing approaches that rely on training data, RoFlex computes an individual score for each landmark using seven distinct attributes to assess their suitability as localization support. The methodology was qualitatively evaluated on several datasets and identified stable, accurate, and recognizable landmarks across different environments and conditions. In addition, we conducted a quantitative evaluation based on three experiments (recognizability, stability, and localization accuracy), demonstrating that RoFlex retains around 90% recognizability and preserves localization performance even when only 50% of the landmarks are used. For this reason, the work represents an effective contribution to long-term localization within the automotive domain. Moreover, the modular design of RoFlex serves as a foundation for further research on filtering non-semantic landmarks.
{"title":"RoFlex: Robust and flexible filtering of non-semantic landmarks for automotive applications","authors":"Tobias Fichtmueller, Alexander Witt, Christoph Holst","doi":"10.1016/j.ophoto.2025.100107","DOIUrl":"10.1016/j.ophoto.2025.100107","url":null,"abstract":"<div><div>Visual Simultaneous Localization and Mapping (VSLAM) provides a reliable option for the precise vehicle localization required for planning and executing autonomous driving maneuvers, especially in areas where traditional GNSS-based systems fail. Therefore, our current objective is to transmit the generated 3D points (non-semantic landmarks) to the data backend to store them in a map-layer for application as future localization support. However, the limited bandwidth between the vehicle and the data backend requires filtering the landmarks before transmission.</div><div>This paper introduces RoFlex, a robust and flexible approach for filtering non-semantic landmarks within the calculation front-end of a VSLAM system. Given the bandwidth restrictions in vehicle-to-data-backend communication, RoFlex selects landmarks beneficial to long-term localization based on their stability, accuracy, and recognizability. In contrast to existing approaches that rely on training data, RoFlex computes an individual score for each landmark using seven distinct attributes to assess their suitability as localization support. The methodology was qualitatively evaluated on several datasets and identified stable, accurate, and recognizable landmarks across different environments and conditions. In addition, we conducted a quantitative evaluation based on three experiments (recognizability, stability, and localization accuracy), demonstrating that RoFlex retains around 90% recognizability and preserves localization performance even when only 50% of the landmarks are used. For this reason, the work represents an effective contribution to long-term localization within the automotive domain. 
Moreover, the modular design of RoFlex serves as a foundation for further research on filtering non-semantic landmarks.</div></div>","PeriodicalId":100730,"journal":{"name":"ISPRS Open Journal of Photogrammetry and Remote Sensing","volume":"18 ","pages":"Article 100107"},"PeriodicalIF":0.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145579513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
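RoFlex scores each landmark from seven attributes without any training data, then transmits only the best-scoring subset under the bandwidth budget. The selection step can be sketched as a weighted linear score followed by a top-k cut; the three attribute names and weights below are hypothetical placeholders, not the paper's seven attributes.

```python
import numpy as np

# Hypothetical attributes and weights; the paper uses seven distinct
# attributes whose exact definitions are not reproduced here.
WEIGHTS = {
    "observations": 0.4,    # how often the landmark was re-observed
    "view_spread": 0.3,     # angular spread of the observing cameras
    "reproj_error": -0.3,   # mean reprojection error (lower is better)
}

def score(landmark: dict) -> float:
    """Linear attribute score; higher means more useful for localization."""
    return sum(w * landmark[k] for k, w in WEIGHTS.items())

def select(landmarks: list, keep_ratio: float = 0.5) -> list:
    """Return indices of the best-scoring fraction of landmarks,
    emulating a bandwidth budget on vehicle-to-backend transmission."""
    scores = np.array([score(lm) for lm in landmarks])
    k = max(1, int(round(len(landmarks) * keep_ratio)))
    return sorted(np.argsort(scores)[::-1][:k].tolist())

lms = [
    {"observations": 9, "view_spread": 0.8, "reproj_error": 0.5},  # stable
    {"observations": 2, "view_spread": 0.1, "reproj_error": 3.0},  # weak
    {"observations": 7, "view_spread": 0.6, "reproj_error": 0.9},  # good
    {"observations": 1, "view_spread": 0.2, "reproj_error": 4.0},  # weak
]
kept = select(lms, keep_ratio=0.5)
```

A 50% budget keeps the two well-observed, low-error landmarks, mirroring the paper's finding that localization survives aggressive filtering when the retained landmarks are chosen by quality rather than at random.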
Pub Date : 2025-08-01Epub Date: 2025-05-23DOI: 10.1016/j.ophoto.2025.100090
Bradley J. Koskowich , Michael J. Starek , Scott A. King
Cross-view geolocalization (CVGL) is the general problem of establishing correspondence between terrestrial and nadir-oriented imagery. Classical keypoint-matching methods struggle with the extreme pose transitions between cameras in a CVGL configuration, whereas deep neural networks demonstrate superb capacity in this area. Traditional photogrammetry methods such as structure-from-motion (SfM) or simultaneous localization and mapping (SLAM) can technically accomplish CVGL, but they require a sufficiently dense collection of camera views to recover camera pose. This research proposes an alternative CVGL solution: a series of algorithmic operations that fully automate the calculation of target camera pose via a less common photogrammetry method known as monoplotting, also called single-camera resectioning. Monoplotting requires only three inputs: a target terrestrial camera image, a nadir-oriented image, and an underlying digital surface model. 2D-3D point correspondences are derived from these inputs to optimize the target terrestrial camera pose. The proposed method applies affine keypointing, pixel color quantization, and keypoint-neighbor triangulation to codify explicit relationships that augment keypoint matching in a CVGL context. These matching results yield better initial 2D-3D point correlations from monoplotting image pairs, resulting in lower error for single-camera resectioning. To gauge its effectiveness, the proposed methodology is applied to urban, suburban, and natural-environment datasets. It demonstrates an average 42x improvement in feature matching between CVGL image pairs and improves on inconsistent baseline methods by reducing translation errors by 50%-75%.
{"title":"The potential & limitations of monoplotting in cross-view geo-localization conditions","authors":"Bradley J. Koskowich , Michael J. Starek , Scott A. King","doi":"10.1016/j.ophoto.2025.100090","DOIUrl":"10.1016/j.ophoto.2025.100090","url":null,"abstract":"<div><div>Cross-view geolocalization (CVGL) describes the general problem of determining a correlation between terrestrial and nadir oriented imagery. Classical keypoint matching methods find the extreme pose transitions between cameras present in a CVGL configuration challenging to operate in, while deep neural networks demonstrate superb capacity in this area. Traditional photogrammetry methods like structure-from-motion (SfM) or simultaneous localization and mapping (SLAM) can technically accomplish CVGL, but require a sufficiently dense collection of camera views in order to recover camera pose. This research proposes an alternative CVGL solution, a series of algorithmic operations which can completely automate the calculation of target camera pose via a less common photogrammetry method known as monoplotting, also called single camera resectioning. Monoplotting only requires three inputs, which are a target terrestrial camera image, a nadir-oriented image, and an underlying digital surface model. 2D-3D point correspondences are derived from the inputs to optimize for the target terrestrial camera pose. The proposed method applies affine keypointing, pixel color quantization, and keypoint neighbor triangulation to codify explicit relationships used to augment keypoint matching operations done in a CVGL context. These matching results are used to achieve better initial 2D-3D point correlations from monoplotting image pairs, resulting in lower error for single camera resectioning. To gauge the effectiveness of the proposed method, this proposed methodology is applied to urban, suburban, and natural environment datasets. 
This proposed methodology demonstrates an average 42x improvement in feature matching between CVGL image pairs, which improves on inconsistent baseline methodology by reducing translation errors between 50%–75%.</div></div>","PeriodicalId":100730,"journal":{"name":"ISPRS Open Journal of Photogrammetry and Remote Sensing","volume":"17 ","pages":"Article 100090"},"PeriodicalIF":0.0,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144523812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
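Single-camera resectioning, the final step the monoplotting correspondences feed into, solves for a camera's projection matrix from 2D-3D point pairs. A minimal NumPy sketch of the textbook Direct Linear Transform (a standard building block, not the authors' full pipeline; the camera and points below are synthetic):

```python
import numpy as np

def dlt_resection(pts3d, pts2d):
    """Estimate the 3x4 projection matrix P (up to scale) from >= 6
    non-coplanar 2D-3D correspondences via the Direct Linear Transform."""
    rows = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        hp = [X, Y, Z, 1.0]
        # Each correspondence contributes two linear constraints on P.
        rows.append(hp + [0.0] * 4 + [-u * c for c in hp])
        rows.append([0.0] * 4 + hp + [-v * c for c in hp])
    # Least-squares null vector of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    return vt[-1].reshape(3, 4)

def project(P, pts3d):
    """Project 3D points through P and dehomogenize to pixel coordinates."""
    h = P @ np.vstack([np.asarray(pts3d, float).T, np.ones(len(pts3d))])
    return (h[:2] / h[2]).T

# Synthetic camera: simple intrinsics, identity rotation, 10 m standoff.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [10.0]])])
P_true = K @ Rt
pts3d = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
                  [1, 1, 1], [2, 1, 0], [1, 2, 3]], dtype=float)
pts2d = project(P_true, pts3d)
P_est = dlt_resection(pts3d, pts2d)
err = np.abs(project(P_est, pts3d) - pts2d).max()
```

With exact correspondences the recovered matrix reprojects the points essentially perfectly; the quality of the initial 2D-3D correlations, which the paper's matching stage improves, is what governs this error on real data.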