
Latest publications from ISPRS Open Journal of Photogrammetry and Remote Sensing

Accuracy analysis and improvement of worldwide elevation models by ICESat-2 data
Pub Date : 2026-01-01 DOI: 10.1016/j.ophoto.2026.100115
Karsten Jacobsen , Gürcan Büyüksalih , Cem Gazioglu
Global or near-global elevation models such as the TanDEM-X Edited Digital Elevation Model (EDEM), ALOS World 3D (AW3D), SRTM, and ASTER GDEM (GDEM) show systematic errors that can be determined, and accounted for, using accurate reference Digital Elevation Models (DEMs). Of course, if accurate DEMs are available they can be used instead, but ICESat-2 satellite LiDAR profiles provide accurate reference data that can be used to improve the freely available DEMs worldwide. The density of ICESat-2 LiDAR ground points in the ATL08 data is limited to approximately 100 m in the orbit direction and can exceed 1 km between orbits, but this is sufficient for improving the free DEMs. The correction can involve a simple elevation shift, a model tilt, or a correction of higher-order systematic errors.
A comparison with reference data from an aerial survey revealed a bias of roughly 2 m, which could also be due to the aerial photogrammetric reference data. A comparison of the ICESat-2 data with the freely available AW3D and EDEM yielded satisfactory results after improvement by bias and model tilt correction. Using the height model tilt and shift determined by ICESat-2, the original RMSZ of EDEM relative to the reference DTM was reduced from 2.23 m to 1.30 m, and that of AW3D from 1.81 m to 1.64 m. In this data set the influence of systematic errors is limited: the correction reduced the deformations slightly, but the accuracy figures improved only negligibly. SRTM and ASTER GDEM should no longer be used, given their significantly lower accuracy compared to EDEM and AW3D.
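The bias and tilt correction described above amounts to fitting a low-order trend surface to the height differences at the ICESat-2 ground points. A minimal sketch, assuming the differences dz are already computed at points (x, y) and using a plain least-squares plane fit; function names are illustrative, not the authors' implementation:

```python
import numpy as np

def fit_shift_and_tilt(x, y, dz):
    """Fit dz = a + b*x + c*y (bias plus model tilt) by least squares.

    x, y : horizontal coordinates of ICESat-2 ground points (metres)
    dz   : DEM height minus ICESat-2 height at those points
    Returns the coefficients (a, b, c).
    """
    A = np.column_stack([np.ones_like(x), x, y])
    coeffs, *_ = np.linalg.lstsq(A, dz, rcond=None)
    return coeffs

def correct_dem(dem, cell_x, cell_y, coeffs):
    """Subtract the fitted bias/tilt surface from a DEM grid.

    cell_x, cell_y : 2D arrays holding the x/y coordinate of each cell.
    """
    a, b, c = coeffs
    return dem - (a + b * cell_x + c * cell_y)

# Tiny synthetic check: points with a known 1.5 m bias and a mild tilt.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1000, 500)
y = rng.uniform(0, 1000, 500)
dz = 1.5 + 0.001 * x - 0.0005 * y + rng.normal(0, 0.1, 500)
print(fit_shift_and_tilt(x, y, dz))  # approx. [1.5, 0.001, -0.0005]
```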
Citations: 0
GAST: A graph-augmented spectral–spatial transformer with adaptive gated fusion for small-sample hyperspectral image classification
Pub Date : 2026-01-01 DOI: 10.1016/j.ophoto.2026.100116
Faruk Keskin , Fesih Keskin , Gültekin Işık
Accurate hyperspectral image (HSI) classification under scarce labels and class imbalance requires models that couple long-range spectral reasoning with irregular local spatial context. We present GAST, a Graph-Augmented spectral–spatial Transformer with Adaptive Gated Fusion for Small-Sample Hyperspectral Image Classification. GAST pairs a lightweight spectral Transformer with a GATv2-based spatial branch on an 8-neighbor pixel graph, and fuses them via a center-conditioned, channel-wise gating mechanism that uses the center-pixel representation to modulate all tokens in the patch. Unlike conventional static fusion strategies (e.g., concatenation or summation) that assign fixed importance to modalities regardless of image content, this adaptive fusion dynamically modulates the spectral and spatial streams at the pixel level, allowing the model to prioritize spatial texture for complex urban structures while shifting focus to spectral signatures for subtle vegetation classes. Training is further stabilized by an imbalance-aware objective that switches between weighted cross-entropy and focal loss according to a measured class ratio, and by a two-stage Bayesian hyperparameter search that aligns capacity with scene statistics. Across eight public benchmarks under a 5%-label protocol, GAST consistently matches or surpasses recent hybrid graph-Transformer architectures while remaining compact and fast at inference. Ablation studies confirm the complementary roles of both branches and the benefit of gated fusion. The resulting architecture offers a strong accuracy–efficiency trade-off and reliable performance across seeds, making it a practical solution for low-data HSI applications. The code is publicly available at https://github.com/fesihkeskin/GAST.
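The center-conditioned, channel-wise gating can be pictured as a learned convex mix of the two token streams, with the gate computed from the center-pixel representation. A minimal PyTorch sketch under that reading; module and argument names are illustrative, and the exact wiring in GAST may differ (the authors' code is at the URL below):

```python
import torch
import torch.nn as nn

class CenterGatedFusion(nn.Module):
    """Sketch of a center-conditioned, channel-wise gate. The center-pixel
    token modulates how the spectral and spatial streams are mixed for
    every token in the patch."""

    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, spec_tokens, spat_tokens, center_idx):
        # spec_tokens, spat_tokens: (batch, n_tokens, dim)
        center = spec_tokens[:, center_idx, :]    # (batch, dim)
        g = self.gate(center).unsqueeze(1)        # (batch, 1, dim), in (0, 1)
        # Convex channel-wise mix, broadcast over all tokens in the patch.
        return g * spec_tokens + (1.0 - g) * spat_tokens

fusion = CenterGatedFusion(dim=64)
spec = torch.randn(2, 81, 64)   # e.g. a 9x9 patch -> 81 tokens
spat = torch.randn(2, 81, 64)
fused = fusion(spec, spat, center_idx=40)
print(fused.shape)  # torch.Size([2, 81, 64])
```

Unlike concatenation or summation, the gate g depends on the image content at the center pixel, which is what lets the mix shift between spatial texture and spectral signatures per sample.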
Citations: 0
Map2ImLas: Large-scale 2D-3D airborne dataset with map-based annotations
Pub Date : 2025-12-15 DOI: 10.1016/j.ophoto.2025.100112
Geethanjali Anjanappa, Sander Oude Elberink, Abhisek Maiti, Yaping Lin, George Vosselman
Airborne data are commonly used in mapping, urban planning, and environmental monitoring. However, deep learning (DL) for these tasks is often limited by the lack of large-scale multimodal datasets that represent diverse landscapes and detailed classes. In this work, we present Map2ImLas, a large-scale dataset created using topographic maps and high-resolution airborne data from the Netherlands. The dataset includes 2413 spatially matching tiles of maps, 2D orthoimages, digital surface models, and 3D point clouds, covering approximately 217 km2 across urban, suburban, industrial, rural, and forested areas. Map2ImLas provides per-pixel and per-point annotations for 20 different classes, applicable to both 2D and 3D semantic segmentation, as well as vector polygons for object delineation tasks. The proposed labeling process is fully automated, dynamically aligning all data sources to generate structured and consistent annotations. The pipeline is scalable and can be adapted to other regions in the Netherlands with minimal changes. We also introduce a DL-based workflow for labeling trees in 3D point clouds, using map data as semantic priors. To support DL applications, we provide a two-fold data split with non-overlapping training, validation, and test tiles. The dataset is benchmarked using several state-of-the-art 2D and 3D segmentation models to demonstrate its usability for semantic segmentation tasks. While the present evaluation focuses on segmentation, the structured vector annotations also enable future research on boundary extraction and object delineation. Overall, Map2ImLas reduces the need for manual annotation by reusing existing map data and supports geospatial AI in large-scale mapping and semantic labeling for multimodal data.
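The automated labeling step, turning map polygons into per-pixel annotations, is at its core vector-to-raster burning. A minimal sketch with rasterio and shapely, using hypothetical polygons and class codes; this shows the general technique, not the authors' pipeline:

```python
import numpy as np
from rasterio.features import rasterize
from rasterio.transform import from_origin
from shapely.geometry import Polygon

# Hypothetical map polygons with integer class codes (e.g. 1 = building, 2 = road).
polygons = [
    (Polygon([(10, 10), (40, 10), (40, 40), (10, 40)]), 1),
    (Polygon([(0, 45), (100, 45), (100, 55), (0, 55)]), 2),
]

# 1 m pixels, a 100 x 100 tile anchored at map coordinate (0, 100).
transform = from_origin(0, 100, 1.0, 1.0)
labels = rasterize(
    polygons,                 # iterable of (geometry, value) pairs
    out_shape=(100, 100),
    transform=transform,
    fill=0,                   # 0 = unlabeled background
    dtype=np.uint8,
)
print(np.unique(labels))  # [0 1 2]
```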
Citations: 0
Monitoring tropical forests with light drones: ensuring spatial and temporal consistency in stereophotogrammetric products
Pub Date : 2025-12-09 DOI: 10.1016/j.ophoto.2025.100114
Nicolas Barbier , Pierre Ploton , Hadrien Tulet , Gaëlle Viennois , Hugo Leblanc , Benoît Burban , Maxime Réjou-Méchain , Philippe Verley , James Ball , Denis Feurer , Grégoire Vincent
Light drones provide a cheap and effective tool for monitoring the forest canopy, especially in tropical and equatorial contexts where infrastructure and resources are limited. In these regions, good-quality optical satellite images are rare, yet the stakes are highest for characterizing forest function, dynamics, diversity, and phenology, and more generally the vegetation-climate interplay.
We describe a complete processing chain based on photogrammetric tools that seeks to optimize the spatial and spectral coherence between repeat image mosaics at centimetric resolution. Our target is to allow individual tree-level monitoring over scales of tens to hundreds of hectares with consumer-grade equipment (i.e., a quadcopter with a stabilized RGB camera and standard GNSS positioning).
We demonstrate the increase in spatial accuracy achieved using the Time-SIFT and Arosics algorithms, which, individually and in combination, reduce global and local spatial misalignment between mosaics from several meters to a few centimeters. Time-SIFT provides increased robustness in initial image alignment and 3D reconstruction, and hence reduces occasional distortions or data gaps. Using Agisoft's color and white balance corrections combined with vegetation indices provides a meaningful quantitative signal despite considerable changes in acquisition conditions.
In particular, indices that are less sensitive to illumination changes, such as the green chromatic coordinate (GCC), revealed a seasonal signal over four years of monitoring in the evergreen moist forest at Paracou in French Guiana. The signal was decorrelated from the obvious geometric effect (sun elevation) and provided information on the vegetative stage at the tree, species, and stand levels.
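The green chromatic coordinate is simply the green fraction of the RGB sum, which is why a global linear change in illumination largely cancels out. A minimal sketch:

```python
import numpy as np

def green_chromatic_coordinate(rgb):
    """GCC = G / (R + G + B), computed per pixel.

    rgb : array of shape (H, W, 3); any global linear scaling of the
    three bands cancels in the ratio, which is what makes GCC relatively
    insensitive to illumination changes.
    """
    rgb = rgb.astype(np.float64)
    total = rgb.sum(axis=-1)
    with np.errstate(invalid="ignore", divide="ignore"):
        gcc = np.where(total > 0, rgb[..., 1] / total, np.nan)
    return gcc

mosaic = np.random.randint(0, 256, (4, 4, 3))  # stand-in for an orthomosaic tile
print(np.nanmean(green_chromatic_coordinate(mosaic)))
```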
Citations: 0
Generative deep learning models for cloud removal in satellite imagery: A comparative review of GANs and diffusion methods
Pub Date : 2025-12-04 DOI: 10.1016/j.ophoto.2025.100110
Shanika Edirisinghe, Bianca Schoen-Phelan, Svetlana Hensman
Satellite imagery provides essential geospatial data for a wide range of remote sensing applications, including environmental monitoring, disaster management, urban planning, and land-use studies. However, cloud cover often degrades the clarity and reliability of satellite images, reducing their usefulness. With advances in deep learning, generative models, particularly Generative Adversarial Networks (GANs) and denoising diffusion models, have emerged as promising solutions for cloud removal in satellite imagery. This review systematically evaluates GAN-based and diffusion-based methods, comparing their strengths, limitations, and performance across diverse geographic and cloud conditions. The analysis shows that GANs generate visually realistic outputs through adversarial training, while diffusion models offer superior spatial and structural fidelity thanks to their iterative noise reduction. Integrating auxiliary data such as Synthetic Aperture Radar (SAR) imagery further enhances cloud-removal accuracy. The review highlights current challenges and identifies research gaps to support future innovation in satellite image restoration, particularly in cloud removal and generative deep learning for remote sensing.
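The adversarial training the review describes condenses to alternating discriminator and generator updates conditioned on the cloudy input (and, optionally, a co-registered SAR channel). A minimal pix2pix-style sketch with placeholder networks and random tensors; the architectures, channel counts, and L1 weight are illustrative assumptions, not any specific published model:

```python
import torch
import torch.nn as nn

# Placeholder nets: a real cloud-removal model would use a U-Net generator
# and a PatchGAN discriminator, among other refinements.
G = nn.Sequential(nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1))
D = nn.Sequential(nn.Conv2d(7, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 1, 3, padding=1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()

cloudy = torch.randn(2, 3, 64, 64)   # cloudy optical patch
sar = torch.randn(2, 1, 64, 64)      # auxiliary SAR channel
clear = torch.randn(2, 3, 64, 64)    # cloud-free target

# Discriminator step: real pair (condition, clear) vs fake pair (condition, G output).
fake = G(torch.cat([cloudy, sar], dim=1))
d_real = D(torch.cat([cloudy, sar, clear], dim=1))
d_fake = D(torch.cat([cloudy, sar, fake.detach()], dim=1))
loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: fool D while staying close to the target (L1 term, as in pix2pix).
d_fake = D(torch.cat([cloudy, sar, fake], dim=1))
loss_g = bce(d_fake, torch.ones_like(d_fake)) + 100.0 * l1(fake, clear)
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```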
Citations: 0
Circlegrammetry for drone imaging: Evaluating a novel technique for mission planning and 3D mapping
Pub Date : 2025-12-01 DOI: 10.1016/j.ophoto.2025.100111
Mathieu F. Bilodeau , Travis J. Esau , Mason T. MacDonald , Aitazaz A. Farooque
Circlegrammetry is a new drone photogrammetry technique that utilizes circular flight paths. This approach promises higher efficiency for 3D modelling compared to traditional grid-based methods. This study evaluates its performance in a Christmas tree (balsam fir) field, a complex agricultural environment characterized by intricate vegetation geometry. Experiments were conducted in a 2-ha orchard located in Truro, Nova Scotia, using a DJI Matrice 300 RTK equipped with a high-resolution optical camera. Three Circlegrammetry missions with varying overlaps (25 and 50 %) and flight heights (40 and 60 m) were compared against standard oblique and smart oblique drone missions flown at a flight height of 60 m. Mission assessments focused on flight efficiency, processing performance, and reconstruction accuracy. The point density of the tree canopy, derived from the dense point clouds, was also evaluated across the different survey methods. Results demonstrated that Circlegrammetry significantly reduced flight times and the number of images required, particularly at lower overlap configurations. For example, Circlegrammetry with a 25 % overlap completed a mission in about half the time required for smart oblique methods and in approximately one-third the duration of standard oblique missions. Processing efficiency similarly favoured Circlegrammetry (25 % overlap), with notable reductions in processing times. In terms of reconstruction quality, Circlegrammetry produced spatially accurate models with ground-control RMSE values ranging from 1.38 to 1.53 cm. These results were comparable to those of traditional oblique methods, despite not utilizing nadir imagery. However, Circlegrammetry showed limitations in capturing lower-canopy details on the trees, with a larger average point spacing than the other methods. For example, Circle 25 % performed the worst, with an average point spacing of 15.79 mm for the lower canopy, whereas the standard oblique approach performed the best, with an average point spacing of 11.89 mm. This suggests constraints inherent to the inward-facing camera and the higher oblique-angle flight paths of Circlegrammetry missions. Overall, Circlegrammetry emerges as a promising method for precision agriculture applications by striking a balance between flight efficiency and reconstruction detail. Circlegrammetry with a 50 % overlap was demonstrated to be a comparable alternative to the smart oblique acquisition method. Future research should focus on optimizing overlap percentages and flight configurations to further improve lower-canopy coverage and to generalize these findings across diverse agricultural contexts.
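A circular mission reduces to sampling waypoints on a circle and yawing the camera toward the centre. A minimal geometric sketch in local metric coordinates; parameter names are illustrative and not tied to any flight-planning API:

```python
import math

def circle_waypoints(cx, cy, radius_m, agl_m, n_points, inward=True):
    """Waypoints on a circular flight path in local metric coordinates.

    Each waypoint carries a yaw (degrees, math convention) pointing at the
    circle centre when inward=True, mimicking the inward-facing camera of
    circular missions.
    """
    waypoints = []
    for i in range(n_points):
        theta = 2.0 * math.pi * i / n_points
        x = cx + radius_m * math.cos(theta)
        y = cy + radius_m * math.sin(theta)
        # Heading from the waypoint towards the centre (or tangentially outward).
        yaw = math.degrees(math.atan2(cy - y, cx - x)) if inward else math.degrees(theta)
        waypoints.append((x, y, agl_m, yaw))
    return waypoints

for wp in circle_waypoints(0, 0, radius_m=50, agl_m=60, n_points=8):
    print("E=%.1f N=%.1f alt=%.0f yaw=%.1f" % wp)
```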
Citations: 0
Direct 3D mapping with a 2D LiDAR using sparse reference maps
Pub Date : 2025-11-28 DOI: 10.1016/j.ophoto.2025.100109
Eugeniu Vezeteu , Aimad El Issaoui , Heikki Hyyti , Jesse Muhojoki , Petri Manninen , Teemu Hakala , Eric Hyyppä , Antero Kukko , Harri Kaartinen , Ville Kyrki , Juha Hyyppä
Precise 3D mapping is crucial for a wide range of geospatial applications, including forest monitoring, infrastructure assessment, and autonomous navigation. While 2D Light Detection and Ranging (LiDAR) sensors offer superior range accuracy and higher point density compared to many 3D LiDARs, their limited sensing geometry makes full 3D reconstruction challenging. In this paper, we address these limitations and achieve robust 3D mapping by proposing a direct method for integrating 2D LiDAR with a 6 Degrees of Freedom (DoF) trajectory and sparse 3D reference maps derived from mobile laser scanning (MLS) or airborne laser scanning (ALS). Our method begins with an initial 6 DoF trajectory and performs batch optimisation by jointly co-registering buffered 2D LiDAR scans to a 3D reference map, enhancing both trajectory accuracy and mapping completeness without relying on 2D scans’ overlap or segmentation. We also introduce a novel, targetless extrinsic calibration approach between 2D LiDAR, 3D LiDAR, and a Global Navigation Satellite System–Inertial Navigation System (GNSS–INS) system that does not rely on overlapping sensor Field of View (FOV). We validate our approach in forest road environments using sparse ALS or MLS reference maps and initial poses from GNSS–INS or 3D LiDAR-inertial odometry. Experiments in forest roads achieved mean localisation accuracies of 0.1 m (using 3D MLS initialisation) and 0.16 m (using GNSS–INS initialisation), reducing drift by up to nine times in translation and six times in rotation. The extrinsic calibration method converges even with initial misalignments of up to 40° in rotation and 3 m in translation. The proposed framework enables multi-platform, multi-temporal data fusion, offering a practical solution for field deployment and map correction tasks.
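The core co-registration idea, associating scan points with their nearest reference-map points and solving for a rigid transform, can be sketched as a toy point-to-point ICP. This is only the basic building block under simplifying assumptions; the paper's batch optimisation over a full 6 DoF trajectory is considerably more involved:

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ~ R @ src + t (Kabsch)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp_to_reference(scan, ref_map, iterations=20):
    """Toy scan-to-map ICP: iterate nearest-neighbour association + Kabsch."""
    tree = cKDTree(ref_map)
    current = scan.copy()
    for _ in range(iterations):
        _, idx = tree.query(current)
        R, t = best_rigid_transform(current, ref_map[idx])
        current = current @ R.T + t
    return current

rng = np.random.default_rng(1)
ref = rng.uniform(0, 10, (500, 3))
scan = ref[:200] + np.array([0.2, -0.1, 0.05])   # shifted copy as a fake scan
aligned = icp_to_reference(scan, ref)
print(np.abs(aligned - ref[:200]).mean())        # should be near zero
```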
Citations: 0
Generation of precise 3D building models for digital twin projects using multi-source data fusion and integration into virtual tours
Pub Date : 2025-11-19 DOI: 10.1016/j.ophoto.2025.100108
Umut Gunes Sefercik, Ilyas Aydin, Mertcan Nazar
High-quality production of building digital twins (DTs) remains a challenging task. In this study, a methodology is proposed to obtain a precise, georeferenced 3D building model with high geometric and spectral quality, one of the essential components of high-quality DT production, through the fusion of UAV and terrestrial photogrammetric data. To better evaluate the performance of the proposed methodology, a complex building with glass facades, entrance porches, outdoor stairs, and architectural coverings was chosen. The techniques used to overcome the challenges of multi-source image orientation, spectral enhancement, and precise building model production are presented. In the proposed methodology, distinct from existing studies in the literature, photos obtained from different sources were not merged into an image pool before photogrammetric processing; instead, geometric and spectral calibrations of the aerial and terrestrial photos were completed separately before data fusion. In this manner, individual dense point clouds were generated using structure from motion (SfM) and denoised by filtering in Bentley ContextCapture software. Precise 3D building model production involved first merging the geo-referenced point clouds, followed by 3D model generation from the fused cloud. The building model achieved a geometric accuracy (RMSE) of ≤ ±2 cm through the fusion of UAV and terrestrial photogrammetric dense point clouds with accuracies of ±1.87 cm and ±1.17 cm, respectively. In addition, an indoor model was generated by capturing 360° panoramic photos of the building, and a complete virtual tour was created by merging the indoor and outdoor data in the Unity game engine platform.
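Once both clouds share a coordinate reference system, the fusion step itself is a concatenation, after which an accuracy figure can be computed against checkpoints. A crude sketch with synthetic data; the nearest-neighbour height comparison is a stand-in assumption, not the paper's accuracy-assessment procedure:

```python
import numpy as np
from scipy.spatial import cKDTree

def fuse_point_clouds(uav_pts, terrestrial_pts):
    """Concatenate two geo-referenced clouds (N x 3 arrays in the same CRS).
    Assumes both clouds are already individually georeferenced and filtered,
    as in the two-step workflow described above."""
    return np.vstack([uav_pts, terrestrial_pts])

def rmse_against_checkpoints(model_pts, checkpoints):
    """RMSE of the vertical offset between checkpoints and their nearest
    model point in XY (a crude stand-in for a proper surface comparison)."""
    _, idx = cKDTree(model_pts[:, :2]).query(checkpoints[:, :2])
    dz = model_pts[idx, 2] - checkpoints[:, 2]
    return float(np.sqrt(np.mean(dz ** 2)))

rng = np.random.default_rng(2)
uav = rng.uniform(0, 50, (1000, 3))
ground = rng.uniform(0, 50, (500, 3))
fused = fuse_point_clouds(uav, ground)
checks = fused[::100] + rng.normal(0, 0.01, (len(fused[::100]), 3))
print(rmse_against_checkpoints(fused, checks))   # ~0.01 (1 cm) by construction
```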
Citations: 0
RoFlex: Robust and flexible filtering of non-semantic landmarks for automotive applications
Pub Date : 2025-11-13 DOI: 10.1016/j.ophoto.2025.100107
Tobias Fichtmueller, Alexander Witt, Christoph Holst
Visual Simultaneous Localization and Mapping (VSLAM) provides a reliable option for the precise vehicle localization required for planning and executing autonomous driving maneuvers, especially in areas where traditional GNSS-based systems fail. Our current objective is therefore to transmit the generated 3D points (non-semantic landmarks) to the data backend and store them in a map layer for later use as localization support. However, the limited bandwidth between the vehicle and the data backend requires filtering the landmarks before transmission.
This paper introduces RoFlex, a robust and flexible approach for filtering non-semantic landmarks within the calculation front-end of a VSLAM system. Given the bandwidth restrictions in vehicle-to-data-backend communication, RoFlex selects landmarks beneficial to long-term localization based on their stability, accuracy, and recognizability. In contrast to existing approaches that rely on training data, RoFlex computes an individual score for each landmark using seven distinct attributes to assess its suitability as localization support. The methodology was qualitatively evaluated on several datasets, identifying stable, accurate, and recognizable landmarks across different environments and conditions. In addition, we conducted a quantitative evaluation based on three experiments (recognizability, stability, and localization accuracy), demonstrating that RoFlex retains around 90% recognizability and preserves localization performance even when only 50% of the landmarks are used. The work therefore represents an effective contribution to long-term localization in the automotive domain. Moreover, the modular design of RoFlex serves as a foundation for further research on filtering non-semantic landmarks.
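A per-landmark score built from observable attributes can be sketched as a weighted sum over normalised terms, followed by keeping the top-scoring fraction that fits the bandwidth budget. The three attributes and the weights below are illustrative assumptions; RoFlex uses seven distinct attributes and its own parameterisation:

```python
from dataclasses import dataclass

@dataclass
class Landmark:
    """Illustrative proxies for landmark quality; not the paper's attribute set."""
    reprojection_error: float   # px, lower is better (accuracy proxy)
    track_length: int           # frames observed (stability proxy)
    response: float             # detector response in [0, 1] (recognizability proxy)

def score(lm, w_acc=0.4, w_stab=0.4, w_rec=0.2):
    """Weighted per-landmark score in [0, 1]; the weights are assumptions."""
    acc = 1.0 / (1.0 + lm.reprojection_error)    # map error to (0, 1]
    stab = min(lm.track_length / 30.0, 1.0)      # saturate at 30 frames
    rec = min(lm.response, 1.0)
    return w_acc * acc + w_stab * stab + w_rec * rec

def filter_landmarks(landmarks, keep_fraction=0.5):
    """Keep the best-scoring fraction, e.g. to fit a transmission budget."""
    ranked = sorted(landmarks, key=score, reverse=True)
    return ranked[: max(1, int(len(ranked) * keep_fraction))]

lms = [Landmark(0.5, 40, 0.9), Landmark(2.0, 5, 0.3), Landmark(1.0, 25, 0.7)]
print([round(score(l), 2) for l in filter_landmarks(lms)])
```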
Citations: 0
Towards monitoring livestock using satellite imagery: Transferability of object detection and segmentation models in Kenyan rangelands
Pub Date : 2025-10-31 DOI: 10.1016/j.ophoto.2025.100106
Ian A. Ocholla , Janne Heiskanen , Faith Karanja , Mark Boitt , Petri Pellikka
Over the past four decades, rising demand for livestock products in Africa has led to increased stocking rates, resulting in overgrazing and land degradation. As the population is projected to rise, the need for sustainable livestock management is more urgent than ever, yet efforts are hindered by the lack of accurate, up-to-date livestock counts. Recent advances in remote sensing and deep learning have made it possible to count livestock from space. However, the extent to which models trained on aerial imagery can enhance livestock detection in satellite images and across diverse landscapes remains unclear. This study assessed the transferability of YOLO, Faster R-CNN, U-Net, and ResNet models for livestock detection across three contrasting landscapes, Choke bushland (Pleiades Neo), Kapiti savanna (WorldView-3), and LUMO open grassland (WorldView-3), using satellite imagery with 0.3 m and 0.4 m spatial resolution. Additionally, we applied multi-stage transfer learning to evaluate the effectiveness of models trained on aerial imagery (0.1 m) in improving livestock detection in satellite imagery. Results indicate that YOLOv5 consistently outperformed the other models, achieving F1 scores of 0.55, 0.67, and 0.85 in Choke, Kapiti, and LUMO, respectively, demonstrating robustness across varying land cover types and sensors. Although the segmentation models performed moderately on 0.3 m imagery (F1 scores of 0.51 and 0.40 for Choke and LUMO), their performance dropped significantly on the coarser-resolution (0.4 m) Kapiti imagery (F1 score of 0.14). In addition, multi-stage transfer learning improved segmentation model recall by 9.8 % at the heterogeneous bushland site. Our results highlight that the integration of multi-source imagery and deep learning can support large-scale livestock monitoring, which is crucial for implementing sustainable rangeland management.
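The F1 scores reported above follow the usual detection bookkeeping: match predictions to ground truth at an IoU threshold, then compute precision and recall. A minimal sketch with greedy one-to-one matching; the boxes and threshold are illustrative:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def detection_f1(preds, gts, iou_thr=0.5):
    """Greedily match each prediction to an unused ground-truth box at the
    IoU threshold, then derive precision, recall, and F1."""
    matched = set()
    tp = 0
    for p in preds:
        best, best_iou = None, iou_thr
        for i, g in enumerate(gts):
            if i in matched:
                continue
            v = iou(p, g)
            if v >= best_iou:
                best, best_iou = i, v
        if best is not None:
            matched.add(best)
            tp += 1
    fp, fn = len(preds) - tp, len(gts) - tp
    precision = tp / (tp + fp + 1e-9)
    recall = tp / (tp + fn + 1e-9)
    return 2 * precision * recall / (precision + recall + 1e-9)

gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(1, 1, 11, 11), (50, 50, 60, 60)]
print(round(detection_f1(preds, gts), 2))  # 0.5: one TP, one FP, one FN
```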
Citations: 0