ISPRS Open Journal of Photogrammetry and Remote Sensing: latest articles (page 3)

Drone imaging-based wall-to-wall processing pipelines for individual tree level inventory in boreal forest plots

ISPRS Open Journal of Photogrammetry and Remote Sensing

Pub Date : 2025-08-01 Epub Date: 2025-08-14 DOI: 10.1016/j.ophoto.2025.100099

Olli Nevalainen , Niko Koivumäki , Raquel Alves de Oliveira , Teemu Hakala , Roope Näsi , Xinlian Liang , Yunsheng Wang , Juha Hyyppä , Eija Honkavaara

Precise individual tree data are essential for forest management, strategic planning, efficient commercial forestry, and accurate carbon stock assessments. In this study, a wall-to-wall drone-imaging-based forest inventory processing pipeline was developed and assessed. Different cameras and data analysis methods were compared for individual tree detection and attribute estimation at the tree and plot levels. The experiment was conducted in Finland in six boreal forest study areas, with three major tree species: Scots pine (Pinus sylvestris), Norway spruce (Picea abies), and birch (Betula pendula and Betula pubescens). RGB and multispectral (MS) cameras provided single-sensor solutions for the forest inventory pipeline, whereas a hyperspectral (HS) camera was used in combination with the RGB camera to enhance species classification. High-quality RGB data performed better than MS data for tree detection and attribute estimation. The best tree detection rates were 56–84 % in areas with mostly dominant and co-dominant trees. The two evaluated tree detection methods (local maximum and segmentation) provided similar tree detection rates and tree attribute estimation accuracies. Tree-level attributes were estimated with root mean square errors (RMSEs) of 0.97 m (5.1 %) for tree height, 3.1 cm (14 %) for diameter at breast height (DBH), 129.6 cm² (25 %) for basal area, and 0.13 m³ (23 %) for volume. The HS camera yielded the highest tree species classification performance, with maximum F-scores of 0.81 for RGB, 0.88 for MS, and 0.89 for combined HS + RGB data. At the plot level, RMSEs for stem density, basal area, and volume were 855.7 ha⁻¹ (74.6 %), 6.9 m² ha⁻¹ (24.2 %), and 48.6 m³ ha⁻¹ (17.6 %), respectively. This study was the first to assess entire inventory pipelines with a comprehensive camera setup and showed that low-cost RGB and MS cameras provide acceptable performance for tree inventories in boreal forests. These results can guide the implementation of low-cost forest inventory processes.
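The accuracies in this abstract are given as an RMSE followed by a relative RMSE in parentheses. The sketch below shows how such a pair is commonly computed. It assumes the relative value is the RMSE divided by the mean of the reference measurements, which the abstract does not state. The function names and sample heights are hypothetical and are not taken from the study.

```python
import math

def rmse(estimated, reference):
    """Root mean square error between paired estimates and reference values."""
    if len(estimated) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))

def relative_rmse(estimated, reference):
    """RMSE as a percentage of the mean reference value (assumed normalisation)."""
    mean_ref = sum(reference) / len(reference)
    return 100.0 * rmse(estimated, reference) / mean_ref

# Hypothetical field-measured and drone-estimated tree heights in metres.
field_heights = [18.0, 20.0, 22.0, 20.0]
drone_heights = [19.0, 19.0, 23.0, 19.0]

print(rmse(drone_heights, field_heights))           # absolute error in metres
print(relative_rmse(drone_heights, field_heights))  # error in percent
```

Under that assumed normalisation, the reported height RMSE of 0.97 m (5.1 %) would correspond to a mean reference tree height of roughly 19 m.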

Volume 17, Article 100099

Citations: 0

Permanent terrestrial laser scanning for near-continuous environmental observations: Systems, methods, challenges and applications

Pub Date : 2025-08-01 Epub Date: 2025-07-11 DOI: 10.1016/j.ophoto.2025.100094

Roderik Lindenbergh , Katharina Anders , Mariana Campos , Daniel Czerwonka-Schröder , Bernhard Höfle , Mieke Kuschnerus , Eetu Puttonen , Rainer Prinz , Martin Rutzinger , Annelies Voordendag , Sander Vos

Many topographic scenes exhibit complex dynamic behavior that is difficult to map, quantify, predict and understand. A terrestrial laser scanner fixed in a permanent position can be used to monitor such scenes in an automated way with centimeter to decimeter quality at ranges of up to several kilometers. Laser scanners are active sensors and can therefore continue operating at night. Their independence from texture conditions ensures that, in principle, they provide stable range measurements for varying surface conditions. Recent years have seen a strong increase in the use of such systems for scientific applications in the geosciences and the environmental and ecological sciences, including forestry, glaciology, and geomorphology. At the same time, this use has resulted in a new type of 4D topographic data set (3D point clouds + time) with a significant temporal dimension, as systems are now able to acquire thousands of consecutive epochs. Extracting information from these 4D data sets turns out to be challenging, first, because of insufficient knowledge of error budgets and correlations, and, second, because of a lack of algorithms, benchmarks, and best-practice workflows. This paper provides an overview of different 4D systems for near-continuous laser scanning and discusses systematic challenges, including instability of the sensor system, meteorological and atmospheric influences, and data alignment, before discussing recently developed methods and scientific software for extracting and parameterizing changes from 4D topographic data sets in connection with the different applications.
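As a rough illustration of what extracting change from a 4D data set can involve, the sketch below compares each epoch of one location's range time series with the first epoch and flags changes that exceed a fixed threshold. The threshold, the function name, and the sample values are assumptions for illustration. The methods surveyed in the paper are not described in the abstract and are likely more elaborate.

```python
def detect_changes(series, threshold):
    """Return (epoch_index, change) pairs whose absolute change from
    the first epoch exceeds the threshold (same unit as the series)."""
    if not series:
        return []
    reference = series[0]
    flagged = []
    for epoch, value in enumerate(series):
        change = value - reference
        if abs(change) > threshold:
            flagged.append((epoch, change))
    return flagged

# Hypothetical surface distances (metres) at one location, one value per epoch.
distances = [100.0, 100.25, 100.0, 99.5, 99.25]
print(detect_changes(distances, threshold=0.3))
```

In practice the threshold would have to reflect the error budget the abstract mentions (sensor instability, atmospheric effects, alignment), rather than a constant chosen by hand as it is here.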

Volume 17, Article 100094

Citations: 0

A structured review and taxonomy of next-best-view strategies for 3D reconstruction

Pub Date : 2025-08-01 Epub Date: 2025-08-14 DOI: 10.1016/j.ophoto.2025.100098

Bashar Alsadik , Hussein Alwan Mahdi , Nagham Amer Abdulateef

Next-Best-View (NBV) strategies are a class of approaches that address the problem of selecting the best possible viewpoints for an autonomous robot's sensor to achieve effective and complete 3D scene reconstruction. NBV methodologies have developed significantly over the years, from rule-based approaches to those driven by deep learning. Consequently, NBV strategies have become diverse and uncategorized, which makes it difficult for researchers and practitioners to navigate or standardize the methods. Therefore, in this paper, a comprehensive review was conducted to separate NBV methods into five distinct strategies: rule-based, uncertainty-based, sampling-based, learning-based, and prediction-based approaches. The review aims to give a structured understanding based on a systematic survey of over 100 publications, outlining key methodologies, open-access tools, and respective applications. Each strategy is investigated with related research questions, such as understanding the role of geometric heuristics in rule-based methods, identifying efficient sampling mechanisms for exploration, leveraging predictive models for optimization, addressing uncertainty in unknown environments, and applying learning-based techniques to enhance adaptability and performance. Some suggestions are made for making classifications explicit, thus helping to build more organized frameworks and collaborations across disciplines. This work not only offers a comprehensive resource for beginners and expert researchers but also empowers readers to answer strategy-specific research questions, providing actionable insights into NBV planning trends and emerging perspectives.
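To make the viewpoint-selection problem concrete, the sketch below implements one very simple selection rule: from a fixed set of candidate views, repeatedly pick the one that reveals the most surface cells not yet observed. This is a generic greedy coverage heuristic chosen for illustration, not a method taken from the review. The view names, cell ids, and function names are hypothetical.

```python
def next_best_view(candidates, seen):
    """Pick the candidate view that reveals the most not-yet-seen cells.

    candidates: dict mapping a view name to the set of cell ids it observes.
    seen: set of cell ids already covered by earlier views.
    Returns (view_name, newly_seen_cells), or (None, set()) if nothing is gained.
    """
    best_view, best_gain = None, set()
    for view, visible in candidates.items():
        gain = visible - seen
        if len(gain) > len(best_gain):
            best_view, best_gain = view, gain
    return best_view, best_gain

def plan_views(candidates, max_views):
    """Greedily select views until coverage stops improving."""
    seen, plan = set(), []
    for _ in range(max_views):
        view, gain = next_best_view(candidates, seen)
        if view is None:
            break
        plan.append(view)
        seen |= gain
    return plan, seen

# Hypothetical candidate viewpoints and the surface cells each would observe.
views = {
    "front": {1, 2, 3, 4},
    "left": {3, 4, 5},
    "top": {5, 6, 7},
}
print(plan_views(views, max_views=3))
```

Here the visibility sets are given up front. In an unknown environment they have to be estimated, and how that estimate is produced is broadly where the five strategy families named above differ.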

Volume 17, Article 100098

Citations: 0

Evaluating the role of training data origin for country-scale cropland mapping in data-scarce regions: A case study of Nigeria

Pub Date : 2025-08-01 Epub Date: 2025-07-09 DOI: 10.1016/j.ophoto.2025.100091

Joaquin Gajardo , Michele Volpi , Daniel Onwude , Thijs Defraeye

Cropland maps are essential for remote sensing-based agricultural monitoring, providing timely insights about agricultural development without requiring extensive field surveys. While machine learning enables large-scale mapping, it relies on geo-referenced ground-truth data, which is time-consuming to collect, motivating efforts to integrate global datasets for mapping in data-scarce regions. A key challenge is understanding how the quantity, quality, and proximity of the training data to the target region influence model performance in regions with limited local ground truth. To address this, we evaluate the impact of combining global and local datasets for cropland mapping in Nigeria at 10 m resolution. We manually labelled 1,827 data points evenly distributed across Nigeria and leveraged the crowd-sourced Geowiki dataset, evaluating three subsets of it: Nigeria, Nigeria + neighbouring countries, and worldwide. Using Google Earth Engine (GEE), we extracted multi-source time series data from Sentinel-1, Sentinel-2, ERA5 climate, and a digital elevation model (DEM) and compared Random Forest (RF) classifiers with Long Short-Term Memory (LSTM) networks, including a lightweight multi-task learning variant (multi-headed LSTM) previously applied to cropland mapping in other regions. Our findings highlight the importance of local training data, which consistently improved performance, with accuracy gains of up to 0.246 (RF) and 0.178 (LSTM). Models trained on Nigeria-only or regional datasets outperformed those trained on global data, except for the multi-headed LSTM, which uniquely benefited from global samples when local data was unavailable. A sensitivity analysis revealed that Sentinel-1, climate, and topographic data were particularly important, as their removal reduced accuracy by up to 0.154 and F1-score by 0.593. Handling class imbalance was also critical, with weighted loss functions improving accuracy by up to 0.071 for the single-headed LSTM. Our best-performing model, a single-headed LSTM trained on Nigeria-only data, achieved an F1-score of 0.814 and accuracy of 0.842, performing competitively with the best global land cover product and showing strong recall performance, a metric highly relevant for food security applications. These results underscore the value of regionally focused training data, proper class imbalance handling, and multi-modal feature integration for improving cropland mapping in data-scarce regions. We release our data, source code, output maps, and an interactive GEE web application to facilitate further research.
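The abstract credits weighted loss functions with part of the accuracy gain under class imbalance. A minimal sketch of one common form follows: binary cross-entropy with inverse-frequency class weights. Both the weighting scheme and the function names are assumptions, since the abstract does not say which weighting the authors used, and the labels and probabilities are made up.

```python
import math

def class_weights(labels):
    """Inverse-frequency weights so both classes contribute equally overall."""
    n = len(labels)
    positives = sum(labels)
    negatives = n - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    return {1: n / (2 * positives), 0: n / (2 * negatives)}

def weighted_bce(labels, probabilities, weights, eps=1e-12):
    """Mean binary cross-entropy with a per-class weight on each sample."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * loss
    return total / len(labels)

# Hypothetical labels (1 = cropland, 0 = other) with cropland as the minority.
labels = [1, 0, 0, 0]
predicted = [0.5, 0.5, 0.5, 0.5]
weights = class_weights(labels)
print(weights)
print(weighted_bce(labels, predicted, weights))
```

With these weights the single cropland sample carries as much total weight as the three non-cropland samples together, so a model cannot lower its loss simply by predicting the majority class.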

Volume 17, Article 100091

Citations: 0

Efficient tree mapping through deep distance transform (DDT) learning

Pub Date : 2025-08-01 Epub Date: 2025-06-28 DOI: 10.1016/j.ophoto.2025.100095

Jan Schindler , Ziyi Sun , Bing Xue , Mengjie Zhang

Trees provide essential ecosystem services in urban areas, rural landscapes and forests. Individual tree information can inform forest and risk modelling, health studies and decision-making in public and non-governmental sectors. The increase in available remote sensing data and advances in automated object detection make it feasible to map trees over large areas in unprecedented detail. Deep learning-based instance segmentation methods have thereby become the state of the art in tree crown delineation tasks from aerial ortho-photography. Many of these methods are based on one- and two-stage detector frameworks such as Mask-RCNN and YOLO, which were developed with a focus on speed and accuracy against common benchmark datasets. Another class of object detectors is based on encoder-decoder networks such as UNet, which offer easy integration into existing workflows and high accuracy even in complex forest scenes in regional and national tree studies. While previous methods had to combine multi-model and multi-task outputs to create decision surfaces, we developed an efficient UNet-based modelling approach which focusses solely on learning the distance transforms of tree objects as the cost surface for watershed segmentation. Our algorithm achieves superior instance segmentation across native forest, rural and urban environments in Aotearoa New Zealand, with an overall F1 score of 0.53 (0.18 for small, 0.45 for medium and 0.67 for large crowns), surpassing previous approaches while decreasing modelling complexity and enabling fast, large-scale tree mapping.
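The distance transform of a crown mask is largest near the crown centre and falls to zero at the crown edge, which is what makes it usable as a surface for watershed segmentation. The sketch below computes such a map for a small binary mask, as an illustration of the kind of target a network could be trained to predict. It assumes a city-block metric and at least one background pixel in the mask. The abstract does not specify the metric, and the function name and toy mask are hypothetical.

```python
from collections import deque

def distance_transform(mask):
    """City-block distance from every foreground pixel (1) to the nearest
    background pixel (0); background pixels get distance 0."""
    rows, cols = len(mask), len(mask[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0:
                dist[r][c] = 0
                queue.append((r, c))
    # Multi-source breadth-first search outward from all background pixels.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

# Toy mask with one 3 x 3 "crown" surrounded by background.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in distance_transform(mask):
    print(row)
```

The single peak at the centre of the toy crown would act as the seed for that crown in a subsequent watershed step. Touching crowns produce separate peaks, which is how such a surface can split them into instances.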

Volume 17, Article 100095

Citations: 0

Virtual replication of sediment cores for geoarchaeological research in Uruk-Warka (Iraq)

Pub Date : 2025-08-01 Epub Date: 2025-06-25 DOI: 10.1016/j.ophoto.2025.100093

Max Haibt , Felix Reize , Helmut Brückner , Jörg W.E. Fassbinder , Margarete van Ess

This study presents a novel methodology for the production of high-detail, georeferenced virtual replicas of sediment cores extracted using vibracoring, a widely used technique for subsurface investigations in geoscientific research. In a case study conducted around the ancient city of Uruk in southern Iraq, 150 meters of sediment cores from 25 locations were documented. A specialized photogrammetric technique was developed to rapidly capture the visual characteristics of the stratified sediments before sampling and reuse. Cross-polarization was applied to normalize the resulting textures for enhanced sedimentological analysis. An automated processing pipeline generated georeferenced 3D models with high-detail textures, which were integrated into the UAV-based landscape model of the Uruk-VR digital twin. This comprehensive integration of surface and subsurface data offers a foundation for three-dimensional spatial analysis of stratigraphy, facilitating the reconstruction of ancient canal systems and landscape evolution of one of the oldest cities of humankind.

Volume 17, Article 100093

Citations: 0

An end-to-end deep learning solution for automated LiDAR tree detection in the urban environment

ISPRS Open Journal of Photogrammetry and Remote Sensing

Pub Date : 2025-08-01 Epub Date: 2025-06-07 DOI: 10.1016/j.ophoto.2025.100092

Julian R. Rice , G. Andrew Fricker , Jonathan Ventura

Cataloging and classifying trees in the urban environment is a crucial step in urban and environmental planning; however, manual collection and maintenance of this data is expensive and time-consuming. Although algorithmic approaches that rely on remote sensing data have been developed for tree detection in forests, they generally struggle in the more varied urban environment. This work proposes a novel end-to-end deep learning method for the detection of trees in the urban environment from remote sensing data. Specifically, we develop and train a novel PointNet-based neural network architecture to predict tree locations directly from LiDAR data augmented with multi-spectral imagery. We compare this model to a number of high-performing baselines on a large and varied dataset in the Southern California region, and find that our method outperforms all baselines in terms of tree detection ability (75.5% F-score) and positional accuracy (2.28 meter root mean squared error), while being highly efficient. We then analyze and compare the sources of errors, and how these reveal the strengths and weaknesses of each approach. Our results highlight the importance of fusing spectral and structural information for remote sensing tasks in complex urban environments.
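
The two headline metrics in this abstract, detection F-score and positional RMSE, can be illustrated with a minimal sketch. The abstract does not say how predictions are paired with reference trees, so the greedy nearest-neighbour matching, the `max_dist` threshold of 5 m, and all function names below are assumptions made for illustration only.

```python
import math


def match_trees(predicted, reference, max_dist=5.0):
    """Greedily pair predicted and reference tree positions, closest pairs first.

    max_dist (metres) is a hypothetical matching threshold, not a value from the paper.
    """
    candidates = sorted(
        (math.dist(p, r), i, j)
        for i, p in enumerate(predicted)
        for j, r in enumerate(reference)
    )
    used_pred, used_ref, matches = set(), set(), []
    for dist, i, j in candidates:
        if dist > max_dist:
            break  # every remaining candidate pair is even farther apart
        if i in used_pred or j in used_ref:
            continue  # each tree may take part in at most one match
        used_pred.add(i)
        used_ref.add(j)
        matches.append((i, j, dist))
    return matches


def detection_scores(predicted, reference, max_dist=5.0):
    """Return precision, recall, F-score and positional RMSE of the matched pairs."""
    matches = match_trees(predicted, reference, max_dist)
    tp = len(matches)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(reference) if reference else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    rmse = math.sqrt(sum(d * d for _, _, d in matches) / tp) if tp else float("nan")
    return {
        "true_positives": tp,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "rmse": rmse,
    }


# Toy coordinates in metres (made up for illustration).
predicted = [(0.0, 0.0), (10.0, 0.0), (50.0, 50.0)]
reference = [(0.0, 3.0), (10.0, 4.0), (30.0, 30.0), (20.0, 20.0)]
print(detection_scores(predicted, reference))
```

Note that the RMSE here is computed over matched pairs only, so a looser `max_dist` raises the F-score while also admitting larger positional errors; the two numbers are best read together.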

Citations: 0

Asynchronous Lidar: Proof-of-concept simulation and demonstration tests

ISPRS Open Journal of Photogrammetry and Remote Sensing

Pub Date : 2025-08-01 Epub Date: 2025-08-05 DOI: 10.1016/j.ophoto.2025.100096

Craig L. Glennie , Luyen K. Bui , Francisco Haces-Garcia , Derek D. Lichti

This study proposes an asynchronous airborne lidar design in which the laser transmitter and the detectors/receivers are disconnected and carried on separate platforms. The design has an advantage over conventional synchronous lidar systems operating in monostatic mode because redundant lidar observations can be captured. First, proof-of-concept experiments are conducted based on Monte Carlo simulations in which a transmitter is combined with different numbers of receivers. In this way, different receiver configurations, i.e., the locations of the transmitter and receivers relative to each other, are tested with both single-beam (nadir and slant range) and multi-beam transmitters. Networks in which the transmitter, receivers, and ground point lie in a single plane yield very high dilution of precision and therefore high ground point uncertainties; these weak configurations should be avoided. A laboratory demonstration of an asynchronous lidar system is also presented, and its results validate the observations from the simulation studies. Networks with three or four receivers appear to offer a reasonable balance between the number of receivers used and the ground point uncertainties. Ground point uncertainties also depend on the transmitter and receiver flight altitudes. Multi-beam simulations of four-receiver networks with varying transmitter/receiver flight heights show that the horizontal uncertainties depend almost entirely on the transmitter flight altitude, whereas both flight altitudes affect the vertical uncertainty, with the receiver flight altitude having the greater influence. The best configuration, with the lowest uncertainties, is obtained by maximizing the ratio of transmitter height to receiver height.
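
The remark about coplanar networks can be illustrated with a small geometric sketch. It assumes, purely for illustration, that each receiver observes the total path length from transmitter to ground point to receiver, so that each row of the design matrix is the sum of two unit vectors; the abstract does not state the observation model, and the function names and coordinates below are hypothetical.

```python
import math


def unit(from_pt, to_pt):
    """Unit vector pointing from from_pt towards to_pt."""
    diff = [t - f for f, t in zip(from_pt, to_pt)]
    norm = math.sqrt(sum(c * c for c in diff))
    return [c / norm for c in diff]


def pdop(transmitter, receivers, ground):
    """Position dilution of precision of a ground point.

    Assumed observation per receiver: |ground - transmitter| + |ground - receiver|.
    Its gradient with respect to the ground point is the sum of two unit vectors.
    """
    u_tx = unit(transmitter, ground)
    rows = [[a + b for a, b in zip(u_tx, unit(rx, ground))] for rx in receivers]
    # Normal matrix N = A^T A (3 x 3, symmetric).
    n = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    det = (
        n[0][0] * (n[1][1] * n[2][2] - n[1][2] * n[2][1])
        - n[0][1] * (n[1][0] * n[2][2] - n[1][2] * n[2][0])
        + n[0][2] * (n[1][0] * n[2][1] - n[1][1] * n[2][0])
    )
    if abs(det) < 1e-12:
        return math.inf  # singular geometry: the point cannot be resolved
    # trace(N^-1) equals the sum of the principal 2 x 2 minors divided by det(N).
    minors = (
        (n[1][1] * n[2][2] - n[1][2] * n[2][1])
        + (n[0][0] * n[2][2] - n[0][2] * n[2][0])
        + (n[0][0] * n[1][1] - n[0][1] * n[1][0])
    )
    return math.sqrt(minors / det)


ground = (0.0, 0.0, 0.0)
transmitter = (0.0, 0.0, 1000.0)

# Receivers spread around the ground point (hypothetical coordinates, metres).
spread = [(500.0, 0.0, 800.0), (-250.0, 433.0, 800.0), (-250.0, -433.0, 800.0)]
# Receivers in the same vertical plane (y = 0) as the transmitter and ground point.
coplanar = [(500.0, 0.0, 800.0), (-500.0, 0.0, 800.0), (250.0, 0.0, 600.0)]

print("spread PDOP:  ", pdop(transmitter, spread, ground))
print("coplanar PDOP:", pdop(transmitter, coplanar, ground))
```

In the coplanar case every design-matrix row has a zero component perpendicular to the plane, so the normal matrix is singular and the sketch reports an infinite PDOP. Under this simplified model, that is the limiting case of the "very high dilution of precision" described above.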

Citations: 0

A method for extracting water surface and hydrophytic vegetation from ICESat-2 data in wetlands

ISPRS Open Journal of Photogrammetry and Remote Sensing

Pub Date : 2025-08-01 Epub Date: 2025-08-13 DOI: 10.1016/j.ophoto.2025.100097

Rong Zhao , Shijuan Gao , Kun Zhang , Defang Li , Yi Li

The Ice, Cloud, and Land Elevation Satellite-2 (ICESat-2) provides a great opportunity to measure water surfaces and hydrophytic vegetation in complex wetlands. Obtaining reliable signal photons from ICESat-2 data in wetlands is challenging because there are many types of noise photons, such as specular return photons, after-pulse photons, and noise photons caused by sunlight. In addition, the large difference in photon density between water and hydrophytic vegetation makes it difficult to identify hydrophytic vegetation photons accurately. This research therefore proposes a method to obtain high-accuracy signal photons and to classify water body photons and hydrophytic vegetation photons in complex wetlands. First, we introduced the modified elevation histogram statistics vector-based (MEHSV) method to filter out noise photons caused by sunlight. Because the MEHSV method was developed to retain sparse canopy photons, it can also retain sparse hydrophytic vegetation photons. Second, peak analysis of the elevation histogram statistics removed the specular return photons and after-pulse photons caused by the water surface. Finally, manually labeled photons and reference water surface level data were used to assess the proposed method. The filtering results showed that the F value of the proposed method reached 0.99. Compared with other reference methods, the proposed method both kept hydrophytic vegetation photons from being misrecognized and removed all types of noise photons effectively. The water photons and hydrophytic vegetation photons were distinguished accurately. Additionally, the accuracy of the water surface level (R² = 0.97 and RMSE = 0.84 m) demonstrated the good performance of the proposed method.
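
The general idea of an elevation-histogram peak can be sketched as follows. This is not the MEHSV method or the paper's peak analysis, whose details the abstract does not give; it is a deliberately simple stand-in that treats the densest elevation bin as the water surface. The bin size, the half-width of the surface band, and the function name are all assumed values chosen for illustration.

```python
import math
import statistics
from collections import Counter


def split_by_surface_peak(elevations, bin_size=0.5, half_width=0.5):
    """Split photon elevations around the densest histogram bin.

    bin_size and half_width (metres) are hypothetical tuning values.
    Returns (surface_level, surface, above, below).
    """
    # Build the elevation histogram by assigning each photon to a bin index.
    counts = Counter(math.floor(z / bin_size) for z in elevations)
    peak_bin = max(counts, key=counts.get)
    # Use the median of the photons in the peak bin as the surface level.
    in_peak = [z for z in elevations if math.floor(z / bin_size) == peak_bin]
    level = statistics.median(in_peak)
    surface = [z for z in elevations if abs(z - level) <= half_width]
    above = [z for z in elevations if z - level > half_width]   # e.g. vegetation candidates
    below = [z for z in elevations if level - z > half_width]   # e.g. sub-surface noise candidates
    return level, surface, above, below


# Toy photon elevations in metres (made up for illustration).
photons = [10.0, 10.1, 9.9, 10.05, 9.95, 12.3, 13.1, 7.7, 25.0]
level, surface, above, below = split_by_surface_peak(photons)
print("surface level:", level)
print("surface photons:", surface)
print("above surface:", above)
print("below surface:", below)
```

A single global peak only makes sense for a short along-track segment over flat water. A practical workflow would apply such a step per segment and would need separate handling for the sparse vegetation photons that the abstract highlights.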

Citations: 0

From gaps to granularity: CRPAG-DSHAT based multi-modal deep learning framework for DEM void repair and super-resolution reconstruction in Himalayas

ISPRS Open Journal of Photogrammetry and Remote Sensing

Pub Date : 2025-08-01 Epub Date: 2025-09-02 DOI: 10.1016/j.ophoto.2025.100101

Sayantan Mandal, Ashis Kumar Saha

Digital Elevation Models (DEMs) are essential for terrain characterization and environmental modeling, yet their utility is limited by data voids and coarse resolution, especially in complex mountainous regions of the Himalayas. To address these challenges, we propose a novel dual-stage deep learning pipeline that unifies void filling and super-resolution into a cohesive framework, leveraging both topographic fidelity and spectral texture. First, the Conditional Residual Pyramid Attentional Generator (CRPAG) is a hybrid model that integrates multi-scale DEM features with Sentinel-2 red band reflectance (∼665 nm) using an Improved Channel Attention Module (ICAM), a Residual Pyramid Attention Block (TFG_RPAB), and a dual-encoder design. This allows CRPAG to prioritize structural fidelity (RMSE 9.1–28.9 m) while reconstructing missing terrain features (mean absolute error, MAE, 1.9–8.1 m). The void-filled, high-resolution DEM then supervises the training of the Dual-Stream Hierarchical Attention Transformer (DS-HAT), which performs super-resolution on globally available low-resolution DEMs (ALOS PALSAR), guided by pixel-wise height attention and texture-aware mechanisms. Compared to benchmark models such as MCU-Net-EDF and conventional U-Nets, our integrated system shows improvements in elevation accuracy (RMSE ↓, P95 = 9.2 m), spatial consistency (Moran's I ↑), and structural similarity (SSIM ↑), particularly across high-curvature and spectrally ambiguous regions. In addition, ablation studies confirm the complementary role of topographic variables in mitigating oversmoothing and enhancing terrain realism. This dual-stage strategy not only enhances DEM fidelity but also provides a scalable framework for improving DEM quality. Through this multi-modal fusion, this work transforms topographic knowledge into a computable framework, advancing DEM applicability in hydrological modeling, detection mechanisms and disaster prediction.
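
The elevation-error metrics quoted above (RMSE, MAE, P95) can be computed over void cells as in the sketch below. It assumes that P95 means the 95th percentile of the absolute elevation error and uses the nearest-rank percentile definition; the abstract states neither, and the function name and sample values are hypothetical.

```python
import math


def void_fill_errors(reference, predicted, void_mask):
    """RMSE, MAE and 95th-percentile absolute error over the void cells only.

    reference, predicted: flat sequences of elevations in metres.
    void_mask: True where the cell was a void that the model had to fill.
    """
    abs_err = sorted(
        abs(p - r) for r, p, is_void in zip(reference, predicted, void_mask) if is_void
    )
    if not abs_err:
        raise ValueError("void_mask selects no cells")
    n = len(abs_err)
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    mae = sum(abs_err) / n
    # Nearest-rank percentile: smallest value with at least 95 % of errors at or below it.
    p95 = abs_err[math.ceil(0.95 * n) - 1]
    return {"rmse": rmse, "mae": mae, "p95": p95}


# Toy elevations in metres; the last cell is not a void and is ignored.
reference = [100.0, 102.0, 98.0, 105.0, 110.0, 120.0]
predicted = [101.0, 100.0, 100.0, 104.0, 114.0, 150.0]
void_mask = [True, True, True, True, True, False]
print(void_fill_errors(reference, predicted, void_mask))
```

Restricting the evaluation to the void mask matters: scoring the whole tile would dilute the error with cells the model never had to reconstruct.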

Citations: 0