Neural radiance fields (NeRF) offer the potential to benefit 3D reconstruction tasks, including aerial photogrammetry. However, the scalability and accuracy of the inferred geometry are not well‐documented for large‐scale aerial assets. We aim to provide a thorough assessment of NeRF in 3D reconstruction from aerial images and compare it with three traditional multi‐view stereo (MVS) pipelines. Typical NeRF approaches, however, are not designed for large‐format aerial images, which lead to very high memory consumption (often cost‐prohibitive) and slow convergence when directly applied to aerial assets. Although a few NeRF variants adopt a representation tiling scheme to increase scalability, the random ray‐sampling strategy used during training still hinders their general applicability to aerial assets. To perform an effective evaluation, we propose a new scheme to scale NeRF. In addition to representation tiling, we introduce a location‐specific sampling technique and a multi‐camera tiling (MCT) strategy that reduce RAM consumption during image loading and GPU memory consumption during representation training, and increase the convergence rate within tiles. The MCT method decomposes a large‐frame image into multiple tiled images with different camera models, allowing these small‐frame images to be fed into the training process as needed for specific locations without loss of accuracy. This enables NeRF approaches to run on aerial datasets using affordable computing devices, such as regular workstations. The proposed adaptation can be applied to scale any existing NeRF method. Therefore, in this paper, instead of comparing accuracy across different NeRF variants, we implement our method on top of a representative approach, Mip‐NeRF, and compare it against three traditional photogrammetric MVS pipelines on a typical aerial dataset, using lidar as reference data, to assess NeRF's performance. Both qualitative and quantitative results suggest that the proposed NeRF approach produces better completeness and object details than traditional approaches, although it still falls short in terms of accuracy. The code and datasets are publicly available at https://github.com/GDAOSU/MCT_NERF.
Ningli Xu, Rongjun Qin, Debao Huang & Fabio Remondino (2024). Multi‐tiling neural radiance field (NeRF)—geometric assessment on large‐scale aerial datasets. The Photogrammetric Record. https://doi.org/10.1111/phor.12498
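The multi‐camera tiling idea (splitting one large‐frame image into tiles that each carry their own camera model) can be illustrated for an ideal pinhole camera: each tile keeps the original focal lengths and only shifts the principal point into the tile's local pixel frame, so every 3D point projects to the same physical pixel. A minimal sketch, with function names of our own choosing rather than from the MCT_NERF code:

```python
def tile_cameras(width, height, fx, fy, cx, cy, tile):
    """Split a large-frame pinhole camera into per-tile camera models.

    Each tile keeps the original focal lengths; only the principal point
    is shifted by the tile's pixel offset, so projections agree exactly
    with the full-frame camera.
    """
    cams = []
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            cams.append({
                "window": (x0, y0, min(tile, width - x0), min(tile, height - y0)),
                "fx": fx, "fy": fy,
                "cx": cx - x0,  # principal point in tile coordinates
                "cy": cy - y0,
            })
    return cams

def project(fx, fy, cx, cy, X, Y, Z):
    """Ideal pinhole projection of a camera-frame 3D point."""
    return fx * X / Z + cx, fy * Y / Z + cy
```

A point that lands at full‐frame pixel (u, v) lands at (u − x0, v − y0) in the tile containing it, which is why tiles can be loaded and trained independently without a loss of accuracy.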
Thaisa Aline Correia Garcia, Antonio Maria Garcia Tommaselli, Letícia Ferrari Castanheiro, Mariana Batista Campos
The problem of sequentially estimating the exterior orientation of imaging sensors while reconstructing the three‐dimensional environment in real time is commonly known as visual simultaneous localisation and mapping (vSLAM). Omnidirectional optical sensors have been increasingly used in vSLAM solutions, mainly because they provide a wider view of the scene and allow the extraction of more features. However, dealing with unmodelled points in the hyperhemispherical field poses challenges, mainly due to the complex lens geometry entailed in the image formation process. Rigorous photogrammetric models that appropriately handle the geometry of fisheye lens cameras can overcome these challenges. Thus, this study presents a real‐time vSLAM approach for omnidirectional systems that adapts ORB‐SLAM with a rigorous projection model (equisolid‐angle). The implementation was conducted on the Nvidia Jetson TX2 board, and the approach was evaluated using hyperhemispherical images captured by a dual‐fisheye camera (Ricoh Theta S) embedded in a mobile backpack platform. The trajectory covered a distance of 140 m, with the approach demonstrating accuracy better than 0.12 m at the beginning and metre‐level accuracy at the end of the trajectory. Additionally, we compared the performance of our proposed approach with a generic model for fisheye lens cameras.
Thaisa Aline Correia Garcia, Antonio Maria Garcia Tommaselli, Letícia Ferrari Castanheiro & Mariana Batista Campos (2024). A photogrammetric approach for real‐time visual SLAM applied to an omnidirectional system. The Photogrammetric Record. https://doi.org/10.1111/phor.12494
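The equisolid‐angle model mentioned above maps a ray's incidence angle θ to the radial image distance r = 2f·sin(θ/2). A minimal sketch of that forward projection (distortion terms that a full calibration would add are ignored, and the function name is ours):

```python
import math

def project_equisolid(x, y, z, f, cx, cy):
    """Equisolid-angle fisheye projection of a camera-frame 3D point.

    The angle theta from the optical axis maps to the radial distance
    r = 2 * f * sin(theta / 2), which preserves solid angles: equal
    solid angles in object space cover equal areas in the image.
    """
    theta = math.atan2(math.hypot(x, y), z)  # incidence angle
    phi = math.atan2(y, x)                   # azimuth around the axis
    r = 2.0 * f * math.sin(theta / 2.0)
    return cx + r * math.cos(phi), cy + r * math.sin(phi)
```

A ray at θ = 90° (the rim of a hemispherical view) lands at radius f·√2; a hyperhemispherical lens records content beyond that radius.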
Benjamin Wild, Geert Verhoeven, Rafał Muszyński, Norbert Pfeifer
Graffiti, by their very nature, are ephemeral, sometimes even vanishing before creators finish them. This transience is part of graffiti's allure yet signifies the continuous loss of this often disputed form of cultural heritage. To counteract this, graffiti documentation efforts have steadily increased over the past decade. One of the primary challenges in any documentation endeavour is identifying and recording new creations. Image‐based change detection can greatly help in this process, effectuating more comprehensive documentation, less biased digital safeguarding and improved understanding of graffiti. This paper introduces a novel and largely automated image‐based graffiti change detection method. The methodology uses an incremental structure‐from‐motion approach and synthetic cameras to generate co‐registered graffiti images from different areas. These synthetic images are fed into a hybrid change detection pipeline combining a new pixel‐based change detection method with a feature‐based one. The approach was tested on a large and publicly available reference dataset captured along the Donaukanal (Eng. Danube Canal), one of Vienna's graffiti hotspots. With a precision of 87% and a recall of 77%, the results reveal that the proposed change detection workflow can indicate newly added graffiti in a monitored graffiti‐scape, thus supporting a more comprehensive graffiti documentation.
Benjamin Wild, Geert Verhoeven, Rafał Muszyński & Norbert Pfeifer (2024). Detecting change in graffiti using a hybrid framework. The Photogrammetric Record. https://doi.org/10.1111/phor.12496
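The evaluation above rests on standard precision and recall over detected changes, and the pixel‐based half of a hybrid pipeline can, in its simplest form, threshold intensity differences between co‐registered images. A toy sketch (the paper's actual pixel‐ and feature‐based methods are more elaborate; the function names are ours):

```python
def change_mask(img_a, img_b, thresh=30):
    """Naive pixel-based change detection: flag a pixel as changed when
    the absolute intensity difference between two co-registered greyscale
    images (lists of rows of 0-255 values) exceeds `thresh`."""
    return [[abs(a - b) > thresh for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]

def precision_recall(detected, reference):
    """Precision and recall of a detected change set against a reference set."""
    tp = len(detected & reference)
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(reference) if reference else 0.0
    return precision, recall
```

In practice the thresholding is what makes pixel‐based detection sensitive to illumination changes, which is why the paper pairs it with a feature‐based method.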
Vision Transformers (ViTs) are exceptional at vision tasks. However, when applied to remote sensing images (RSIs), existing methods often necessitate extensive modifications of ViTs to rival convolutional neural networks (CNNs). This requirement significantly impedes the application of ViTs in geosciences, particularly for researchers who lack the time for comprehensive model redesign. To address this issue, we introduce the concept of quantitative regularization (QR), designed to enhance the performance of ViTs in RSI classification. QR is an effective algorithm that adeptly manages domain discrepancies in RSIs and can be integrated with any ViT in transfer learning. We evaluated the effectiveness of QR using three ViT architectures (vanilla ViT, Swin‐ViT and Next‐ViT) on four datasets: AID30, NWPU45, AFGR50 and UCM21. The results reveal that our Next‐ViT model surpasses 39 other advanced methods published in the past 3 years, maintaining robust performance even with a limited number of training samples. We also found that our ViT and Swin‐ViT achieve significantly higher accuracy and robustness than other methods using the same backbone. Our findings confirm that ViTs can be as effective as CNNs for RSI classification, regardless of dataset size. Our approach exclusively employs open‐source ViTs and easily accessible training strategies. Consequently, we believe that our method can significantly lower the barriers for geoscience researchers intending to use ViTs for RSI applications.
Huaxiang Song, Yuxuan Yuan, Zhiwei Ouyang, Yu Yang & Hui Xiang (2024). Quantitative regularization in robust vision transformer for remote sensing image classification. The Photogrammetric Record. https://doi.org/10.1111/phor.12489
Unmanned aerial vehicle light detection and ranging (UAV‐lidar) and UAV photogrammetry are two widely used surface‐monitoring technologies. Previous studies have used the two technologies interchangeably and ignored their correlation, or have compared them on only a single product. There are, however, few quantitative assessments of the differences between the two techniques in monitoring surface deformation, or predictions of their application prospects. This paper therefore compares the digital elevation models (DEMs) and subsidence basins obtained by the two techniques using Gaussian analysis. The results indicate that the surface DEMs obtained by the two techniques are highly similar: the differences in the z direction between the two DEMs follow a Gaussian distribution with a standard deviation of less than 0.36 m. When comparing the surface subsidence values monitored by the two techniques, UAV‐lidar was found to be more sensitive to small‐scale deformation, with a difference range of 0.23–0.44 m compared with photogrammetry. These conclusions provide valuable guidance for the utilisation of multisource monitoring data.
Xilin Zhan, Xingzhong Zhang, Xiao Wang, Xinpeng Diao & Lizhuan Qi (2024). Comparative analysis of surface deformation monitoring in a mining area based on UAV‐lidar and UAV photogrammetry. The Photogrammetric Record. https://doi.org/10.1111/phor.12490
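The Gaussian analysis described above amounts to fitting a normal distribution to the per‐cell z differences of two co‐registered DEMs; the fitted standard deviation is the reported agreement figure. A minimal sketch under the assumption that the grids are already aligned (outlier handling and georeferencing omitted):

```python
import statistics

def dem_difference_stats(dem_a, dem_b):
    """Mean and sample standard deviation of per-cell z differences
    between two co-registered DEM grids (lists of rows of elevations),
    i.e. the parameters of a Gaussian fitted to the difference values."""
    diffs = [a - b
             for row_a, row_b in zip(dem_a, dem_b)
             for a, b in zip(row_a, row_b)]
    return statistics.mean(diffs), statistics.stdev(diffs)
```

A standard deviation below 0.36 m on such differences is what the paper reports as the agreement between the lidar‐ and photogrammetry‐derived surfaces.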
Baokun Feng, Sheng Nie, Cheng Wang, Jinliang Wang, Xiaohuan Xi, Haoyu Wang, Jieying Lao, Xuebo Yang, Dachao Wang, Yiming Chen, Bo Yang
Accurate and efficient registration of unmanned aerial vehicle light detection and ranging (UAV‐lidar) and terrestrial lidar (T‐lidar) data is crucial for forest structure parameter extraction. This study proposes a novel method based on a starburst pattern for the automatic registration of UAV‐lidar and T‐lidar data in forest scenes. It employs density‐based spatial clustering of applications with noise (DBSCAN) for individual tree identification, constructs starburst patterns separately from both lidar sources, and uses polar‐coordinate rotation and matching to achieve coarse registration. Fine registration is achieved using the iterative closest point (ICP) algorithm. Experimental results demonstrate that the starburst‐pattern‐based method achieves the desired registration accuracy (average coarse registration error of 0.157 m). Further optimisation with ICP yields slight improvements, with an average fine registration error of 0.149 m. Remarkably, the proposed method is insensitive to the number of detected individual trees once it exceeds 10, and tree position error has minimal impact on registration accuracy. Furthermore, our proposed method outperforms two existing methods for T‐lidar and UAV‐lidar registration in forest environments.
Baokun Feng, Sheng Nie, Cheng Wang, Jinliang Wang, Xiaohuan Xi, Haoyu Wang, Jieying Lao, Xuebo Yang, Dachao Wang, Yiming Chen & Bo Yang (2024). A novel method based on a starburst pattern to register UAV and terrestrial lidar point clouds in forest environments. The Photogrammetric Record. https://doi.org/10.1111/phor.12487
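The coarse step above, matching starburst patterns in polar coordinates, can be sketched as follows: each pattern is the set of bearing angles from a centre stem to its neighbouring trees, and the sought rotation is the one that best superimposes the two angle sets. This is a hypothetical brute‐force version of our own devising; the paper's matching and the subsequent ICP refinement are more sophisticated:

```python
import math

TWO_PI = 2.0 * math.pi

def starburst(center, trees):
    """Sorted bearing angles (in [0, 2*pi)) from a centre stem to
    neighbouring tree positions: the 'rays' of a starburst pattern."""
    return sorted(math.atan2(y - center[1], x - center[0]) % TWO_PI
                  for (x, y) in trees if (x, y) != center)

def best_rotation(angles_a, angles_b, step=math.radians(1.0)):
    """Brute-force the rotation aligning two starburst patterns, scoring
    by summed circular distance between the sorted angle lists."""
    def cost(rot):
        rotated = sorted((a + rot) % TWO_PI for a in angles_a)
        return sum(min(abs(r - b), TWO_PI - abs(r - b))
                   for r, b in zip(rotated, angles_b))
    return min((cost(i * step), i * step)
               for i in range(int(TWO_PI / step)))[1]
```

Because the score depends only on angles, the match is insensitive to the radial position error of individual trees, consistent with the paper's observation that tree position error has minimal impact.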
Building change detection has various applications, such as urban management and disaster assessment. Along with the exponential growth of remote sensing data and computing power, an increasing number of deep‐learning‐based building change detection methods have been proposed in recent years. The overwhelming majority of existing methods deal well with change detection of low‐rise buildings. By contrast, high‐rise buildings often present a large disparity in multitemporal high‐resolution remote sensing images, which degrades the performance of existing methods dramatically. To alleviate this problem, we propose a disparity‐aware Siamese network for detecting building changes in bi‐temporal high‐resolution remote sensing images. The proposed network uses a cycle‐alignment module to address the disparity problem at both the image and feature levels. A multi‐task learning framework with a joint semantic segmentation and change detection loss is used to train the entire deep network, including the cycle‐alignment module, in an end‐to‐end manner. Extensive experiments on three publicly available building change detection datasets demonstrate that our method achieves significant improvements on datasets with severe building disparity and, simultaneously, state‐of‐the‐art performance on datasets with minimal building disparity.
Yansheng Li, Xinwei Li, Wei Chen & Yongjun Zhang (2024). A disparity‐aware Siamese network for building change detection in bi‐temporal remote sensing images. The Photogrammetric Record. https://doi.org/10.1111/phor.12495
Erxin Xie, Na Chen, Genwei Zhang, Jiangtao Peng, Weiwei Sun
Transformers have achieved outstanding performance in hyperspectral image classification (HSIC) thanks to their effectiveness in modelling long‐range dependencies. However, most existing algorithms combine convolution with the transformer and use convolution for spatial–spectral information fusion, which cannot adequately learn the spatial–spectral fusion features of hyperspectral images (HSIs). To mine the rich spatial and spectral features, a two‐branch global spatial–spectral fusion transformer (GSSFT) model is designed in this paper, in which a spatial–spectral information fusion (SSIF) module fuses the features of the spectral and spatial branches. For the spatial branch, a local multiscale swin transformer (LMST) module is devised to obtain local–global spatial information about the samples, and a background filtering (BF) module is constructed to reduce the weights of irrelevant pixels. The information learned from the spatial and spectral branches is effectively fused to obtain the final classification results. Extensive experiments conducted on three HSI datasets show that the designed GSSFT method performs well compared with traditional convolutional neural network and transformer‐based methods.
Erxin Xie, Na Chen, Genwei Zhang, Jiangtao Peng & Weiwei Sun (2024). Two‐branch global spatial–spectral fusion transformer network for hyperspectral image classification. The Photogrammetric Record. https://doi.org/10.1111/phor.12491
Xiaohua Tong, Yi Gao, Zhen Ye, Huan Xie, Peng Chen, Haibo Shi, Ziqi Liu, Xianglei Liu, Yusheng Xu, Rong Huang, Shijie Liu
The dynamic measurement of the position and attitude of a long-distance moving object is a common requirement in ground testing for aerospace engineering. Because the object moves from far to near, and because of the limits of camera resolution, multi-binocular cameras must be used for segmented observation at different distances. However, achieving accurate and continuous position and attitude estimation is challenging. This paper therefore proposes a dynamic monitoring technique for long-distance movement based on a multi-binocular videogrammetric system. To handle the constantly changing image scale during motion, a scale-adaptive tracking method for circular targets is presented. Bundle adjustment (BA) with joint segments, using an adaptive-weighting least-squares strategy, is developed to enhance measurement accuracy. The feasibility and reliability of the proposed technique are validated by a ground test of relative measurement for spacecraft rendezvous and docking. The experimental results indicate that the proposed technique can recover the actual motion state of the moving object, with a positioning accuracy of 3.2 mm (root mean square error), providing reliable third-party verification for on-orbit measurement systems in ground testing. Compared with the results of BA with individual segments and of the vision measurement software PhotoModeler, the accuracy is improved by 45% and 30%, respectively.
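The adaptive-weighting least-squares idea can be sketched as iteratively reweighted least squares (IRLS): observations with large residuals receive smaller weights on each pass, so gross errors stop dominating the fit. The snippet below is a generic IRLS sketch with a simple 1/|r| downweighting, not the paper's joint-segment bundle adjustment; the function name and weighting rule are illustrative assumptions.

```python
import numpy as np

def adaptive_weight_lsq(A, b, n_iter=10, eps=1e-6):
    """Iteratively reweighted least squares.

    Observations with larger residuals get smaller weights
    (w = 1/|r|, clipped at 1/eps), a simple stand-in for the
    adaptive-weighting strategy used in joint-segment BA.
    """
    x = np.linalg.lstsq(A, b, rcond=None)[0]  # ordinary LSQ start
    for _ in range(n_iter):
        r = A @ x - b
        w = 1.0 / np.maximum(np.abs(r), eps)  # downweight outliers
        sw = np.sqrt(w)
        x = np.linalg.lstsq(sw[:, None] * A, sw * b, rcond=None)[0]
    return x

# Fit a line to data containing one gross outlier.
t = np.arange(10.0)
y = 2.0 * t + 1.0
y[7] += 50.0  # simulated blunder
A = np.column_stack([t, np.ones_like(t)])
slope, intercept = adaptive_weight_lsq(A, y)
# slope and intercept converge near the true values 2 and 1,
# whereas ordinary least squares would be pulled toward the outlier.
```

In a full bundle adjustment the same reweighting is applied to reprojection residuals across all segments jointly, which is what allows the segmented multi-binocular observations to be fused into one consistent trajectory.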
{"title":"Dynamic measurement of a long-distance moving object using multi-binocular high-speed videogrammetry with adaptive-weighting bundle adjustment","authors":"Xiaohua Tong, Yi Gao, Zhen Ye, Huan Xie, Peng Chen, Haibo Shi, Ziqi Liu, Xianglei Liu, Yusheng Xu, Rong Huang, Shijie Liu","doi":"10.1111/phor.12485","url":"https://doi.org/10.1111/phor.12485","journal":"The Photogrammetric Record","publicationDate":"2024-03-29"}
{"title":"Innsbruck Summer School of Alpine Research: Close‐Range Sensing Techniques in Alpine Terrain","doi":"10.1111/phor.8_12486","url":"https://doi.org/10.1111/phor.8_12486","journal":"The Photogrammetric Record","publicationDate":"2024-03-27"}