
ISPRS Journal of Photogrammetry and Remote Sensing: Latest Publications

Improving PRISMA hyperspectral spatial resolution and geolocation by using Sentinel-2: development and test of an operational procedure in urban and rural areas
IF 10.6 | Earth Science, CAS Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-07-08 | DOI: 10.1016/j.isprsjprs.2024.07.003
Giandomenico De Luca , Federico Carotenuto , Lorenzo Genesio , Monica Pepe , Piero Toscano , Mirco Boschetti , Franco Miglietta , Beniamino Gioli

Hyperspectral (HS) satellites like PRISMA (PRecursore IperSpettrale della Missione Applicativa) offer remarkable capabilities, yet they are constrained by a relatively coarse spatial resolution, curbing their efficacy in applications that require pinpoint accuracy. Here we propose a fusion process aimed at enhancing PRISMA HS spatial resolution by using the spatial and spectral information of Sentinel-2 multispectral (MS) data (HS-MS fusion), validated against four airborne HS flights acquired simultaneously with satellite overpasses over different land use distributions. Using the PRISMA panchromatic (PAN) image, the proposed solution was also compared with the results of an HS-PAN pansharpening process. A two-step operational workflow is proposed, based on two state-of-the-art open-source algorithms. The first step consists of geocoding PRISMA L2 products using Sentinel-2 as reference, accomplished with the phase-based algorithm implemented in AROSICS (Automated and Robust Open-Source Image Co-registration Software). The geometric displacement in L2 data was found to be between 80 m and 250 m, irregularly spatially distributed within and among scenes, and was corrected by means of thousands of regularly distributed tie points; a second-order polynomial transformation function was integrated into the algorithm. The second step employs the HySure (HS Super-resolution) fusion algorithm to perform both the HS-MS fusion and the HS-PAN pansharpening, returning improved PRISMA HS datasets with spatial resolutions of 10 m and 5 m, respectively. Four per-band accuracy metrics were used to evaluate both products against the airborne data. Overall, HS-MS data achieved increased accuracy in all validation metrics with respect to HS-PAN data: +28 % (root mean square error, RMSE), +23 % (spectral angle mapper, SAM), +7 % (peak signal-to-noise ratio, PSNR), and +11 % (universal image quality index, UIQI). These outcomes show that, by using the spectral information of Sentinel-2, both spectral and spatial patterns were reconstructed more consistently in three different urban and rural scenarios, avoiding the blur and at-edge artefacts of HS-PAN pansharpening, and therefore suggest an optimal strategy for satellite HS data resolution enhancement.
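For readers who want to reproduce the co-registration step, a minimal sketch of tie-point-based local co-registration with the open-source arosics package might look like the following; all file paths and the grid/window parameters are placeholders, not the settings used in the paper.

```python
# Sketch of step 1: co-registering a PRISMA L2 scene to a Sentinel-2 reference
# with tie-point-based local co-registration in arosics. All paths and the
# grid/window parameters are illustrative placeholders.
from arosics import COREG_LOCAL

coreg = COREG_LOCAL(
    "sentinel2_reference.tif",  # reference image (Sentinel-2, 10 m)
    "prisma_l2_target.tif",     # target image to be corrected (PRISMA L2)
    grid_res=200,               # tie-point grid spacing (target pixels)
    window_size=(256, 256),     # matching window for the phase correlation
    path_out="prisma_l2_coregistered.tif",
)
coreg.correct_shifts()          # estimate tie-point shifts and warp the target
# coreg.CoRegPoints_table holds the per-tie-point shift vectors
```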

Citations: 0
CodeUNet: Autonomous underwater vehicle real visual enhancement via underwater codebook priors
IF 10.6 | Earth Science, CAS Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-07-06 | DOI: 10.1016/j.isprsjprs.2024.06.009
Linling Wang , Xiaoyan Xu , Shunmin An , Bing Han , Yi Guo

Vision enhancement for autonomous underwater vehicles (AUVs) has received increasing attention and developed rapidly in recent years. However, existing methods based on prior knowledge struggle to adapt to all scenarios, while learning-based approaches lack paired datasets from real-world scenes, limiting their enhancement capabilities. Consequently, this severely hampers their generalization and application in AUVs. Besides, existing deep learning-based methods largely overlook the advantages of prior knowledge-based approaches. To address these issues, a novel architecture called CodeUNet is proposed in this paper. Instead of relying on physical scattering models, a real-world scene vision enhancement network based on a codebook prior is considered. First, a VQGAN is pretrained on underwater datasets to obtain a discrete codebook encapsulating the underwater priors (UPs). The decoder is equipped with a novel feature alignment module that effectively leverages underwater features to generate clean results. Then, the distance between the features and the matches is recalibrated by controllable matching operations, enabling better matching. Extensive experiments demonstrate that CodeUNet outperforms state-of-the-art methods in terms of visual quality and quantitative metrics. Testing results for geometric rotation, SIFT salient point detection, and edge detection applications are shown in this paper, providing strong evidence for the feasibility of CodeUNet in the field of autonomous underwater vehicles. Specifically, on the full-reference dataset, the proposed method outperforms most of the 14 state-of-the-art methods in four evaluation metrics, with an improvement of up to 3.7722 compared to MLLE. On the no-reference dataset, the proposed method achieves excellent results, with an improvement of up to 0.0362 compared to MLLE. Links to the dataset and code for this project can be found at: https://github.com/An-Shunmin/CodeUNet.
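The codebook prior at the core of this design can be illustrated with a short sketch of a VQGAN-style nearest-neighbor codebook lookup; tensor shapes and names are illustrative, not CodeUNet's actual implementation.

```python
# Minimal sketch of a VQGAN-style codebook lookup: each encoder feature vector
# is replaced by its nearest entry in a learned discrete codebook. Shapes and
# names are illustrative only.
import torch

def codebook_lookup(z, codebook):
    """z: (B, C, H, W) encoder features; codebook: (K, C) learned entries."""
    B, C, H, W = z.shape
    flat = z.permute(0, 2, 3, 1).reshape(-1, C)   # (B*H*W, C)
    d = torch.cdist(flat, codebook)               # pairwise distances (N, K)
    idx = d.argmin(dim=1)                         # nearest code per vector
    zq = codebook[idx].reshape(B, H, W, C).permute(0, 3, 1, 2)
    # straight-through estimator so gradients flow back to the encoder
    return z + (zq - z).detach(), idx.reshape(B, H, W)

z = torch.randn(2, 64, 32, 32)
codebook = torch.randn(1024, 64)
zq, indices = codebook_lookup(z, codebook)
```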

Citations: 0
ALCSF: An adaptive and anti-noise filtering method for extracting ground and top of canopy from ICESat-2 LiDAR data along single tracks
IF 10.6 | Earth Science, CAS Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-07-05 | DOI: 10.1016/j.isprsjprs.2024.07.002
Bingtao Chang, Hao Xiong, Yuan Li, Dong Pan, Xiaodong Cui, Wuming Zhang

The Ice, Cloud and Land Elevation Satellite-2 (ICESat-2) is an active spaceborne remote sensing system that utilizes photon-counting LiDAR to capture highly detailed information about under-canopy terrain and forest structure over vast spatial regions. It facilitates the accurate retrieval of terrain elevation and canopy height, critical for assessing the global carbon budget and understanding the role of forests in climate change mitigation. However, challenges arise from the characteristics of ICESat-2 photon-counting LiDAR data, such as their linear along-track distribution, extensive spatial coverage, and substantial residual noise. These characteristics hinder the performance of state-of-the-art methods for extracting ground or top of canopy when applied to ICESat-2 data, even though such methods perform well on airborne LiDAR data, which feature planar distribution, small coverage, and high signal-to-noise ratio. Consequently, this study proposes a novel algorithm termed Adaptive Linear Cloth Simulation Filtering (ALCSF) for the automated extraction of ground and top-of-canopy photons from ICESat-2 signal photons. The ALCSF algorithm innovatively introduces a cloth strip model as a reference to accommodate the distribution characteristics of ICESat-2 photons. Additionally, it employs a terrain-adaptive strategy that adjusts the rigidity of the cloth strip using terrain slope information, making ALCSF applicable to large-scale areas with significant topographical change. Furthermore, ALCSF addresses noise interference by simultaneously considering the movability of the cloth-strip particles and the photon distribution during iterative adjustments of the cloth strip. The performance of ALCSF is evaluated by comparing it with the ICESat-2 Land–Vegetation Along-Track Products (ATL08) across twelve datasets that encompass various times of day and scenes. In the results, ALCSF exhibits notable improvements over the ATL08 products, reducing the root mean square error (RMSE) of ground elevation by 21.8% and of canopy height by 25.8%, with superior performance in preserving terrain details. This highlights the significance of ALCSF as a valuable tool for enhancing the accuracy of ICESat-2 land and vegetation products, ultimately contributing to the estimation of the global carbon budget in future studies.
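To convey the cloth-simulation idea behind the method, the following toy sketch drops a 1-D "cloth strip" onto along-track photons; it is a didactic simplification, not the ALCSF algorithm, which additionally adapts the strip's rigidity to terrain slope and explicitly handles residual noise.

```python
# Toy 1-D "cloth strip" ground extraction for along-track photons: a strip of
# particles falls under gravity, collides with the lowest photons per bin, and
# an internal-tension (smoothing) term controls its rigidity. Purely didactic.
import numpy as np

def cloth_strip_ground(x, z, bin_size=5.0, rigidity=0.3, n_iter=200):
    edges = np.arange(x.min(), x.max() + bin_size, bin_size)
    idx = np.clip(np.digitize(x, edges) - 1, 0, len(edges) - 2)
    n_bins = len(edges) - 1
    floor = np.array([z[idx == b].min() if np.any(idx == b) else z.min()
                      for b in range(n_bins)])    # lowest photon per bin
    cloth = np.full(n_bins, z.max())              # strip starts above all photons
    for _ in range(n_iter):
        cloth -= 0.5                              # gravity step
        cloth = np.maximum(cloth, floor)          # collision with photons
        smooth = 0.5 * (np.roll(cloth, 1) + np.roll(cloth, -1))
        cloth = (1 - rigidity) * cloth + rigidity * smooth  # internal tension
        cloth = np.maximum(cloth, floor)
    return edges[:-1] + bin_size / 2, cloth       # bin centres, ground profile

x = np.random.uniform(0, 500, 2000)               # synthetic along-track distance
z = 5 * np.sin(x / 50) + np.abs(np.random.normal(0, 1.5, x.size))  # photons
centers, ground = cloth_strip_ground(x, z)
```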

Citations: 0
A flexible trajectory estimation methodology for kinematic laser scanning
IF 10.6 | Earth Science, CAS Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-07-03 | DOI: 10.1016/j.isprsjprs.2024.06.014
Florian Pöppl , Andreas Ullrich , Gottfried Mandlburger , Norbert Pfeifer

Kinematic laser scanning is a widely used surveying technique based on light detection and ranging (LiDAR) that enables efficient data acquisition by mounting the laser scanner on a moving platform. In order to obtain a georeferenced point cloud, the trajectory of the moving platform must be accurately known. To this end, most commercial laser scanning systems comprise an inertial measurement unit (IMU) and a global navigation satellite system (GNSS) receiver and antenna. Trajectory estimation is then the task of determining the platform's position and orientation by integrating measurements from the IMU, the GNSS, and possibly the laser scanner itself. Here, we present a comprehensive approach to trajectory estimation for kinematic laser scanning, based on a batch least-squares adjustment that incorporates pre-processed GNSS positions, raw IMU data, and plane-based LiDAR correspondences in a single estimation procedure. In comparison to the classic workflow of Kalman filtering followed by strip adjustment, this is a holistic approach with tight coupling of IMU and LiDAR. For the latter, we extend the data-derived stochastic model for the LiDAR plane observations with prior knowledge of the LiDAR measurement process. The proposed trajectory estimation approach is flexible and allows different system configurations as well as joint registration of multiple independent kinematic datasets. This is demonstrated with a practical example: a combined dataset consisting of two independent acquisitions from a crewed aircraft and an uncrewed aerial vehicle. All measurements from both datasets are jointly adjusted to obtain a single high-quality point cloud, without the need for ground control. The performance of this approach is evaluated in terms of point cloud consistency, precision, and accuracy, the latter by comparison to terrestrially surveyed reference data on the ground. The results show improved consistency, accuracy, and precision compared to a standard workflow, with the RMSE with respect to the reference surfaces reduced from 7.43 cm to 3.85 cm and the point-to-plane standard deviation on the surfaces reduced from 3.01 cm to 2.44 cm. Although a direct comparison to the state of the art can only be made with caution, we can state that the suggested method performs better in terms of point cloud consistency and precision, while at the same time achieving better absolute accuracy.
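The batch least-squares idea can be illustrated with a 1-D toy that jointly adjusts absolute GNSS-like fixes and relative IMU-like steps in a single optimization; the noise levels and measurement model are invented for illustration, not the paper's full IMU/GNSS/LiDAR formulation.

```python
# Minimal batch least-squares trajectory adjustment: fuse noisy absolute
# position fixes (GNSS-like) with precise relative displacements (IMU-like)
# in one joint optimization over all epochs. Illustrative 1-D toy only.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
truth = np.cumsum(rng.normal(1.0, 0.2, 100))           # true 1-D trajectory
gnss = truth + rng.normal(0, 0.5, truth.size)          # noisy absolute fixes
imu_delta = np.diff(truth) + rng.normal(0, 0.05, 99)   # precise relative steps

def residuals(p):
    r_abs = (p - gnss) / 0.5                # absolute residuals / sigma_gnss
    r_rel = (np.diff(p) - imu_delta) / 0.05 # relative residuals / sigma_imu
    return np.concatenate([r_abs, r_rel])

sol = least_squares(residuals, x0=gnss)     # initialize from GNSS positions
print("RMSE raw GNSS :", np.sqrt(np.mean((gnss - truth) ** 2)))
print("RMSE adjusted :", np.sqrt(np.mean((sol.x - truth) ** 2)))
```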

Citations: 0
SSG2: A new modeling paradigm for semantic segmentation
IF 10.6 | Earth Science, CAS Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-07-03 | DOI: 10.1016/j.isprsjprs.2024.06.011
Foivos I. Diakogiannis , Suzanne Furby , Peter Caccetta , Xiaoliang Wu , Rodrigo Ibata , Ondrej Hlinka , John Taylor

State-of-the-art models in semantic segmentation primarily operate on single, static images, generating corresponding segmentation masks. This one-shot approach leaves little room for error correction, as the models lack the capability to integrate multiple observations for enhanced accuracy. Inspired by work on semantic change detection, we address this limitation by introducing a methodology that leverages a sequence of observables generated for each static input image. By adding this "temporal" dimension, we exploit strong signal correlations between successive observations in the sequence to reduce error rates. Our framework, dubbed SSG2 (Semantic Segmentation Generation 2), employs a dual-encoder, single-decoder base network augmented with a sequence model. The base model learns to predict the set intersection, union, and difference of labels from dual-input images. Given a fixed target input image and a set of support images, the sequence model builds the predicted mask of the target by synthesizing the partial views from each sequence step and filtering out noise. We evaluate SSG2 across four diverse datasets: UrbanMonitor, featuring orthoimage tiles from Darwin, Australia, with four spectral bands at 0.2 m spatial resolution and a surface model; ISPRS Potsdam, which includes true orthophoto images with multiple spectral bands and a 5 cm ground sampling distance; ISPRS Vaihingen, which also includes true orthophoto images, at a 9 cm ground sampling distance; and ISIC2018, a medical dataset focused on skin lesion segmentation, particularly melanoma. The SSG2 model demonstrates rapid convergence within the first few tens of epochs and significantly outperforms UNet-like baseline models with the same number of gradient updates. However, the addition of the temporal dimension results in an increased memory footprint. While this could be a limitation, it is offset by the advent of higher-memory GPUs and coding optimizations. Our code is available at https://github.com/feevos/ssg2.
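The base model's training targets are simple to state: the set intersection, union, and difference of the label masks of a dual-input pair, as in the following NumPy sketch (mask contents are illustrative).

```python
# Set-theoretic training targets for a dual-input pair: intersection, union,
# and difference of the two binary label masks. Mask values are illustrative.
import numpy as np

target = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)   # labels of target image
support = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)  # labels of support image

inter = target & support    # pixels labeled in both inputs
union = target | support    # pixels labeled in either input
diff = target & ~support    # pixels labeled in the target only
```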

Citations: 0
Robust feature aggregation network for lightweight and effective remote sensing image change detection
IF 10.6 | Earth Science, CAS Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-07-02 | DOI: 10.1016/j.isprsjprs.2024.06.013
Zhi-Hui You, Si-Bao Chen, Jia-Xin Wang, Bin Luo

In the remote sensing (RS) image change detection (CD) task, many existing CD methods focus on improving accuracy, but they usually have more parameters, higher computational costs, and heavier memory usage. Designing a lightweight, performance-sustainable CD model that is more compatible with real-world applications is an urgent problem. We therefore propose a lightweight change detection network, called robust feature aggregation network (RFANet). To improve the representational capability of the weaker features extracted from a lightweight backbone, a feature reinforcement module (FRM) is proposed. FRM allows the current-level feature to interact densely and fuse with features at other levels, thus achieving the complementarity of fine-grained details and semantic information. Considering the many objects with rich correlations in RS images, we design a semantic split-aggregation module (SSAM) to better capture the global semantic information of changed objects. Besides, we present a lightweight decoder containing a channel interaction module (CIM), which allows multi-level refined difference features to emphasize changed areas and suppress background and pseudo-changes. Extensive experiments carried out on four challenging RS image CD datasets illustrate that RFANet achieves competitive performance with fewer parameters and lower computational costs. The source code is available at https://github.com/Youzhihui/RFANet.
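As an illustration of channel-wise interaction on difference features, the sketch below applies a generic squeeze-and-excitation-style gate to a bi-temporal difference map; it shows the general mechanism only and is not RFANet's exact CIM design.

```python
# Illustrative channel interaction on bi-temporal difference features: global
# pooling produces channel weights that re-emphasize changed areas. A generic
# squeeze-and-excitation-style sketch, not RFANet's actual module.
import torch
import torch.nn as nn

class ChannelInteraction(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                      # (B, C, 1, 1)
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, feat_t1, feat_t2):
        diff = torch.abs(feat_t1 - feat_t2)               # difference feature
        return diff * self.fc(diff)                       # channel-wise gating

cim = ChannelInteraction(64)
out = cim(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```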

Citations: 0
Towards long-term, high-accuracy, and continuous satellite total and fine-mode aerosol records: Enhanced Land General Aerosol (e-LaGA) retrieval algorithm for VIIRS
IF 10.6 | Earth Science, CAS Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-07-01 | DOI: 10.1016/j.isprsjprs.2024.06.022
Lunche Wang , Xin Su , Yi Wang , Mengdan Cao , Qin Lang , Huaping Li , Junyao Sun , Ming Zhang , Wenmin Qin , Lei Li , Leiku Yang

Long-term, accurate, stable, and continuous aerosol records from space are a major requirement for climate and atmospheric environment research. Because the period spanned by a single satellite mission is limited, building such records requires combining multiple satellite missions. In this study, a novel aerosol retrieval algorithm, named enhanced Land General Aerosol retrieval (e-LaGA), was developed for the Visible Infrared Imaging Radiometer Suite (VIIRS) to continue the Moderate Resolution Imaging Spectroradiometer (MODIS) aerosol record. e-LaGA utilizes a surface reflectance (SR) relationship model map to estimate the SR parameter more accurately, optimizes the a priori aerosol-type map, and improves the multi-band retrieval strategy using a residual-interpolation approach. Additionally, e-LaGA extends the retrieval ability of the traditional Dark Target (DT) algorithm over bright surfaces, and can more accurately retrieve Aerosol Optical Depth (AOD), Fine-mode AOD (AODF), and Fine-Mode Fraction (FMF). Validation against 533 AERONET sites worldwide shows that the number of AOD (AODF) matchups is approximately 80,000, the correlation coefficient (R) is 0.902 (0.879), and the fraction falling within the expected error envelope, ±(0.05 + 0.15τ), is 0.806 (0.804). An inter-comparison indicates that the accuracy of e-LaGA AOD retrievals is comparable to seven commonly used AOD products. The validation performance and spatial distribution patterns of e-LaGA AODF and FMF agree well with the POLDER (Polarization and Directionality of the Earth's Reflectances) products. e-LaGA is also applied to the MODIS sensors. The VIIRS e-LaGA AOD, AODF, and FMF retrievals are highly consistent with MODIS e-LaGA: their average AOD bias is 0.006, smaller than 0.024 for Deep Blue and 0.011 for DT. These preliminary results demonstrate the robustness of the e-LaGA algorithm and its potential for establishing a long-term climate record of total and fine-mode aerosol by combining multiple satellite missions, which is expected to reduce the uncertainty of climate change research.
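The validation statistics quoted above, the correlation coefficient and the fraction of matchups within the expected error envelope ±(0.05 + 0.15τ), can be computed as in the following sketch; the matchup arrays here are synthetic stand-ins for real AERONET/satellite pairs.

```python
# Sketch of the validation statistics: correlation and fraction of matchups
# within the expected error envelope +/-(0.05 + 0.15*tau). Synthetic data.
import numpy as np

rng = np.random.default_rng(1)
aeronet = rng.gamma(2.0, 0.15, 80000)                    # "truth" AOD tau
retrieved = aeronet + rng.normal(0, 0.04, aeronet.size)  # satellite retrieval

envelope = 0.05 + 0.15 * aeronet                         # per-matchup EE bound
within_ee = np.mean(np.abs(retrieved - aeronet) <= envelope)
r = np.corrcoef(retrieved, aeronet)[0, 1]
print(f"R = {r:.3f}, fraction within EE = {within_ee:.3f}")
```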

Citations: 0
Look at the whole scene: General point cloud place recognition by classification proxy
IF 10.6 | Earth Science, CAS Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-07-01 | DOI: 10.1016/j.isprsjprs.2024.06.017
Yue Xie , Bing Wang , Haiping Wang , Fuxun Liang , Wenxiao Zhang , Zhen Dong , Bisheng Yang

Deep learning models centered on retrieval have made significant strides in point cloud place recognition. However, existing approaches struggle to generate discriminative global descriptors and often rely on labor-intensive negative sample mining. Such constraints limit their usability in dynamic and open-world scenarios. To address these challenges, we introduce LAWS, a pioneering classification-centric neural framework that emphasizes looking at the whole scene for superior point cloud descriptor extraction. Central to our approach is the space partitioning design, constructed to provide holistic scene supervision and ensure comprehensive learning of scene features. To counteract potential ambiguities arising from a single orthogonal partition boundary, a complementary mechanism that repartitions space diagonally is specifically designed to dispel classification uncertainty. Under the enhanced partitioning mechanism, the space is separated into several classes and groups. Furthermore, to prevent knowledge forgetting between different groups, a special training strategy is employed that allows each group to be trained distinctly. Extensive experiments, encompassing both indoor and outdoor settings and different tasks, validate the generality of LAWS. It not only outperforms contemporary methods but also demonstrates a profound ability to generalize across various unseen environments and sensor modalities. Our method achieves a 2.6% higher average top-1 recall on the Oxford RobotCar dataset and a 7.8% higher average recall when generalized to the In-house dataset, compared with retrieval-based methods. Furthermore, LAWS also outperforms retrieval-based methods in terms of F1 score, with improvements of 12.7 and 29.2 on the MulRan and KITTI datasets, respectively. Notably, the average localization accuracy of LAWS in indoor environments reaches about 68.1%. Moreover, its scalability and efficiency place LAWS in a leading position for continuous exploration and long-term autonomy. Our code is available at https://github.com/BrusonX/LAWS.
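The classification-proxy idea can be sketched as follows: grid cells over the survey area act as class labels, and a diagonally shifted repartition gives places near an orthogonal cell boundary a second, unambiguous class. Grid size and label encoding below are arbitrary choices, not LAWS's exact partitioning.

```python
# Sketch of space partitioning as a classification proxy: an orthogonal grid
# assigns each place a class, and a diagonally shifted grid provides a
# complementary class that disambiguates places near orthogonal boundaries.
import numpy as np

def partition_labels(xy, cell=50.0):
    ij = np.floor(xy / cell).astype(int)                    # orthogonal grid
    ij_diag = np.floor((xy + cell / 2) / cell).astype(int)  # shifted grid
    label = ij[:, 0] * 10000 + ij[:, 1]                     # class per cell
    label_diag = ij_diag[:, 0] * 10000 + ij_diag[:, 1]      # complementary class
    return label, label_diag

xy = np.random.uniform(0, 1000, size=(8, 2))                # place centroids (m)
labels, labels_diag = partition_labels(xy)
```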

Citations: 0
Quick calibration of massive urban outdoor surveillance cameras
IF 10.6 | Earth Science, CAS Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-06-29 | DOI: 10.1016/j.isprsjprs.2024.06.012
Lin Shi , Xiaoji Lan , Xin Lan , Tianliang Zhang

The wide application of urban outdoor surveillance systems has greatly improved the efficiency of urban management and the social security index. However, most existing urban outdoor surveillance cameras lack records of important parameters such as geospatial coordinates, field-of-view angle, and lens distortion, which complicates the unified management and layout optimization of the cameras, the geospatial analysis of video data, and computer vision applications such as trajectory tracking of moving targets. To address this problem, this paper designs a marker with a chessboard pattern and a positioning device. The marker is moved through outdoor space on vehicles and other mobile carriers, and the marker images captured by the surveillance cameras, together with the spatial position information from the positioning device, are used to calibrate outdoor surveillance cameras in batches and to compute their geospatial coordinates and field-of-view angles. This achieves rapid acquisition of the cameras' important parameters and provides a new method for the quick calibration of urban outdoor surveillance cameras, contributing to the information-based management of urban surveillance resources and to the spatial analysis and computation of surveillance video data, so that they can play a greater role in smart transportation and smart city applications. Taking the outdoor surveillance cameras within 2.5 km² of a city as an example, calibration tests were performed on 295 surveillance cameras in the test area, and the geospatial coordinates, field-of-view angles, and lens parameters of 269 cameras were obtained. The average error of the spatial position was 0.527 m (maximum 1.573 m) and the average error of the field-of-view angle was 1.63° (maximum 3.4°), verifying the effectiveness and accuracy of the proposed method.
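The core estimation chain (chessboard detection, intrinsic calibration, pose recovery) can be sketched with standard OpenCV calls; the paths, board dimensions, and square size are placeholders, and the final geo-referencing step that combines the recovered pose with the positioning-device coordinates is only indicated in a comment.

```python
# Sketch: detect the chessboard marker in several frames, calibrate intrinsics
# and lens distortion, then recover the camera pose from one frame. Combining
# this pose with the marker's geospatial position (from the positioning
# device) would yield the camera's geospatial coordinates. Paths and the
# board geometry are placeholders.
import cv2
import numpy as np

pattern = (9, 6)                    # inner corners of the chessboard
square = 0.10                       # square size in metres (placeholder)
obj = np.zeros((pattern[0] * pattern[1], 3), np.float32)
obj[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_pts, img_pts = [], []
for path in ["frame_000.jpg", "frame_001.jpg", "frame_002.jpg"]:
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    ok, corners = cv2.findChessboardCorners(gray, pattern)
    if ok:
        obj_pts.append(obj)
        img_pts.append(corners)

# intrinsics K and lens distortion from all detections
_, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)

# pose of the camera relative to the marker in one frame
ok, rvec, tvec = cv2.solvePnP(obj_pts[0], img_pts[0], K, dist)
R, _ = cv2.Rodrigues(rvec)
cam_in_marker = -R.T @ tvec         # camera centre in marker coordinates
```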

Citations: 0
Empowering lightweight detectors: Orientation Distillation via anti-ambiguous spatial transformation for remote sensing images
IF 10.6 | Earth Science, CAS Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-06-29 | DOI: 10.1016/j.isprsjprs.2024.05.023
Yidan Zhang , Wenhui Zhang , Junxi Li , Xiyu Qi , Xiaonan Lu , Lei Wang , Yingyan Hou

Knowledge distillation (KD) is one of the most promising methods for implementing a lightweight detector, which plays a significant role in satellite in-orbit processing and unmanned aerial vehicle tracking. However, existing distillation paradigms exhibit limited accuracy in detecting arbitrarily oriented objects represented with rotated bounding boxes in remote sensing images. This issue is attributed to two aspects: (i) boundary-discontinuity localization distillation, caused by the angle periodicity of rotated bounding boxes, and (ii) spatially ossified feature distillation, induced by orientation-agnostic knowledge-transfer regions, both of which contribute to ambiguous orientation estimation of objects. To address these issues, we propose an effective KD method called Orientation Distillation (OD) via anti-ambiguous spatial transformation, which consists of two modules. (i) The Anti-ambiguous Location Prediction (ALP) module reformulates the regression transformation between teacher and student bounding boxes as a Gaussian-distribution fitting procedure; these distributions, carrying distilled potential, are optimized to accurately localize objects with the aid of a boundary-continuity cost. (ii) The Orientation-guided Feature Calibration (OFC) module employs a learnable affine matrix to augment the fixed CNN sampling grid into a spatially remapped one, which bridges the multi-scale features of teacher and student to effectively deliver refined orientation awareness within adaptively selected distillation regions. Overall, OD customizes the spatial transformation of the bounding-box representation and the sampling grid to transfer anti-ambiguous orientation knowledge, and significantly improves the performance of lightweight detectors on non-axially arranged objects. Extensive experiments on multiple datasets demonstrate that our plug-and-play distillation framework achieves state-of-the-art performance. Codes are available at https://github.com/Molly6/OD.
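The Gaussian reformulation in ALP builds on a standard construction: a rotated box (cx, cy, w, h, θ) maps to a 2-D Gaussian whose covariance absorbs the angle, so two parameterizations that differ only by swapping width and height and rotating 90° become identical distributions. A sketch of this general construction (not OD's exact formulation) follows.

```python
# Represent a rotated box (cx, cy, w, h, theta) as a 2-D Gaussian N(mu, Sigma):
# mu is the box centre, and Sigma = R diag(w^2/4, h^2/4) R^T encodes extent and
# orientation. Boxes related by the (w, h, theta) <-> (h, w, theta + 90 deg)
# ambiguity map to the same Gaussian, removing the boundary discontinuity.
import numpy as np

def rbox_to_gaussian(cx, cy, w, h, theta):
    mu = np.array([cx, cy])
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    S = np.diag([w / 2, h / 2]) ** 2    # squared half-extents
    return mu, R @ S @ R.T

# a box at 30 deg and the "same" box with swapped w/h at 120 deg
mu1, S1 = rbox_to_gaussian(0, 0, 4, 2, np.deg2rad(30))
mu2, S2 = rbox_to_gaussian(0, 0, 2, 4, np.deg2rad(120))
print(np.allclose(S1, S2))              # True: the ambiguity is removed
```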

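Module (ii) reads like a spatial-transformer step. Here is a minimal sketch, assuming the learnable affine matrix warps the student's sampling grid before feature imitation, and that teacher and student features have already been projected to the same shape; the class and parameter names are hypothetical, not taken from the authors' code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineFeatureCalibration(nn.Module):
    """Learnable affine remapping of the student feature map before imitation."""
    def __init__(self):
        super().__init__()
        # 2x3 affine matrix, initialised to the identity (no remapping)
        self.theta = nn.Parameter(torch.tensor([[1., 0., 0.],
                                                [0., 1., 0.]]))

    def forward(self, student_feat, teacher_feat):
        # student_feat, teacher_feat: (N, C, H, W), assumed shape-matched
        n = student_feat.size(0)
        theta = self.theta.unsqueeze(0).expand(n, -1, -1)
        # Spatially remapped sampling grid over the student feature map
        grid = F.affine_grid(theta, student_feat.size(), align_corners=False)
        warped = F.grid_sample(student_feat, grid, align_corners=False)
        # Feature-imitation loss; the paper restricts this to adaptively
        # selected distillation regions, simplified here to the whole map
        return F.mse_loss(warped, teacher_feat)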
{"title":"Empowering lightweight detectors: Orientation Distillation via anti-ambiguous spatial transformation for remote sensing images","authors":"Yidan Zhang ,&nbsp;Wenhui Zhang ,&nbsp;Junxi Li ,&nbsp;Xiyu Qi ,&nbsp;Xiaonan Lu ,&nbsp;Lei Wang ,&nbsp;Yingyan Hou","doi":"10.1016/j.isprsjprs.2024.05.023","DOIUrl":"https://doi.org/10.1016/j.isprsjprs.2024.05.023","url":null,"abstract":"<div><p>Knowledge distillation (KD) has been one of the most potential methods to implement a lightweight detector, which plays a significant role in satellite in-orbit processing and unmanned aerial vehicle tracking. However, existing distillation paradigms exhibit limited accuracy in detecting arbitrary-oriented objects represented with rotated bounding boxes in remote sensing images. This issue is attributed to two aspects: (i) boundary discontinuity localization distillation, caused by angle periodicity of rotated bounding boxes, and (ii) spatial ossified feature distillation, induced by orientation-agnostic knowledge transitive regions, both of which contribute to ambiguous orientation estimation of objects. To address these issues, we propose an effective KD method called Orientation Distillation (OD) via anti-ambiguous spatial transformation, which consists of two modules. (i) Anti-ambiguous Location Prediction (ALP) module reformulates the regression transformation between teacher–student bounding boxes as Gaussian distributions fitting procedure. These distributions with distilled potential are optimized to accurately localize objects with the aid of boundary continuity cost. (ii) Orientation-guided Feature Calibration (OFC) module employs a learnable affine matrix to augment fixed CNN sampling grid into a spatially remapped one, which bridges between the multi-scale feature of teacher and student for effectively delivering the refined oriented awareness within adaptively distillation regions. Overall, OD customizes the spatial transformation of bounding box representation and sampling grid to transfer anti-ambiguous orientation knowledge, and significantly improves the performance of lightweight detectors upon non-axially arranged objects. Extensive experiments on multiple datasets demonstrate that our plug-and-play distillation framework achieves state-of-the-art performance. Codes are available at <span>https://github.com/Molly6/OD</span><svg><path></path></svg>.</p></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":10.6,"publicationDate":"2024-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141483244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0