
Latest publications in ISPRS Journal of Photogrammetry and Remote Sensing

Variational Autoencoder with Gaussian Random Field prior: Application to unsupervised animal detection in aerial images
IF 10.6; CAS Tier 1 (Earth Science); Q1 GEOGRAPHY, PHYSICAL; Pub Date: 2024-10-03; DOI: 10.1016/j.isprsjprs.2024.09.028
Hugo Gangloff , Minh-Tan Pham , Luc Courtrai , Sébastien Lefèvre
In real-world datasets of aerial images, the objects of interest are often missing, hard to annotate, and of varying appearance. The framework of unsupervised Anomaly Detection (AD) is highly relevant in this context, and Variational Autoencoders (VAEs), a family of popular probabilistic models, are often used. We build on the literature of VAEs for AD in order to take advantage of the particular textures that appear in natural aerial images. More precisely, we propose a new VAE model with a Gaussian Random Field (GRF) prior (VAE-GRF), which generalizes the classical VAE model, and we provide the procedures and hypotheses required for the model to remain tractable. We show that, under some assumptions, the VAE-GRF largely outperforms the traditional VAE and some other probabilistic models developed for AD. Our results suggest that the VAE-GRF could be used as a relevant VAE baseline in place of the traditional VAE at very limited additional computational cost. We provide competitive results on the MVTec reference dataset for visual inspection, and on two other datasets dedicated to unsupervised animal detection in aerial images.
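The GRF prior replaces the usual i.i.d. standard-normal latent prior with a spatially correlated Gaussian, which changes only the KL term of the VAE objective. A minimal numpy sketch of that term, assuming an exponential covariance kernel on the latent grid (the kernel choice, length scale, and grid size are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def grf_prior_cov(h, w, length_scale=2.0, eps=1e-6):
    """Covariance of a stationary GRF prior over an h x w latent grid.

    Uses an exponential kernel exp(-d / length_scale); the kernel and its
    length scale are illustrative assumptions, not the paper's values.
    """
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-d / length_scale) + eps * np.eye(h * w)

def kl_q_vs_grf(mu, logvar, cov_p):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, cov_p) )."""
    k = mu.size
    var_q = np.exp(logvar)
    cov_p_inv = np.linalg.inv(cov_p)
    _, logdet_p = np.linalg.slogdet(cov_p)
    term_trace = float(np.sum(np.diag(cov_p_inv) * var_q))  # tr(cov_p^-1 Sigma_q)
    term_mean = float(mu @ cov_p_inv @ mu)
    logdet_q = float(np.sum(logvar))
    return 0.5 * (term_trace + term_mean - k + logdet_p - logdet_q)
```

With `cov_p` set to the identity this reduces to the standard VAE KL term, which is one way to see that the GRF prior generalizes the classical VAE.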
Volume 218, Pages 600–609.
Citations: 0
OR-LIM: Observability-aware robust LiDAR-inertial-mapping under high dynamic sensor motion
IF 10.6; CAS Tier 1 (Earth Science); Q1 GEOGRAPHY, PHYSICAL; Pub Date: 2024-10-03; DOI: 10.1016/j.isprsjprs.2024.09.036
Yangzi Cong , Chi Chen , Bisheng Yang , Ruofei Zhong , Shangzhe Sun , Yuhang Xu , Zhengfei Yan , Xianghong Zou , Zhigang Tu
Light Detection And Ranging (LiDAR) technology has provided an impactful way to capture 3D data. However, consistent mapping in sensing-degenerated and perceptually limited scenes (e.g., multi-story buildings) or under highly dynamic sensor motion (e.g., a rotating platform) remains a significant challenge. In this paper, we present OR-LIM, a novel observability-aware LiDAR-inertial-mapping system. Essentially, it combines a robust real-time LiDAR-inertial-odometry (LIO) module with an efficient surfel-map-smoothing (SMS) module that seamlessly optimizes the sensor poses and scene geometry at the same time. To improve robustness, planar surfels are hierarchically generated and grown from point cloud maps to provide reliable correspondences for fixed-lag optimization. Moreover, the normals of the surfels are analyzed to evaluate the observability of each frame. To maintain global consistency, a factor graph is utilized that integrates information from IMU propagation, the LIO, and the SMS. The system is extensively tested on datasets collected by a low-cost multi-beam LiDAR (MBL) mounted on a rotating platform. Experiments with various settings of sensor motion, conducted on complex multi-story buildings and large-scale outdoor scenes, demonstrate the superior performance of our system over multiple state-of-the-art methods. Relative to the collected Terrestrial Laser Scanning (TLS) reference map, point accuracy improves by 3.39–13.6 % outdoors (8.71 % on average) and by 1.89–15.88 % indoors (9.09 % on average).
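The per-frame observability evaluation from surfel normals can be illustrated with a small eigenvalue analysis: for point-to-plane residuals, the Hessian with respect to translation is the sum of outer products of the normals, so a near-zero eigenvalue flags an unconstrained translation direction (as in a long corridor). A hedged numpy sketch; the threshold and normalization are illustrative, not the paper's:

```python
import numpy as np

def translation_observability(normals, threshold=0.1):
    """Score translational observability of a scan from planar-surfel normals.

    Builds H = (1/n) * sum n_i n_i^T, the point-to-plane Hessian w.r.t.
    translation. The smallest eigenvalue measures how well the weakest
    direction is constrained; its eigenvector is that direction.
    The threshold is an illustrative value, not taken from the paper.
    """
    N = np.asarray(normals, dtype=float)
    N /= np.linalg.norm(N, axis=1, keepdims=True)
    H = N.T @ N / len(N)
    eigvals, eigvecs = np.linalg.eigh(H)        # ascending eigenvalues
    degenerate = eigvals[0] < threshold
    return eigvals, eigvecs[:, 0], degenerate

# A corridor: normals only along x and y, so z translation is unobservable.
corridor = [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0]]
eigvals, weak_dir, degenerate = translation_observability(corridor)
```

Adding a floor surfel (normal `[0, 0, 1]`) to the scene removes the degeneracy, which is the intuition behind weighting frames by their observability before fusing them in the factor graph.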
Volume 218, Pages 610–627.
Citations: 0
Re-evaluating winter carbon sink in Southern Ocean by recovering MODIS-Aqua chlorophyll-a product at high solar zenith angles
IF 10.6; CAS Tier 1 (Earth Science); Q1 GEOGRAPHY, PHYSICAL; Pub Date: 2024-10-02; DOI: 10.1016/j.isprsjprs.2024.09.033
Ke Zhang , Zhaoru Zhang , Jianfeng He , Walker O. Smith , Na Liu , Chengfeng Le
Satellite ocean color observations are extensively utilized in global carbon sink evaluation. However, the valid coverage of chlorophyll-a concentration (Chla, mg m−3) measurements from these observations is severely limited during autumn and winter in high-latitude oceans. The high solar zenith angle (SZA) is one of the primary contributors to the reduced quality of Chla products in the high-latitude Southern Ocean during these seasons. This study addresses this challenge by employing a random forest-based regression ensemble (RFRE) method to enhance the quality of Moderate Resolution Imaging Spectroradiometer (MODIS) Chla products affected by high-SZA conditions. The RFRE model incorporates the color index (CI), band-ratio index (R), SZA, sensor zenith angle (senz), and Rayleigh-corrected reflectance at 869 nm (Rrc(869)) as predictors. The results indicate that the RFRE model significantly increased MODIS Chla coverage (by a factor of 1.03 to 3.24) in high-latitude Southern Ocean regions while matching the quality of the standard Chla products. Applying the recovered Chla to re-evaluate the carbon sink in the Southern Ocean showed that the Southern Ocean's ability to absorb carbon dioxide (CO2) in winter has been underestimated (by 5.9–18.6 Tg C year−1) in previous assessments. This study underscores the significance of improving Chla products for a more accurate estimation of the winter carbon sink in the Southern Ocean.
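The recovery model is a supervised regression from five per-pixel predictors to Chla. A self-contained sketch of assembling that predictor stack on synthetic data, with ordinary least squares standing in for the random-forest regression ensemble (the value ranges and the linear generative relation are illustrative assumptions, not measured data):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Synthetic stand-ins for the five RFRE predictors (illustrative ranges only).
ci = rng.uniform(-0.005, 0.02, n)       # color index (CI)
r = rng.uniform(0.3, 3.0, n)            # band-ratio index (R)
sza = rng.uniform(55.0, 80.0, n)        # solar zenith angle, degrees
senz = rng.uniform(0.0, 60.0, n)        # sensor zenith angle, degrees
rrc869 = rng.uniform(0.0, 0.05, n)      # Rayleigh-corrected reflectance at 869 nm

X = np.column_stack([ci, r, sza, senz, rrc869])

# Assumed linear generative relation plus noise, just to have a target.
chla = 0.2 + 40.0 * ci + 0.15 * r - 0.002 * sza + rng.normal(0, 0.01, n)

# Stand-in for the RF regression ensemble: ordinary least squares on the
# same predictor stack (the paper trains a random forest instead).
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, chla, rcond=None)
pred = A @ coef
rmse = float(np.sqrt(np.mean((pred - chla) ** 2)))
```

In practice one would swap the least-squares fit for a random forest (e.g. an ensemble of regression trees) trained on pixels where standard-quality Chla retrievals exist, then apply it to the high-SZA pixels.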
Volume 218, Pages 588–599.
Citations: 0
A new methodology for establishing an SOC content prediction model that is spatiotemporally transferable at multidecadal and intercontinental scales
IF 10.6; CAS Tier 1 (Earth Science); Q1 GEOGRAPHY, PHYSICAL; Pub Date: 2024-10-02; DOI: 10.1016/j.isprsjprs.2024.09.038
Xiangtian Meng , Yilin Bao , Chong Luo , Xinle Zhang , Huanjun Liu
Quantifying and tracking the soil organic carbon (SOC) content is a key step toward long-term terrestrial ecosystem monitoring. Over the past decade, numerous models have been proposed and have achieved promising results for predicting SOC content. However, many of these studies are confined to specific temporal or spatial contexts, neglecting model transferability. Temporal transferability refers to a model's ability to be applied across different periods, while spatial transferability relates to its applicability across diverse geographic locations. Therefore, developing a new methodology to establish a prediction model with high spatiotemporal transferability for SOC content is critically important. In this study, two large intercontinental study areas were selected, and measured topsoil (0–20 cm) sample data, 27,059 cloudless Landsat 5/8 images, digital elevation models, and climate data were acquired for 3 periods. Based on these data, monthly average climate data, monthly average data reflecting soil properties, and topography data were calculated as original input (OI) variables. We established an innovative multivariate deep learning model with high spatiotemporal transferability, combining the advantages of the attention mechanism, a graph neural network, and a long short-term memory network (A-GNN-LSTM). Additionally, the spatiotemporal transferability of the A-GNN-LSTM and of commonly used prediction models was compared. Finally, the abilities of the OI variables and of the OI variables processed by feature engineering (FEI) for different SOC prediction models were explored. The results show that 1) the A-GNN-LSTM that used OI as the input variable was the optimal prediction model (RMSE = 4.86 g kg−1, R² = 0.81, RPIQ = 2.46, and MAE = 3.78 g kg−1) with the highest spatiotemporal transferability. 2) Compared to the GNN, the A-GNN-LSTM demonstrates superior temporal transferability (ΔR²ₜ = −0.10 vs. −0.07). Furthermore, compared to the LSTM, the A-GNN-LSTM shows enhanced spatial transferability (ΔR²ₛ = −0.16 vs. −0.09). These findings strongly suggest that fusing geospatial context with temporally dependent information, extracted through the integration of the GNN and LSTM models, effectively enhances the spatiotemporal transferability of the models. 3) By introducing the attention mechanism, the weights of the different input variables could be calculated, increasing the physical interpretability of the deep learning model. The largest weight was assigned to climate data (39.55 %) and the smallest to vegetation (19.96 %). 4) Among the commonly used prediction models, the deep learning model had higher prediction accuracy (RMSE = 6.64 g kg−1, R² = 0.64, RPIQ = 1.78, and MAE = 4.78 g kg−1) and spatial transferability (ΔRMSEₛ = 1.43 g kg−1, ΔR²ₛ = −0.13, ΔRPIQₛ = −0.50, and ΔMAEₛ = 1.09 g kg−1), while the linear model had higher temporal transferability (ΔRMSEₜ = 1.46 g kg−1, ΔR²ₜ = −0.14, ΔRPIQₜ = −0.45, and ΔMAEₜ = 1.29 g kg−1). 5) The deep learning models must use OI, whereas the linear and traditional machine learning models must use FEI, to achieve higher prediction accuracy. This study is an important step toward integrating multiple deep learning models to establish SOC prediction models with high spatiotemporal transferability.
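The four accuracy statistics quoted throughout this abstract (RMSE, R², RPIQ, MAE) can be computed as below. RPIQ is taken as the interquartile range of the observed values divided by RMSE, the usual definition in soil spectroscopy (an assumption, since the abstract does not define it):

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """RMSE, R^2, RPIQ (IQR / RMSE) and MAE for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    q1, q3 = np.percentile(y_true, [25, 75])
    rpiq = float((q3 - q1) / rmse)          # assumes rmse > 0
    mae = float(np.mean(np.abs(resid)))
    return {"RMSE": rmse, "R2": r2, "RPIQ": rpiq, "MAE": mae}
```

The Δ-prefixed values in the abstract are differences of these metrics between a source-domain fit and a transfer (different period or region), so a smaller degradation means better transferability.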
Volume 218, Pages 531–550.
Citations: 0
Automated localization of dike leakage outlets using UAV-borne thermography and YOLO-based object detectors
IF 10.6; CAS Tier 1 (Earth Science); Q1 GEOGRAPHY, PHYSICAL; Pub Date: 2024-10-02; DOI: 10.1016/j.isprsjprs.2024.09.039
Renlian Zhou , Monjee K. Almustafa , Moncef L. Nehdi , Huaizhi Su
Leakage-induced soil erosion is a major cause of dike failure, particularly during floods. Timely detection of leakage outlets and notification of dike management are crucial for ensuring dike safety. However, manual inspection, currently the main approach for identifying leakage outlets, is costly, inefficient, and lacks spatial coverage. To achieve efficient and automatic localization of dike leakage outlets, an innovative strategy combining drones, infrared thermography, and deep learning is presented. Drones are employed for sensing the dikes' surface. Real-time images from these drones are sent to a server where well-trained detectors are deployed. Once a leakage outlet is detected, an alarm is remotely sent to dike managers. To realize this strategy, 4 thermal imagers were employed to image leakage outlets of several model dikes and actual dikes. 9,231 hand-labeled thermal images containing 13,387 leakage objects were selected for analysis. 19 detectors were trained using transfer learning. The best detector achieved a mean average precision of 95.8 % on the challenging test set. A full-scale embankment was constructed for leakage-outlet detection tests. Various field tests confirmed the efficiency of the proposed leakage-outlet localization method. Under some difficult conditions, the trained detector also clearly outperformed manual judgement. Results indicate that under typical circumstances, the localization error of the proposed method is within 5 m, demonstrating its practical reliability. Finally, the influencing factors and limits of the suggested strategy are thoroughly examined.
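The mean-average-precision score reported for the detectors is built on matching predicted boxes to ground-truth boxes by intersection-over-union (IoU). A minimal IoU helper for axis-aligned boxes, the basic building block of that evaluation (plain Python, independent of the specific YOLO variants used in the paper):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes [x1, y1, x2, y2]."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (clamped to zero if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1)
             - inter)
    return inter / union if union > 0 else 0.0
```

A detection typically counts as a true positive when its IoU with an unmatched ground-truth box exceeds a threshold (0.5 is the common choice); average precision is then the area under the resulting precision-recall curve, averaged over classes for mAP.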
Volume 218, Pages 551–573.
Citations: 0
ASANet: Asymmetric Semantic Aligning Network for RGB and SAR image land cover classification
IF 10.6; CAS Tier 1 (Earth Science); Q1 GEOGRAPHY, PHYSICAL; Pub Date: 2024-10-02; DOI: 10.1016/j.isprsjprs.2024.09.025
Pan Zhang , Baochai Peng , Chaoran Lu , Quanjin Huang , Dongsheng Liu
Synthetic Aperture Radar (SAR) images have proven to be a valuable cue for multimodal Land Cover Classification (LCC) when combined with RGB images. Most existing studies on cross-modal fusion assume that consistent feature information is necessary between the two modalities, and as a result, they construct networks without adequately addressing the unique characteristics of each modality. In this paper, we propose a novel architecture, named the Asymmetric Semantic Aligning Network (ASANet), which introduces asymmetry at the feature level to address the issue that multi-modal architectures frequently fail to fully utilize complementary features. The core of this network is the Semantic Focusing Module (SFM), which explicitly calculates differential weights for each modality to account for the modality-specific features. Furthermore, ASANet incorporates a Cascade Fusion Module (CFM), which delves deeper into channel and spatial representations to efficiently select features from the two modalities for fusion. Through the collaborative effort of these two modules, the proposed ASANet effectively learns feature correlations between the two modalities and eliminates noise caused by feature differences. Comprehensive experiments demonstrate that ASANet achieves excellent performance on three multimodal datasets. Additionally, we have established a new RGB-SAR multimodal dataset, on which our ASANet outperforms other mainstream methods with improvements ranging from 1.21% to 17.69%. The ASANet runs at 48.7 frames per second (FPS) when the input image is 256 × 256 pixels.
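The Semantic Focusing Module's key property is that each modality gets its own differential channel weights rather than a shared gate. A toy numpy sketch of that asymmetric re-weighting, assuming squeeze-and-excitation-style gating from channel-wise global average pooling (the gate design and the stand-in weight matrices are assumptions, not the published SFM):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def semantic_focus(rgb_feat, sar_feat, w_rgb, w_sar):
    """Toy per-modality differential channel weighting.

    Each modality's gate is computed from its own channel descriptor
    (global average pooling), so RGB and SAR are re-weighted asymmetrically
    before fusion. w_rgb / w_sar are illustrative stand-ins for learned
    projection matrices; this is a sketch, not the paper's module.
    """
    def gate(feat, w):
        pooled = feat.mean(axis=(1, 2))          # (C,) channel descriptor
        return sigmoid(w @ pooled)               # (C,) channel weights in (0, 1)

    g_rgb = gate(rgb_feat, w_rgb)
    g_sar = gate(sar_feat, w_sar)
    fused = (g_rgb[:, None, None] * rgb_feat
             + g_sar[:, None, None] * sar_feat)
    return fused, g_rgb, g_sar

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
rgb = rng.normal(size=(C, H, W))
sar = rng.normal(size=(C, H, W))
fused, g_rgb, g_sar = semantic_focus(rgb, sar,
                                     rng.normal(size=(C, C)),
                                     rng.normal(size=(C, C)))
```

The point of the asymmetry is that a symmetric (shared-weight) gate would force both modalities through the same importance profile, discarding exactly the modality-specific cues the paper argues SAR contributes.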
Volume 218, Pages 574–587.
VNI-Net: Vector neurons-based rotation-invariant descriptor for LiDAR place recognition
IF 10.6 | Earth Science Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-10-01 | DOI: 10.1016/j.isprsjprs.2024.09.011
Gengxuan Tian, Junqiao Zhao, Yingfeng Cai, Fenglin Zhang, Xufei Wang, Chen Ye, Sisi Zlatanova, Tiantian Feng
Despite the emergence of various LiDAR-based place recognition methods, the challenge of place recognition failure due to rotation remains critical. Existing studies have attempted to address this limitation through specific training strategies involving data augmentation and rotation-invariant networks. However, augmenting 3D rotations (SO(3)) is impractical for the former, while the latter primarily focuses on the reduced problem of 2D rotation (SO(2)) invariance. Existing methods targeting SO(3) rotation invariance suffer from limitations in discriminative capability. In this paper, we propose a novel approach (VNI-Net) based on the Vector Neurons Network (VNN) to achieve SO(3) rotation invariance. Our method begins by extracting rotation-equivariant features from neighboring points and projecting these low-dimensional features into a high-dimensional space using VNN. We then compute both Euclidean and cosine distances in the rotation-equivariant feature space to obtain rotation-invariant features. Finally, we aggregate these features using generalized-mean (GeM) pooling to generate the global descriptor. To mitigate the significant information loss associated with formulating rotation-invariant features, we propose computing distances between features at different layers within the Euclidean space neighborhood. This approach significantly enhances the discriminability of the descriptors while maintaining computational efficiency. We conduct experiments across multiple publicly available datasets captured with vehicle-mounted, drone-mounted, and handheld LiDAR sensors. VNI-Net outperforms baseline methods by up to 15.3% in datasets with rotation, while achieving results comparable to state-of-the-art place recognition methods in datasets with less rotation. Our code is open-sourced at https://github.com/tiev-tongji/VNI-Net.
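The property the descriptor rests on — Euclidean and cosine distances between rotation-equivariant vector features are unchanged when every vector channel is rotated by the same SO(3) matrix — can be verified numerically. This is a toy NumPy check with assumed feature shapes, not the VNI-Net implementation:

```python
import numpy as np

def random_rotation(rng):
    # QR decomposition of a Gaussian matrix yields a random orthogonal matrix;
    # flip one column's sign if needed to force det = +1 (a proper rotation).
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1
    return q

def invariant_features(V, W):
    """V, W: (C, 3) vector-neuron features (C channels, one 3D vector each).

    Channel-wise Euclidean distances and cosine similarities are invariant to
    a shared rotation R, because ||Rv - Rw|| = ||v - w|| and (Rv)·(Rw) = v·w.
    """
    eucl = np.linalg.norm(V - W, axis=1)
    cos = np.sum(V * W, axis=1) / (np.linalg.norm(V, axis=1) * np.linalg.norm(W, axis=1))
    return np.concatenate([eucl, cos])

rng = np.random.default_rng(0)
V, W = rng.normal(size=(8, 3)), rng.normal(size=(8, 3))
R = random_rotation(rng)
# Rotating every vector channel by the same R leaves the descriptor unchanged.
assert np.allclose(invariant_features(V, W), invariant_features(V @ R.T, W @ R.T))
```

The same identity is why the paper can build a global descriptor from such distances and still recognize a place regardless of the sensor's orientation.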
{"title":"VNI-Net: Vector neurons-based rotation-invariant descriptor for LiDAR place recognition","authors":"Gengxuan Tian ,&nbsp;Junqiao Zhao ,&nbsp;Yingfeng Cai ,&nbsp;Fenglin Zhang ,&nbsp;Xufei Wang ,&nbsp;Chen Ye ,&nbsp;Sisi Zlatanova ,&nbsp;Tiantian Feng","doi":"10.1016/j.isprsjprs.2024.09.011","DOIUrl":"10.1016/j.isprsjprs.2024.09.011","url":null,"abstract":"<div><div>Despite the emergence of various LiDAR-based place recognition methods, the challenge of place recognition failure due to rotation remains critical. Existing studies have attempted to address this limitation through specific training strategies involving data augment and rotation-invariant networks. However, augmenting 3D rotations (<span><math><mrow><mi>SO</mi><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></mrow></math></span>) is impractical for the former, while the latter primarily focuses on the reduced problem of 2D rotation (<span><math><mrow><mi>SO</mi><mrow><mo>(</mo><mn>2</mn><mo>)</mo></mrow></mrow></math></span>) invariance. Existing methods targeting <span><math><mrow><mi>SO</mi><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></mrow></math></span> rotation invariance suffer from limitations in discriminative capability. In this paper, we propose a novel approach (VNI-Net) based on the Vector Neurons Network (VNN) to achieve <span><math><mrow><mi>SO</mi><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></mrow></math></span> rotation invariance. Our method begins by extracting rotation-equivariant features from neighboring points and projecting these low-dimensional features into a high-dimensional space using VNN. We then compute both Euclidean and cosine distances in the rotation-equivariant feature space to obtain rotation-invariant features. Finally, we aggregate these features using generalized-mean (GeM) pooling to generate the global descriptor. 
To mitigate the significant information loss associated with formulating rotation-invariant features, we propose computing distances between features at different layers within the Euclidean space neighborhood. This approach significantly enhances the discriminability of the descriptors while maintaining computational efficiency. We conduct experiments across multiple publicly available datasets captured with vehicle-mounted, drone-mounted LiDAR sensors and handheld. VNI-Net outperforms baseline methods by up to 15.3% in datasets with rotation, while achieving comparable results with state-of-the-art place recognition methods in datasets with less rotation. Our code is open-sourced at <span><span>https://github.com/tiev-tongji/VNI-Net</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 506-517"},"PeriodicalIF":10.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142357945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A boundary-aware point clustering approach in Euclidean and embedding spaces for roof plane segmentation
IF 10.6 | Earth Science Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-10-01 | DOI: 10.1016/j.isprsjprs.2024.09.030
Li Li, Qingqing Li, Guozheng Xu, Pengwei Zhou, Jingmin Tu, Jie Li, Mingming Li, Jian Yao
Roof plane segmentation from airborne light detection and ranging (LiDAR) point clouds is an important technology for three-dimensional (3D) building model reconstruction. One of the key issues in plane segmentation is how to design powerful features that can exactly distinguish adjacent planar patches. The quality of point features directly determines the accuracy of roof plane segmentation. Most existing approaches use handcrafted features, such as point-to-plane distance, normal vector, etc., to extract roof planes. However, the discriminative ability of these features is relatively limited, especially in boundary areas. To solve this problem, we propose a boundary-aware point clustering approach in Euclidean and embedding spaces constructed by a multi-task deep network for roof plane segmentation. We design a three-branch multi-task network to predict semantic labels, predict point offsets, and extract deep embedding features. In the first branch, we classify the input data into non-roof, boundary, and plane points. In the second branch, we predict point offsets for shifting each point towards its respective instance center. In the third branch, we constrain points of the same plane instance to have similar embeddings. We aim to ensure that points of the same plane instance are as close as possible in both Euclidean and embedding spaces. However, although a deep network has strong feature representation ability, it is still hard to accurately distinguish points near plane instance boundaries. Therefore, we first robustly group plane points into many clusters in the Euclidean and embedding spaces to find candidate planes. Then, we assign the remaining boundary points to their closest clusters to generate the final complete roof planes. In this way, we can effectively reduce the influence of unreliable boundary points. In addition, to train the network and evaluate the performance of our approach, we prepare a synthetic dataset and two real datasets. The experiments conducted on the synthetic and real datasets show that the proposed approach significantly outperforms existing state-of-the-art approaches in both qualitative evaluation and quantitative metrics. To facilitate future research, we will make the datasets and source code of our approach publicly available at https://github.com/Li-Li-Whu/DeepRoofPlane.
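The two-stage grouping described above — shift predicted plane points by their offsets, cluster the shifted points, then attach the boundary points to their nearest cluster — can be sketched in NumPy with a naive distance-threshold clusterer. This is a toy stand-in for the paper's robust clustering; in the actual pipeline the offsets come from the network's second branch:

```python
import numpy as np

def cluster_shifted_points(points, offsets, radius=0.5):
    """Greedy clustering of offset-shifted points: each shifted point joins the
    nearest existing cluster center within `radius`, otherwise starts a new one."""
    shifted = points + offsets
    centers, labels = [], np.empty(len(shifted), dtype=int)
    for i, p in enumerate(shifted):
        dists = [np.linalg.norm(p - c) for c in centers]
        if dists and min(dists) < radius:
            labels[i] = int(np.argmin(dists))
        else:
            labels[i] = len(centers)
            centers.append(p)
    return labels, np.array(centers)

def assign_boundary_points(boundary, centers):
    # Each boundary point adopts the label of its closest cluster center,
    # which mirrors the final step of attaching unreliable boundary points.
    d = np.linalg.norm(boundary[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(d, axis=1)
```

For two planar patches whose interior points shift onto two distinct instance centers, the clusterer yields two clusters, and nearby boundary points are then absorbed into whichever cluster center they fall closest to.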
{"title":"A boundary-aware point clustering approach in Euclidean and embedding spaces for roof plane segmentation","authors":"Li Li ,&nbsp;Qingqing Li ,&nbsp;Guozheng Xu ,&nbsp;Pengwei Zhou ,&nbsp;Jingmin Tu ,&nbsp;Jie Li ,&nbsp;Mingming Li ,&nbsp;Jian Yao","doi":"10.1016/j.isprsjprs.2024.09.030","DOIUrl":"10.1016/j.isprsjprs.2024.09.030","url":null,"abstract":"<div><div>Roof plane segmentation from airborne light detection and ranging (LiDAR) point clouds is an important technology for three-dimensional (3D) building model reconstruction. One of the key issues of plane segmentation is how to design powerful features that can exactly distinguish adjacent planar patches. The quality of point feature directly determines the accuracy of roof plane segmentation. Most of existing approaches use handcrafted features, such as point-to-plane distance, normal vector, etc., to extract roof planes. However, the abilities of these features are relatively low, especially in boundary areas. To solve this problem, we propose a boundary-aware point clustering approach in Euclidean and embedding spaces constructed by a multi-task deep network for roof plane segmentation. We design a three-branch multi-task network to predict semantic labels, point offsets and extract deep embedding features. In the first branch, we classify the input data as non-roof, boundary and plane points. In the second branch, we predict point offsets for shifting each point towards its respective instance center. In the third branch, we constrain that points of the same plane instance should have the similar embeddings. We aim to ensure that points of the same plane instance are close as much as possible in both Euclidean and embedding spaces. However, although deep network has strong feature representative ability, it is still hard to accurately distinguish points near the plane instance boundary. 
Therefore, we first robustly group plane points into many clusters in Euclidean and embedding spaces to find candidate planes. Then, we assign the rest boundary points to their closest clusters to generate the final complete roof planes. In this way, we can effectively reduce the influence of unreliable boundary points. In addition, to train the network and evaluate the performance of our approach, we prepare a synthetic dataset and two real datasets. The experiments conducted on synthetic and real datasets show that the proposed approach significantly outperforms the existing state-of-the-art approaches in both qualitative evaluation and quantitative metrics. To facilitate future research, we will make datasets and source code of our approach publicly available at <span><span>https://github.com/Li-Li-Whu/DeepRoofPlane</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 518-530"},"PeriodicalIF":10.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142357926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using difference features effectively: A multi-task network for exploring change areas and change moments in time series remote sensing images
IF 10.6 | Earth Science Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-10-01 | DOI: 10.1016/j.isprsjprs.2024.09.029
Jialu Li, Chen Wu
With the rapid advancement of remote sensing Earth observation technology, an abundance of time series multispectral remote sensing images (TSIs) from platforms such as Landsat and Sentinel-2 are now accessible, offering essential data support for time series remote sensing image change detection (TSCD). However, TSCD faces misalignment challenges caused by variations in radiation incidence angles, satellite orbit deviations, and other factors when TSIs are captured at the same geographic location but at different times. Another important issue that needs immediate attention is the precise determination of the change moment for each change area within TSIs. To tackle these challenges, this paper proposes Multi-RLD-Net, a multi-task network that efficiently utilizes difference features to explore change areas and the corresponding change moments in TSIs. To the best of our knowledge, this is the first use of deep learning to identify change moments in TSIs. Multi-RLD-Net integrates optical flow with Long Short-Term Memory (LSTM) to derive differences between TSIs. Initially, a lightweight encoder is introduced to extract multi-scale spatial features, which maximally preserves original features through a siamese structure. Subsequently, the shallow spatial features extracted by the encoder are input into the novel Recursive Optical Flow Difference (ROD) module to align input features and detect differences between them, while the deep spatial features are input into the LSTM to capture long-term temporal dependencies and differences between hidden states. Both branches output differences among TSIs, enhancing the expressive capacity of the model. Finally, the decoder identifies change areas and their corresponding change moments using multi-task branches. Experiments on the UTRNet dataset and the DynamicEarthNet dataset demonstrate that the proposed RLD-Net and Multi-RLD-Net outperform representative approaches, achieving F1 improvements of 1.29% and 10.42% over the state-of-the-art method MC2ABNet. The source code will be available soon at https://github.com/lijialu144/Multi-RLD-Net.
{"title":"Using difference features effectively: A multi-task network for exploring change areas and change moments in time series remote sensing images","authors":"Jialu Li,&nbsp;Chen Wu","doi":"10.1016/j.isprsjprs.2024.09.029","DOIUrl":"10.1016/j.isprsjprs.2024.09.029","url":null,"abstract":"<div><div>With the rapid advancement in remote sensing Earth observation technology, an abundance of Time Series multispectral remote sensing Images (TSIs) from platforms like Landsat and Sentinel-2 are now accessible, offering essential data support for Time Series remote sensing images Change Detection (TSCD). However, TSCD faces misalignment challenges due to variations in radiation incidence angles, satellite orbit deviations, and other factors when capturing TSIs at the same geographic location but different times. Furthermore, another important issue that needs immediate attention is the precise determination of change moments for change areas within TSIs. To tackle these challenges, this paper proposes Multi-RLD-Net, a multi-task network that efficiently utilizes difference features to explore change areas and corresponding change moments in TSIs. To the best of our knowledge, this is the first time that using deep learning for identifying change moments in TSIs. Multi-RLD-Net integrates Optical Flow with Long Short-Term Memory (LSTM) to derive differences between TSIs. Initially, a lightweight encoder is introduced to extract multi-scale spatial features, which maximally preserve original features through a siamese structure. Subsequently, shallow spatial features extracted by the encoder are input into the novel Recursive Optical Flow Difference (ROD) module to align input features and detect differences between them, while deep spatial features extracted by the encoder are input into LSTM to capture long-term temporal dependencies and differences between hidden states. Both branches output differences among TSIs, enhancing the expressive capacity of the model. 
Finally, the decoder identifies change areas and their corresponding change moments using multi-task branches. Experiments on UTRNet dataset and DynamicEarthNet dataset demonstrate that proposed RLD-Net and Multi-RLD-Net outperform representative approaches, achieving F1 value improvements of 1.29% and 10.42% compared to the state-of-the art method MC<sup>2</sup>ABNet. The source code will be available soon at <span><span>https://github.com/lijialu144/Multi-RLD-Net</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 487-505"},"PeriodicalIF":10.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142357944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mangrove mapping in China using Gaussian mixture model with a novel mangrove index (SSMI) derived from optical and SAR imagery
IF 10.6 | Earth Science Tier 1 | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-09-28 | DOI: 10.1016/j.isprsjprs.2024.09.026
Zhaojun Chen, Huaiqing Zhang, Meng Zhang, Yehong Wu, Yang Liu
As an important shoreline vegetation type and highly productive ecosystem, mangroves play an essential role in protecting coastlines and ecological diversity. Accurate mapping of the spatial distribution of mangroves is crucial for the protection and restoration of mangrove ecosystems. Existing mangrove identification and mapping methods have limited feasibility and stability at large scales: supervised classification depends on large sample sets and complex classifiers, while traditional thresholding requires empirical thresholds. Thus, this paper develops a novel mangrove index (the spectral and SAR mangrove index, SSMI) and a Gaussian mixture model (GMM) mangrove mapping method, which requires no training samples and can automatically and accurately map mangrove boundaries using only a single-scene Sentinel-1 image and a single-scene Sentinel-2 image from the same time period. The SSMI capitalizes on the fact that mangroves differ from other land cover types in their optical characteristics (greenness and moisture) and in the backscattering coefficients of SAR images, and it highlights mangrove forest information through the product of three expressions (f(S) = red edge/SWIR1, f(B) = 1/(1 + e^(−VH)), f(W) = (NIR − SWIR1)/(NIR + SWIR1)). The proposed SSMI was tested in six typical mangrove distribution areas in China, where climatic conditions and mangrove species vary widely. The results indicated that the SSMI was more capable of mapping mangrove forests than the other mangrove indices (CMRI, NDMI, MVI, and MI), with overall accuracies (OA) higher than 0.90 and F1 scores as high as 0.93 for the five areas other than the Maowei Gulf (S5). Moreover, the mangrove maps generated by the SSMI were highly consistent with the reference maps (HGMF_2020, LASAC_2018, and IMMA). In addition, the SSMI achieves stable performance, as shown by the mapping results of two other classification methods (K-means and Otsu's algorithm). Mangrove mapping in the six typical mangrove distribution areas in China for five consecutive years (2019–2023) and experiments in three Southeast Asian countries with major mangrove distributions (Thailand, Vietnam, and Indonesia) demonstrate that the SSMI constructed in this paper is highly stable across time and space. The SSMI requires neither reference samples nor predefined parameters; thus, it has great flexibility and applicability for mapping mangroves at large scales, especially in cloudy areas.
{"title":"Mangrove mapping in China using Gaussian mixture model with a novel mangrove index (SSMI) derived from optical and SAR imagery","authors":"Zhaojun Chen ,&nbsp;Huaiqing Zhang ,&nbsp;Meng Zhang ,&nbsp;Yehong Wu ,&nbsp;Yang Liu","doi":"10.1016/j.isprsjprs.2024.09.026","DOIUrl":"10.1016/j.isprsjprs.2024.09.026","url":null,"abstract":"<div><div>As an important shoreline vegetation and highly productive ecosystem, mangroves play an essential role in the protection of coastlines and ecological diversity. Accurate mapping of the spatial distribution of mangroves is crucial for the protection and restoration of mangrove ecosystems. Supervised classification methods rely on large sample sets and complex classifiers and traditional thresholding methods that require empirical thresholds, given the problems that limit the feasibility and stability of existing mangrove identification and mapping methods on large scales. Thus, this paper develops a novel mangrove index (spectral and SAR mangrove index, SSMI) and Gaussian mixture model (GMM) mangrove mapping method, which does not require training samples and can automatically and accurately map mangrove boundaries by utilizing only single-scene Sentinel-1 and single-scene Sentinel-2 images from the same time period. The SSMI capitalizes on the fact that mangroves are differentiated from other land cover types in terms of optical characteristics (greenness and moisture) and backscattering coefficients of SAR images and ultimately highlights mangrove forest information through the product of three expressions (<em>f</em>(<em>S</em>) = red egde/SWIR1, <em>f</em>(<em>B</em>) = 1/(1 + e<sup>-VH</sup>), <em>f</em>(<em>W</em>)=(NIR-SWIR1)/(NIR+SWIR1)). The proposed SSMI was tested in six typical mangrove distribution areas in China where climatic conditions and mangrove species vary widely. 
The results indicated that the SSMI was more capable of mapping mangrove forests than the other mangrove indices (CMRI, NDMI, MVI, and MI), with overall accuracys (OA) higher than 0.90 and F1 scores as high as 0.93 for the other five areas except for the Maowei Gulf (S5). Moreover, the mangrove maps generated by the SSMI were highly consistent with the reference maps (HGMF_2020、LASAC_2018 and IMMA). In addition, the SSMI achieves stable performance, as shown by the mapping results of the other two classification methods (K-means and Otsu’s algorithm). Mangrove mapping in six typical mangrove distribution areas in China for five consecutive years (2019–2023) and experiments in three Southeast Asian countries with major mangrove distributions (Thailand, Vietnam, and Indonesia) demonstrated that the SSMIs constructed in this paper are highly stable across time and space. The SSMI proposed in this paper does not require reference samples or predefined parameters; thus, it has great flexibility and applicability in mapping mangroves on a large scale, especially in cloudy areas.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"218 ","pages":"Pages 466-486"},"PeriodicalIF":10.6,"publicationDate":"2024-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142357925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}