
ISPRS Journal of Photogrammetry and Remote Sensing: Latest Publications

TSG-Seg: Temporal-selective guidance for semi-supervised semantic segmentation of 3D LiDAR point clouds
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-08-08 | DOI: 10.1016/j.isprsjprs.2024.07.020

LiDAR-based semantic scene understanding plays a pivotal role in various applications, including remote sensing and autonomous driving. However, the majority of LiDAR segmentation models rely on extensive, densely annotated training datasets, which are extremely laborious to produce and hinder the widespread adoption of LiDAR systems. Semi-supervised learning (SSL) offers a promising solution by leveraging only a small amount of labeled data and a larger set of unlabeled data, aiming to train robust models with accuracy comparable to fully supervised learning. A typical SSL pipeline first uses the labeled data to train segmentation models and then uses the predictions generated from unlabeled data as pseudo-ground truths for model retraining. However, the scarcity of labeled data constrains the capture of comprehensive representations, limiting the reliability of these pseudo-ground truths. We observed that objects captured by LiDAR sensors from varying perspectives showcase diverse data characteristics due to occlusions and distance variation, and LiDAR segmentation models trained with limited labels prove susceptible to these viewpoint disparities, resulting in inaccurately predicted pseudo-ground truths across viewpoints and the accumulation of retraining errors. To address this problem, we introduce the Temporal-Selective Guided Learning (TSG-Seg) framework. TSG-Seg explores temporal cues inherent in LiDAR frames to bridge cross-viewpoint representations, fostering consistent and robust segmentation predictions across differing viewpoints. Specifically, we first establish point-wise correspondences across LiDAR frames with different time stamps through point registration. Subsequently, reliable point predictions are selected and propagated from adjacent views to points in the current view, serving as strong and refined supervision signals for subsequent model re-training to achieve better segmentation. We conducted extensive experiments on various SSL labeling setups across multiple public datasets, including SemanticKITTI and SemanticPOSS, to evaluate the effectiveness of TSG-Seg. Our results demonstrate its competitive performance and robustness in diverse scenarios, from data-limited to data-abundant settings. Notably, TSG-Seg achieves a mIoU of 48.6% using only 5% of the labeled data and 62.3% with 40% of the labeled data in the sequential split on SemanticKITTI. This consistently outperforms state-of-the-art segmentation methods, including GPC and LaserMix. These findings underscore TSG-Seg’s superior capability and potential for real-world applications. The project can be found at https://tsgseg.github.io.
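The select-and-propagate step described in the abstract can be illustrated with a minimal sketch (it is not the authors' implementation): per-point class probabilities predicted on an already-registered adjacent frame are filtered by a confidence threshold, and the surviving labels are copied to the nearest points of the current frame. The function name, confidence threshold and distance cutoff below are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def propagate_pseudo_labels(src_xyz, src_probs, dst_xyz,
                            conf_thresh=0.9, max_dist=0.2):
    """Keep confident predictions on a registered adjacent frame and copy
    them to nearby points of the current frame (toy version of the
    selection-and-propagation idea, not the TSG-Seg code).
    src_xyz: (N, 3) points of the adjacent frame, already registered into
             the coordinate frame of dst_xyz.
    src_probs: (N, C) per-point class probabilities from the current model.
    dst_xyz: (M, 3) points of the current frame to pseudo-label.
    Returns an (M,) label array, -1 where no pseudo-label was assigned.
    """
    conf = src_probs.max(axis=1)
    labels = src_probs.argmax(axis=1)
    keep = conf >= conf_thresh                     # reliability selection
    pseudo = np.full(dst_xyz.shape[0], -1, dtype=int)
    if not np.any(keep):
        return pseudo
    tree = cKDTree(src_xyz[keep])
    dist, idx = tree.query(dst_xyz, k=1)           # point-wise correspondence
    close = dist <= max_dist                       # reject weak matches
    pseudo[close] = labels[keep][idx[close]]
    return pseudo

# Toy usage with synthetic points
rng = np.random.default_rng(0)
src = rng.normal(size=(1000, 3))
probs = rng.dirichlet(np.ones(5), size=1000)
dst = src + rng.normal(scale=0.05, size=src.shape)
print((propagate_pseudo_labels(src, probs, dst) >= 0).mean())
```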

Citations: 0
Semi-supervised multi-class tree crown delineation using aerial multispectral imagery and lidar data
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-08-08 | DOI: 10.1016/j.isprsjprs.2024.07.032

The segmentation of individual trees based on deep learning is more accurate than conventional methods. However, a sufficient amount of training data is mandatory to leverage the accuracy potential of deep learning-based approaches. Semi-supervised learning techniques, by contrast, can help simplify the time-consuming labelling process. In this study, we introduce a new semi-supervised tree segmentation approach for the precise delineation and classification of individual trees that takes advantage of pre-clustered tree training labels. Specifically, the instance segmentation model Mask R-CNN is combined with the normalized cut clustering method, which is applied to lidar point clouds. The study areas were located in the Bavarian Forest National Park, southeast Germany, where the tree composition includes coniferous, deciduous and mixed forest. Important tree species are European beech (Fagus sylvatica), Norway spruce (Picea abies) and silver fir (Abies alba). Multispectral image data with a ground sample distance of 10 cm and laser scanning data with a point density of approximately 55 points/m² were acquired in June 2017. From the laser scanning data, three-channel images with a resolution of 10 cm were generated. The models were tested in seven reference plots in the national park, with a total of 516 trees measured on the ground. When the color infrared images were used, the experiments demonstrated that the Mask R-CNN models, trained with the tree labels generated through lidar-based clustering, yielded mean F1 scores of 79 % that were up to 18 % higher than those of the normalized cut baseline method, a significant improvement. Similarly, the mean overall accuracy of the classification results for the coniferous, deciduous, and standing deadwood tree groups was 96 %, an improvement of up to 6 % compared with the baseline classification approach. The experiments with lidar-based images yielded slightly worse (1–2 %) results both for segmentation and for classification. Our study demonstrates the utility of this simplified training data preparation procedure, which leads to models trained with significantly larger amounts of data than is feasible with manual labelling. The accuracy improvement of up to 18 % in terms of the F1 score is further evidence of its advantages.
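The label-generation idea, clustering a lidar point cloud into individual crowns that then serve as training labels for Mask R-CNN, can be sketched as follows. Scikit-learn's spectral clustering is used here as a simple stand-in for the normalized cut step; the affinity parameter and assumed number of crowns are illustrative, not values from the paper.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_crowns(xyz, n_trees, gamma=0.5):
    """Group lidar returns of a vegetation segment into crown clusters
    with a graph-cut style clustering (spectral clustering with an RBF
    affinity as a proxy for normalized cut). The resulting labels could
    then be rasterized into instance masks for training."""
    sc = SpectralClustering(n_clusters=n_trees, affinity="rbf",
                            gamma=gamma, assign_labels="discretize",
                            random_state=0)
    return sc.fit_predict(xyz)

# Toy example with two synthetic crowns
rng = np.random.default_rng(1)
crown_a = rng.normal(loc=[0.0, 0.0, 10.0], scale=1.5, size=(300, 3))
crown_b = rng.normal(loc=[8.0, 0.0, 12.0], scale=1.5, size=(300, 3))
labels = cluster_crowns(np.vstack([crown_a, crown_b]), n_trees=2)
print(np.bincount(labels))
```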

Citations: 0
The importance of spatial scale and vegetation complexity in woody species diversity and its relationship with remotely sensed variables
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-08-07 | DOI: 10.1016/j.isprsjprs.2024.07.029

Plant species diversity is key to ecosystem functioning, but in recent decades anthropogenic activities have prompted an alarming decline in this community trait. Thus, developing strategies to understand diversity dynamics based on affordable and efficient remote sensing monitoring is essential, as is examining the relevance of spatial scale and vegetation structural complexity to these dynamics. Here, we used two mathematical approaches to assess the relationship between tropical woody species diversity and spectral diversity in a human-modified landscape in two vegetation types differing in their degree of complexity. Vegetation complexity was measured through the fraction of species that concentrate different proportions of the cumulative importance value index. Species diversity was assessed using Hill numbers at three spatial scales, and metrics of spectral heterogeneity, vegetation indices, as well as raw data from Landsat 9 and Sentinel-2 sensors were calculated and analysed through general linear models (GLM) and Random Forest. Vegetation complexity emerged as an important variable in modelling species diversity from remote sensing metrics, indicating the need to model species diversity by vegetation type rather than region. Hill numbers showed different relationships with remotely sensed metrics, consistent with the scale-dependency of the ecological processes affecting species diversity. Contrary to multiple previous reports, in our study, GLMs produced the best fits between Hill numbers of all orders and remotely sensed metrics. If we are to meet the need for efficient and rapid woody species diversity monitoring globally, we propose modelling this diversity from remotely-sensed variables as an attractive strategy, so long as the intrinsic properties of each vegetation type are acknowledged to avoid under- or overestimation biases.
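Hill numbers, the diversity measure used above, follow one formula for every order q, with the q = 1 case taken as the exponential of Shannon entropy; the short sketch below shows the standard calculation (the toy abundance vector is invented for illustration).

```python
import numpy as np

def hill_number(abundances, q):
    """Hill number of order q for a vector of species abundances:
    D_q = (sum_i p_i**q) ** (1 / (1 - q)).
    q = 0 gives species richness, q -> 1 the exponential of Shannon
    entropy, and q = 2 the inverse Simpson index."""
    p = np.asarray(abundances, dtype=float)
    p = p[p > 0]
    p = p / p.sum()
    if np.isclose(q, 1.0):
        return float(np.exp(-np.sum(p * np.log(p))))
    return float(np.sum(p ** q) ** (1.0 / (1.0 - q)))

community = [50, 30, 10, 5, 3, 2]   # toy abundance counts
for q in (0, 1, 2):
    print(q, round(hill_number(community, q), 3))
```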

Citations: 0
Training-free thick cloud removal for Sentinel-2 imagery using value propagation interpolation
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-08-07 | DOI: 10.1016/j.isprsjprs.2024.07.030

Remote sensing imagery has an ever-increasing impact on important downstream applications, such as vegetation monitoring and climate change modelling. Clouds obscuring parts of the images create a substantial bottleneck in most machine learning tasks that use remote sensing data, and being robust to this issue is an important technical challenge. In many cases, cloudy images cannot be used in a machine learning pipeline, leading to either the removal of the images altogether, or to using suboptimal solutions reliant on recent cloud-free imagery or the availability of pre-trained models for the exact use case. In this work, we propose VPint2, a cloud removal method built upon the VPint algorithm, an easy-to-apply, data-driven spatial interpolation method requiring no prior training. This method leverages previously sensed cloud-free images to represent the spatial structure of a region, which is then used to propagate up-to-date information from non-cloudy pixels to cloudy ones. We also created a benchmark dataset called SEN2-MSI-T, composed of 20 scenes with 5 full-sized images each, belonging to five common land cover classes. We used this dataset to evaluate our method against three alternatives: mosaicking, an AutoML-based regression method, and the nearest similar pixel interpolator. Additionally, we compared against two previously published neural network-based methods on SEN2-MSI-T, and evaluated our method on a subset of the popular SEN12MS-CR-TS benchmark dataset. The methods are compared using several performance metrics, including the structural similarity index, mean absolute error, and error rates on a downstream NDVI derivation task. Our experimental results show that VPint2 performed significantly better than competing methods over 20 experimental conditions, improving performance by 2.4% to 34.3% depending on the condition. We also found that the performance of VPint2 only decreases marginally as the temporal distance of its reference image increases, and that, unlike typical interpolation methods, the performance of VPint2 remains strong for larger percentages of cloud cover. Our findings furthermore support a cloud removal evaluation approach founded on the transfer of cloud masks, rather than on the use of cloud-free previous acquisitions as ground truth.
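A minimal sketch of the general idea of reference-guided value propagation is given below; it is a strong simplification written for illustration (wrap-around neighbourhoods, naive initialisation), not the VPint2 algorithm itself, and all names and parameters are assumptions.

```python
import numpy as np

def propagate_fill(target, cloud_mask, reference, n_iter=200):
    """Fill cloudy pixels by repeatedly averaging their 4-neighbours,
    each neighbour value offset by the local contrast of a cloud-free
    reference image; clear pixels stay fixed. Toy illustration only.
    target: (H, W) observed band, values under cloud_mask are ignored.
    cloud_mask: (H, W) bool, True where the observation is cloudy.
    reference: (H, W) earlier cloud-free acquisition of the same band.
    """
    target = target.astype(float)
    reference = reference.astype(float)
    est = target.copy()
    est[cloud_mask] = target[~cloud_mask].mean()        # crude start value
    for _ in range(n_iter):
        neighbour_est = []
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = np.roll(est, (dy, dx), axis=(0, 1))
            rb = np.roll(reference, (dy, dx), axis=(0, 1))
            neighbour_est.append(nb + (reference - rb))  # reference-guided shift
        update = np.mean(neighbour_est, axis=0)
        est[cloud_mask] = update[cloud_mask]             # clear pixels unchanged
    return est

# Toy usage: reconstruct a synthetic 20x20 cloud hole
rng = np.random.default_rng(2)
ref = np.cumsum(rng.normal(size=(64, 64)), axis=1)
obs = ref + 0.1
mask = np.zeros_like(obs, dtype=bool)
mask[20:40, 20:40] = True
filled = propagate_fill(np.where(mask, 0.0, obs), mask, ref)
print(float(np.abs(filled[mask] - obs[mask]).mean()))
```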

Citations: 0
Beyond clouds: Seamless flood mapping using Harmonized Landsat and Sentinel-2 time series imagery and water occurrence data
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-08-07 | DOI: 10.1016/j.isprsjprs.2024.07.022

Floods are among the most devastating natural disasters, posing significant risks to life, property, and infrastructure globally. Earth observation satellites provide data for continuous and extensive flood monitoring, yet limitations exist in the spatial completeness of monitoring using optical images due to cloud cover. Recent studies have developed gap-filling methods for reconstructing cloud-covered areas in water maps. However, these methods are not tailored to, or validated in, cloudy and rainy flooding scenarios with rapid water extent changes and limited clear-sky observations, leaving room for further improvements. This study investigated and developed a novel reconstruction method for time series flood extent mapping, supporting spatially seamless monitoring of flood extents. The proposed method first identified surface water from time series images using a fine-tuned large foundation model. Then, the cloud-covered areas in the water maps were reconstructed, adhering to the introduced submaximal stability assumption, on the basis of the prior water occurrence data in the Global Surface Water dataset. The reconstructed time series water maps were refined through spatiotemporal Markov random field modeling for the final delineation of flooding areas. The effectiveness of the proposed method was evaluated with Harmonized Landsat and Sentinel-2 datasets under varying cloud cover conditions, enabling seamless flood mapping at 2–3-day frequency and 30 m resolution. Experiments at four global sites confirmed the superiority of the proposed method. It achieved higher reconstruction accuracy with average F1-scores of 0.931 during floods and 0.903 before/after floods, outperforming the typical gap-filling method with average F1-scores of 0.871 and 0.772, respectively. Additionally, the maximum flood extent maps and flood duration maps, which were composed on the basis of the reconstructed water maps, were more accurate than those using the original cloud-contaminated water maps. The benefits of synthetic aperture radar images (e.g., Sentinel-1) for enhancing flood mapping under cloud cover conditions were also discussed. The method proposed in this paper provides an effective way for flood monitoring in cloudy and rainy scenarios, supporting emergency response and disaster management. The code and datasets used in this study have been made available online (https://github.com/dr-lizhiwei/SeamlessFloodMapper).
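One possible reading of the occurrence-guided reconstruction step is sketched below: within a scene, a cloud-covered pixel is assumed to be flooded if its long-term water occurrence is at least as high as the lowest occurrence found among the clear-sky pixels classified as water. This is an editorial simplification for illustration, not the paper's exact algorithm, and the function and variable names are invented.

```python
import numpy as np

def fill_cloudy_water(water, cloud_mask, occurrence):
    """Gap-fill a single-scene water map under clouds using prior water
    occurrence (e.g., from the Global Surface Water dataset).
    water: (H, W) bool water map valid only where the sky was clear.
    cloud_mask: (H, W) bool, True where clouds block the observation.
    occurrence: (H, W) float, long-term water occurrence in percent.
    """
    clear_water = water & ~cloud_mask
    if not clear_water.any():
        return water.copy()
    threshold = occurrence[clear_water].min()     # wet pixels seen this scene
    filled = water.copy()
    filled[cloud_mask] = occurrence[cloud_mask] >= threshold
    return filled

# Toy 2x2 scene: the cloudy pixel has a higher occurrence than the
# observed water pixel, so it is labelled as water as well.
occ = np.array([[80.0, 40.0], [10.0, 95.0]])
water = np.array([[True, False], [False, False]])
clouds = np.array([[False, False], [False, True]])
print(fill_cloudy_water(water, clouds, occ))
```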

Citations: 0
Temporal-spectral-semantic-aware convolutional transformer network for multi-class tidal wetland change detection in Greater Bay Area
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-08-06 | DOI: 10.1016/j.isprsjprs.2024.07.024

Coastal tidal wetlands are crucial for environmental and economic health, but face threats from various environmental changes. Detecting changes in tidal wetlands is essential for promoting sustainable development in coastal areas. Despite extensive research on tidal wetland changes, persistent challenges remain. Firstly, the high similarity among tidal wetland types hinders the effectiveness of existing common indices. Secondly, many current methods, relying on hand-crafted features, are time-consuming and subject to personal biases. Thirdly, few studies effectively integrate multi-temporal and semantic information, leading to misinterpretations from environmental noise and tidal variations. In view of the abovementioned issues, we proposed a novel temporal-spectral-semantic-aware convolutional transformer network (TSSA-CTNet) for multi-class tidal wetland change detection. Firstly, to address spectral similarity among different tidal wetlands, we proposed a sparse second order feature construction (SSFC) module to construct more separable spectral representations. Secondly, to obtain more separable features automatically, we constructed temporal-spatial feature extractor (TSFE) and siamese semantic sharing (SiamSS) blocks to extract temporal-spatial-semantic features. Thirdly, to fully utilize semantic information, we proposed a center comparative label smoothing (CCLS) module to generate semantic-aware labels. Experiments in the Greater Bay Area, using Landsat data from 2000 to 2019, demonstrated that TSSA-CTNet achieved 89.20% overall accuracy, outperforming other methods by 3.75%–16.39%. The study revealed significant area losses in tidal flats, mangroves, and tidal marshes, which decreased by 3148 hectares, 35 hectares, and 240 hectares, respectively. Among the cities in the GBA, Zhuhai shows the most significant area loss with a total of 1626 hectares. TSSA-CTNet proves effective for multi-class tidal wetland change detection, offering valuable insights for tidal wetland protection.
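The idea of training with softened, semantic-aware labels can be illustrated with conventional label smoothing, shown below; the CCLS module in the paper goes further by comparing class centers to distribute the off-target probability mass, which this sketch does not reproduce, and the epsilon value is an illustrative assumption.

```python
import numpy as np

def smooth_labels(labels, n_classes, epsilon=0.1):
    """Conventional label smoothing: the true class keeps probability
    1 - epsilon and the remaining mass is spread evenly over the other
    classes. Returns an (N, n_classes) array of soft label vectors."""
    soft = np.full((labels.size, n_classes), epsilon / (n_classes - 1))
    soft[np.arange(labels.size), labels] = 1.0 - epsilon
    return soft

# Two samples of classes 0 and 2 out of 4 tidal wetland classes
print(smooth_labels(np.array([0, 2]), n_classes=4, epsilon=0.1))
```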

Citations: 0
Modeling the top-of-atmosphere radiance of alpine snow with topographic effects explicitly solved
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-08-03 | DOI: 10.1016/j.isprsjprs.2024.07.017

Optical remote sensing of snow is challenged by the complex radiative transfer mechanism in alpine environments. The representation of topographic effects in interpreting satellite imagery of snow is still limited to inadequate analytical modelization. Here we develop a framework that explicitly solves multiple terrain reflections and generates the top-of-atmosphere (TOA) radiance of alpine snow by the modified four-stream radiative transfer theory. This framework comprises an atmosphere module, a terrain module and a surface spectra module relying on the approximate asymptotic radiative transfer (ART) model. In the terrain module, the iterative solution to multiple terrain reflections is facilitated with a viewshed calculating algorithm which identifies adjacent slopes and related geometric angles to derive terrain-reflected irradiance. The modeled TOA radiance is compared with Landsat-8/9 OLI, Sentinel-2A/B MSI and Terra MODIS radiance imagery. Experiments of several snow-covered mountainous regions in the Pamir area reveal that the TOA radiance modeling results agree well with satellite observations with reported R² ≥ 0.86, though subject to the uncertainties due to complex topography and seasonality. The modeled terrain-reflected irradiance is verified with the ray-tracing software called LargE-Scale Remote Sensing Data and Image Simulation Framework (LESS), and reliable modeling performance is confirmed as R² values are ≥ 0.90. This model framework allows for better interpreting the apparent spectra of alpine snow through the physically-based linkage with snow’s intrinsic properties and environmental conditions.
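The iterative solution to multiple terrain reflections mentioned above can be written as a simple fixed-point loop: the irradiance on each slope facet is its direct plus diffuse component plus the radiation reflected from the facets it can see. The sketch below assumes the terrain configuration (view) factors are already known; deriving them from a DEM viewshed is outside its scope, and all names and numbers are illustrative.

```python
import numpy as np

def terrain_irradiance(e_direct, e_diffuse, albedo, view_factor, n_iter=20):
    """Iterate E = E_direct + E_diffuse + F @ (albedo * E) to a fixed point.
    e_direct, e_diffuse, albedo: (N,) per-facet arrays.
    view_factor: (N, N) matrix; view_factor[i, j] is the fraction of the
    radiation leaving facet j that reaches facet i (zero if not visible).
    """
    e_total = e_direct + e_diffuse
    for _ in range(n_iter):
        reflected = view_factor @ (albedo * e_total)   # multiple reflections
        e_total = e_direct + e_diffuse + reflected
    return e_total

# Two facing snow slopes with toy irradiance values (W/m^2)
vf = np.array([[0.0, 0.2],
               [0.2, 0.0]])
print(terrain_irradiance(np.array([600.0, 100.0]),
                         np.array([80.0, 80.0]),
                         np.array([0.8, 0.8]), vf))
```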

Citations: 0
CropSight: Towards a large-scale operational framework for object-based crop type ground truth retrieval using street view and PlanetScope satellite imagery
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-08-01 | DOI: 10.1016/j.isprsjprs.2024.07.025

Crop type maps are essential in informing agricultural policy decisions by providing crucial data on the specific crops cultivated in given regions. The generation of crop type maps usually involves the collection of ground truth data of various crop species, which can be challenging at large scales. As an alternative to conventional field observations, street view images offer a valuable and extensive resource for gathering large-scale crop type ground truth through imaging the crops cultivated in the roadside agricultural fields. Yet our ability to systematically retrieve crop type labels at large scales from street view images in an operational fashion is still limited. The crop type retrieval is usually at the pixel level with uncertainty seldom considered. In our study, we develop a novel deep learning-based CropSight modeling framework to retrieve the object-based crop type ground truth by synthesizing Google Street View (GSV) and PlanetScope satellite images. CropSight comprises three key components: (1) A large-scale operational cropland field-view imagery collection method is devised to systematically acquire representative geotagged cropland field-view images of various crop types across regions in an operational manner; (2) UncertainFusionNet, a novel Bayesian convolutional neural network, is developed to retrieve high-quality crop type labels from collected field-view images with uncertainty quantified; (3) The Segment Anything Model (SAM) is fine-tuned and employed to delineate the cropland boundary tailored to each collected field-view image with its coordinate as the point prompt using the PlanetScope satellite imagery. With four agriculture-dominated regions in the US as study areas, CropSight consistently shows high accuracy in retrieving crop type labels of multiple dominant crop species (overall accuracy around 97 %) and in delineating corresponding cropland boundaries (F1 score around 92 %). UncertainFusionNet outperforms the benchmark models (i.e., ResNet-50 and Vision Transformer) for crop type image classification, showing an improvement in overall accuracy of 2–8 %. The fine-tuned SAM surpasses the performance of Mask-RCNN and the base SAM in cropland boundary delineation, achieving a 4–12 % increase in F1 score. The further comparison with the benchmark crop type product (i.e., cropland data layer (CDL)) indicates that CropSight is a promising alternative to crop type mapping products for providing high-quality, object-based crop type ground truth of diverse crop species at large scales. CropSight holds considerable promise to extrapolate over space and time for operationalizing large-scale object-based crop type ground truth retrieval in a near-real-time manner.
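UncertainFusionNet is described as a Bayesian CNN that attaches an uncertainty estimate to each retrieved label. A common, generic way to obtain such estimates is Monte Carlo dropout, sketched below with a deliberately tiny placeholder network; this is an illustration of the general technique, not the paper's architecture, and all layer sizes and the sample count are assumptions.

```python
import torch
import torch.nn as nn

class TinyCropClassifier(nn.Module):
    """Placeholder CNN; only the dropout layer matters for the sketch."""
    def __init__(self, n_classes=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Dropout2d(0.3),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, n_classes))

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=30):
    """Keep dropout active at inference time, run several stochastic
    forward passes, and report the mean softmax output together with its
    spread as a simple per-class uncertainty estimate."""
    model.train()                      # enables dropout during inference
    probs = torch.stack([torch.softmax(model(x), dim=1)
                         for _ in range(n_samples)])
    return probs.mean(0), probs.std(0)

model = TinyCropClassifier()
mean_p, std_p = mc_dropout_predict(model, torch.randn(2, 3, 64, 64))
print(mean_p.argmax(1), std_p.max(1).values)
```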

Citations: 0
A framework for fully automated reconstruction of semantic building model at urban-scale using textured LoD2 data
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-08-01 | DOI: 10.1016/j.isprsjprs.2024.07.019

The CityGML Level of Detail 3 (LoD3), a widely adopted standard for three-dimensional (3D) city modeling, has been accessible for an extended period. However, its comprehensive implementation remains limited due to challenges such as insufficient automation and inconsistent data quality. This research introduces an innovative and fully automated framework aimed at urban-scale semantic building model reconstruction. The proposed framework addresses three critical challenges: (1) proposing a facade layout graph model to formalize the geometry and topological relationships of semantic entities on building facades, thereby promoting the deduction of structural completeness and the reconstruction of semantic facade models; (2) establishing a mapping relationship between texture images, semantic entities, and building shells guided by the facade layout graph to ensure consistent correlations among the geometry, semantics, and topology of building models; (3) developing an efficient representation methodology for semantic building models utilizing a parameter set derived from the facade layout graph. The proposed framework has been successfully validated by reconstructing 8,681 buildings from three different locations in Berlin. The results demonstrate an outstanding reconstruction accuracy of 91%, with a time efficiency of only 3.42 s per building. Visual analysis further confirms that the framework effectively fulfills the application prerequisites of 3D GIS. The code of the proposed framework is available in the repository: https://github.com/wangyuefeng2017/LoD3Framework-.
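A facade layout graph of the kind described above can be represented as an attributed graph: nodes carry the semantic class and geometry of facade entities, and edges carry topological relations such as containment or alignment, which the framework can then exploit for completeness checks. The sketch below uses networkx; the attribute names and the consistency check are illustrative assumptions, not the paper's schema.

```python
import networkx as nx

# Attributed graph: nodes are facade entities, edges are relations.
facade = nx.DiGraph()
facade.add_node("wall_0", kind="WallSurface")
facade.add_node("window_0", kind="Window", rect=(1.0, 1.2, 0.9, 1.4))  # x, y, w, h
facade.add_node("window_1", kind="Window", rect=(3.0, 1.2, 0.9, 1.4))
facade.add_node("door_0", kind="Door", rect=(5.0, 0.0, 1.0, 2.2))

facade.add_edge("wall_0", "window_0", relation="contains")
facade.add_edge("wall_0", "window_1", relation="contains")
facade.add_edge("wall_0", "door_0", relation="contains")
facade.add_edge("window_0", "window_1", relation="left_of")  # same window row

# Example of a completeness/consistency check the layout graph enables:
# windows forming one row should share the same sill height (rect[1]).
windows = [n for n, d in facade.nodes(data=True) if d.get("kind") == "Window"]
sill_heights = {facade.nodes[n]["rect"][1] for n in windows}
print("window row is consistent:", len(sill_heights) == 1)
```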

Citations: 0
Catadioptric omnidirectional thermal odometry in dynamic environment
IF 10.6 1区 地球科学 Q1 GEOGRAPHY, PHYSICAL Pub Date : 2024-07-31 DOI: 10.1016/j.isprsjprs.2024.07.021

This paper presents a catadioptric omnidirectional thermal odometry (COTO) system that estimates the six-degree-of-freedom (DoF) pose of a camera using only omnidirectional thermal images in visually degraded, fast-motion, and dynamic environments. First, we design and fabricate a central hyperbolic catadioptric omnidirectional thermal camera that captures surrounding thermal images with a 360° horizontal field of view (FoV), and improve the omnidirectional camera model and calibration method to obtain high-precision camera intrinsic parameters. Second, we propose an epipolar curve constraint, combined with omnidirectional thermal object detection, to significantly reduce the interference of moving objects on pose estimation. Third, the implemented COTO pipeline consists of photometric calibration, dynamic region removal, tracking, and mapping, overcoming the photometric inconsistency and large distortion of omnidirectional thermal images. Experiments were conducted on a total of 17 sequences across Lab, Outdoor, and Driving scenarios, amounting to more than 60,000 omnidirectional thermal images of real environments. The experimental results indicate that the proposed COTO system offers excellent localization accuracy and greater robustness than current state-of-the-art methods. The average localization accuracy, measured by the absolute trajectory error (ATE), is within 15 cm of the ground truth in both the Lab and Outdoor sequences. In addition, COTO was the only system to track completely and successfully in all sequences. The system can serve as an innovative localization solution, particularly in challenging environments with changing ambient light, rapid vehicle motion, and moving-object interference, conditions that are difficult for visual odometry to handle.
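The abstract says the omnidirectional camera model is improved but does not reproduce it. For a flavour of how a 3D point maps onto such an image, the NumPy sketch below uses the widely known unified central catadioptric (sphere) model with purely placeholder intrinsics; it is a generic illustration, not the calibration formulation proposed in the paper.

    # Generic unified central catadioptric projection (sphere model); placeholder values only.
    import numpy as np

    def unified_catadioptric_project(X: np.ndarray, xi: float, K: np.ndarray) -> np.ndarray:
        """Project 3D points (N, 3), given in the mirror frame, to pixel coordinates (N, 2).

        xi : mirror parameter (0 < xi < 1 for a hyperbolic mirror).
        K  : 3x3 matrix of generalised focal lengths and principal point.
        """
        Xs = X / np.linalg.norm(X, axis=1, keepdims=True)   # 1. project onto the unit viewing sphere
        m = Xs[:, :2] / (Xs[:, 2:3] + xi)                   # 2. normalised plane: m = (x/(z+xi), y/(z+xi))
        m_h = np.hstack([m, np.ones((len(m), 1))])          # homogeneous normalised coordinates
        return (K @ m_h.T).T[:, :2]                         # 3. apply intrinsics

    K = np.array([[400.0, 0.0, 320.0],                      # purely illustrative intrinsics
                  [0.0, 400.0, 240.0],
                  [0.0,   0.0,   1.0]])
    pts = np.array([[1.0, 0.5, 2.0], [-2.0, 1.0, 0.5]])
    print(unified_catadioptric_project(pts, xi=0.9, K=K))

In this generic model (lens distortion omitted), the hyperbolic mirror enters through xi in step 2; calibration then amounts to estimating xi, K, and distortion terms, which corresponds to the "high-precision camera intrinsic parameters" step mentioned above.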

本文介绍了一种折反射全向热里程计(COTO)系统,该系统可在视觉退化、快速运动和动态环境中,仅利用全向热图像估算相机的六自由度(DoF)姿态。首先,我们设计并制造了一种中心双曲面折反射全向热像仪,可捕捉周围具有 360° 水平视场(FoV)的热图像,并改进了全向相机模型和标定方法,从而获得高精度的相机内参。其次,我们提出了结合全向热目标检测的对极曲线约束,以显著降低运动物体对姿态估计的干扰。第三,实现的 COTO 流水线包括光度标定、动态区域移除、跟踪和建图,以克服全向热图像中光度不一致和大畸变的缺点。我们在实验室、室外和驾驶共 17 个序列上进行了实验,涉及 60,000 多张真实环境的全向热图像。实验结果表明,与目前最先进的方法相比,所提出的 COTO 系统具有出色的定位精度和更强的鲁棒性。在实验室和室外序列中,以绝对轨迹误差(ATE)衡量的平均定位精度与真值的偏差均小于 15 厘米。此外,COTO 是唯一在所有序列中均完整且成功完成跟踪的系统。该系统可作为一种创新的定位解决方案,特别适用于环境光变化、车辆快速运动和运动物体干扰等视觉里程计难以应对的挑战性环境。
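The accuracy figures above are quoted as absolute trajectory error (ATE). ATE is conventionally computed as the RMSE of position differences after rigidly aligning the estimated trajectory with ground truth; the NumPy sketch below shows that standard computation (Kabsch/Umeyama alignment without scale). The function name is a placeholder and this is not the authors' evaluation code.

    # Standard ATE computation: rigid alignment followed by translational RMSE.
    import numpy as np

    def absolute_trajectory_error(est_xyz: np.ndarray, gt_xyz: np.ndarray) -> float:
        """RMSE of position error after rigid (rotation + translation) alignment.

        est_xyz, gt_xyz : (N, 3) arrays of time-synchronised camera positions.
        """
        mu_e, mu_g = est_xyz.mean(axis=0), gt_xyz.mean(axis=0)
        X, Y = est_xyz - mu_e, gt_xyz - mu_g
        U, _, Vt = np.linalg.svd(X.T @ Y)                       # Kabsch/Umeyama, no scale
        d = np.sign(np.linalg.det(Vt.T @ U.T))                  # guard against reflections
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T                 # best-fit rotation
        aligned = est_xyz @ R.T + (mu_g - R @ mu_e)             # apply rotation + translation
        return float(np.sqrt(np.mean(np.sum((aligned - gt_xyz) ** 2, axis=1))))

    rng = np.random.default_rng(0)                              # toy trajectories for illustration
    gt = np.cumsum(rng.normal(size=(100, 3)), axis=0)
    est = gt + rng.normal(scale=0.05, size=gt.shape)
    print(f"ATE: {absolute_trajectory_error(est, gt):.3f} m")

An ATE below 15 cm, as reported for the Lab and Outdoor sequences, therefore means that the aligned camera positions deviate from ground truth by less than 15 cm in the root-mean-square sense.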
{"title":"Catadioptric omnidirectional thermal odometry in dynamic environment","authors":"","doi":"10.1016/j.isprsjprs.2024.07.021","DOIUrl":"10.1016/j.isprsjprs.2024.07.021","url":null,"abstract":"<div><p>This paper presents a catadioptric omnidirectional thermal odometry (COTO) system that estimates the six degrees of freedom (DoF) pose of a camera using only omnidirectional thermal images in visually degraded, fast-motion, and dynamic environments. First, we design and fabricate a central hyperbolic catadioptric omnidirectional thermal camera that captures surrounding thermal images with <span><math><mrow><mn>360</mn><mo>°</mo></mrow></math></span> horizontal field of view (FoV), and improve the omnidirectional camera model and calibration method to obtain high-precision camera intrinsic parameter. Second, we propose the epipolar curve constraint combining with omnidirectional thermal object detection to significantly reduce the interference of moving objects on pose estimation. Third, the implemented COTO pipeline consists of photometric calibration, dynamic region removal, tracking and mapping to overcome the drawbacks of photometric inconsistency and large distortion in omnidirectional thermal images. Experiments have been conducted on a total of 17 sequences of Lab, Outdoor and Driving, amounting to more than 60,000 omnidirectional thermal images of real environments. The experimental results indicate that the proposed COTO system has excellent localization accuracy and unparalleled robustness over the current state-of-the-art methods. The average localization accuracy measured by the absolute trajectory error (ATE) is less than 15 cm from the ground truth in both Lab and Outdoor sequences. In addition, COTO was the only system with complete and successful tracking in all sequences. The system can be used as an innovative localization solution, particularly in challenging environments with changes in ambient light, rapid vehicle motion, and moving object interference, which can be a difficult problem for visual odometry to solve.</p></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":10.6,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141862564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0