
ISPRS Journal of Photogrammetry and Remote Sensing: Latest Publications

ACMatch: Improving context capture for two-view correspondence learning via adaptive convolution
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-11-16 | DOI: 10.1016/j.isprsjprs.2024.11.004
Xiang Fang, Yifan Lu, Shihua Zhang, Yining Xie, Jiayi Ma
Two-view correspondence learning plays a pivotal role in the field of computer vision. However, this task is beset with great challenges stemming from the significant imbalance between true and false correspondences. Recent approaches have started leveraging the inherent filtering properties of convolution to eliminate false matches. Nevertheless, these methods tend to apply convolution in an ad hoc manner without careful design, thereby inheriting the limitations of convolution and hindering performance improvement. In this paper, we propose a novel convolution-based method called ACMatch, which aims to meticulously design convolutional filters to mitigate the shortcomings of convolution and enhance its effectiveness. Specifically, to address the limitation that existing convolutional filters struggle to effectively capture global information due to their limited receptive field, we introduce a strategy that helps them obtain relatively global information by guiding grid points to incorporate more contextual information, thus enabling a global perspective for two-view learning. Furthermore, we recognize that in the context of feature matching, inliers and outliers provide fundamentally different information. Hence, we design an adaptive weighted convolution module that allows the filters to focus more on inliers while ignoring outliers. Extensive experiments across various visual tasks demonstrate the effectiveness, superiority, and generalization ability of ACMatch. Notably, ACMatch attains an AUC@5° of 35.93% on YFCC100M without RANSAC, surpassing the previous state of the art by 5.85 absolute percentage points and exceeding the 35% AUC@5° bar for the first time. Our code is publicly available at https://github.com/ShineFox/ACMatch.
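The adaptive weighting idea is simple to sketch: a learned per-correspondence score down-weights likely outliers before spatial aggregation. Below is a minimal PyTorch illustration of that mechanism only; the module name, layer sizes, and scoring head are our assumptions, not the published ACMatch architecture.

```python
import torch
import torch.nn as nn

class AdaptiveWeightedConv(nn.Module):
    """Toy stand-in for an adaptive weighted convolution over correspondences."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        # Hypothetical scoring head: one inlier logit per correspondence.
        self.score_head = nn.Conv1d(channels, 1, kernel_size=1)
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, N) features of N putative correspondences.
        w = torch.sigmoid(self.score_head(feats))  # (B, 1, N), soft inlier weight
        # Likely outliers contribute little to the aggregated context.
        return self.conv(feats * w)

feats = torch.randn(2, 128, 1000)
print(AdaptiveWeightedConv(128)(feats).shape)  # torch.Size([2, 128, 1000])
```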
Citations: 0
A universal adapter in segmentation models for transferable landslide mapping
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-11-15 | DOI: 10.1016/j.isprsjprs.2024.11.006
Ruilong Wei, Yamei Li, Yao Li, Bo Zhang, Jiao Wang, Chunhao Wu, Shunyu Yao, Chengming Ye
Efficient landslide mapping is crucial for disaster mitigation and relief. Recently, deep learning methods have shown promising results in landslide mapping using satellite imagery. However, the sample sparsity and geographic diversity of landslides have challenged the transferability of deep learning models. In this paper, we propose a universal adapter module that can be seamlessly embedded into existing segmentation models for transferable landslide mapping. The adapter can achieve high-accuracy cross-regional landslide segmentation with a small sample set, requiring minimal parameter adjustments. In detail, the pre-trained baseline model freezes its parameters to retain the learned knowledge of the source domain, while the lightweight adapter fine-tunes only a few parameters to learn new landslide features of the target domain. Structurally, we introduced an attention mechanism to enhance the feature extraction of the adapter. To validate the proposed adapter module, 4321 landslide samples were prepared, and the Segment Anything Model (SAM) and other baseline models, along with four transfer strategies, were selected for controlled experiments. In addition, Sentinel-2 satellite imagery of the Himalayas and Hengduan Mountains, located on the southern and southeastern edges of the Tibetan Plateau, was collected for evaluation. The controlled experiments showed that SAM, when combined with our adapter module, achieved a peak mean Intersection over Union (mIoU) of 82.3%. For other baseline models, integrating the adapter improved mIoU by 2.6% to 12.9% compared with traditional strategies on cross-regional landslide mapping. In particular, baseline models with Transformers are more suitable for fine-tuning parameters. Furthermore, the visualized feature maps revealed that fine-tuning shallow encoders achieves better effects in model transfer. Moreover, the proposed adapter can effectively extract landslide features and focus on specific spatial and channel domains with significant features. We also quantified the spectral, scale, and shape features of landslides and analyzed their impacts on segmentation results. Our analysis indicated that weak spectral differences, as well as extreme scales and edge shapes, are detrimental to the accuracy of landslide segmentation. Overall, this adapter module provides a new perspective for large-scale transferable landslide mapping.
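A rough sketch of the freeze-and-adapt recipe described above follows; the adapter design here (a bottleneck with a channel-attention gate) is our own illustrative choice, not the paper's exact module.

```python
import torch
import torch.nn as nn

class LightweightAdapter(nn.Module):
    """Bottleneck adapter with a channel-attention gate (illustrative design)."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.down = nn.Conv2d(channels, channels // reduction, 1)
        self.up = nn.Conv2d(channels // reduction, channels, 1)
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(channels, channels, 1),
                                  nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = self.up(torch.relu(self.down(x)))
        return x + residual * self.gate(x)  # attention-weighted residual update

def freeze_backbone_train_adapter(backbone: nn.Module, adapter: nn.Module) -> None:
    """Freeze source-domain knowledge; fine-tune only the adapter's few parameters."""
    for p in backbone.parameters():
        p.requires_grad = False
    for p in adapter.parameters():
        p.requires_grad = True

x = torch.randn(1, 64, 32, 32)
print(LightweightAdapter(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```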
Citations: 0
Contrastive learning for real SAR image despeckling
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-11-15 | DOI: 10.1016/j.isprsjprs.2024.11.003
Yangtian Fang, Rui Liu, Yini Peng, Jianjun Guan, Duidui Li, Xin Tian
The use of synthetic aperture radar (SAR) has greatly improved our ability to capture high-resolution terrestrial images under various weather conditions. However, SAR imagery is affected by speckle noise, which distorts image details and hampers subsequent applications. Recent forays into supervised deep learning-based denoising methods, like MRDDANet and SAR-CAM, offer a promising avenue for SAR despeckling. However, they are impeded by the domain gaps between synthetic data and realistic SAR images. To tackle this problem, we introduce a self-supervised speckle-aware network that utilizes the limited near-real datasets and unlimited synthetic datasets simultaneously, which boosts the performance of the downstream despeckling module by teaching the module to discriminate the domain gaps of different datasets in the embedding space. Specifically, based on contrastive learning, the speckle-aware network first characterizes the discriminative representations of spatially correlated speckle noise in different images across diverse datasets, which provides priors on versatile speckle and image characteristics. Then, the representations are effectively modulated into a subsequent multi-scale despeckling network to generate authentic despeckled images. In this way, the despeckling module can reconstruct reliable SAR image characteristics by learning from near-real datasets, while the generalization performance is guaranteed by simultaneously learning abundant patterns from synthetic datasets. Additionally, a novel excitation aggregation pooling module is inserted into the despeckling network to enhance it further; this module utilizes features from different scales and better preserves spatial details around strong scatterers in real SAR images. Extensive experiments on real SAR datasets from the Sentinel-1, Capella-X, and TerraSAR-X satellites are carried out to verify the effectiveness of the proposed method over other state-of-the-art methods. Specifically, the proposed method achieves the best PSNR and SSIM values evaluated on the near-real Sentinel-1 dataset, with gains of 0.22 dB in PSNR compared to MRDDANet, and improvements of 1.3% in SSIM over SAR-CAM. The code is available at https://github.com/YangtianFang2002/CL-SAR-Despeckling.
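For readers unfamiliar with the contrastive ingredient, a minimal InfoNCE-style loss is sketched below. The pairing convention (two views of the same patch as the positive pair) and the temperature value are assumptions, not the paper's exact training setup.

```python
import torch
import torch.nn.functional as F

def infonce_loss(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """z1, z2: (N, D) embeddings of two views of the same N speckled patches."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                         # (N, N) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)             # diagonal pairs are positives

z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
print(infonce_loss(z1, z2).item())
```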
Citations: 0
MIWC: A multi-temporal image weighted composition method for satellite-derived bathymetry in shallow waters
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-11-15 | DOI: 10.1016/j.isprsjprs.2024.10.009
Zhixin Duan, Liang Cheng, Qingzhou Mao, Yueting Song, Xiao Zhou, Manchun Li, Jianya Gong
Satellite-derived bathymetry (SDB) is a vital technique for the rapid and cost-effective measurement of shallow underwater terrain. However, it faces challenges from image noise, including clouds, bubble clouds, and sun glint. Consequently, acquiring complete and accurate bathymetric maps is frequently challenging, particularly in cloudy, rainy, and large-scale regions. In this study, we propose a multi-temporal image weighted composition (MIWC) method. This method performs iterative segmentation and inverse distance weighted composition of multi-temporal images based only on the near-infrared (NIR) band information of multispectral images to obtain high-quality composite images. The method was applied to scenarios using Sentinel-2 imagery for bathymetry of four representative areas located in the South China Sea and the Atlantic Ocean. The results show that the root mean square error (RMSE) of bathymetry from the composite images in the water depth range of 0–20 m is 0.67–1.22 m using the log-transformed linear model (LLM) and 0.71–1.23 m using the log-transformed ratio model (LRM). The RMSE of the bathymetry decreases with the number of images involved in the composition and tends to be relatively stable when the number of images reaches approximately 16. In addition, the composite images generated by the MIWC method generally exhibit not only superior visual quality but also significant advantages in bathymetric accuracy and robustness when compared to the best single images as well as the composite images generated by the median composition method and the maximum outlier removal method. The recommended value of the power parameter for inverse distance weighting in the MIWC method was experimentally determined to be 4, which typically does not require complex adjustments, making the method easy to apply or integrate. The MIWC method offers a reliable approach to improve the quality of remote sensing images, ensuring the completeness and accuracy of shallow-water bathymetric maps.
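The weighted-composition step can be illustrated compactly. In the hedged sketch below, each acquisition's pixel is weighted by the inverse fourth power of a "distance" derived from its NIR value, used as a proxy for clouds, bubbles, and glint, which brighten the NIR band over water. The choice of NIR reference and the masking details are our assumptions; only the power of 4 follows the paper's recommendation.

```python
import numpy as np

def miwc_composite(stack: np.ndarray, nir: np.ndarray, nir_ref: float = 0.02,
                   power: float = 4.0, eps: float = 1e-6) -> np.ndarray:
    """stack: (T, H, W, B) multispectral images; nir: (T, H, W) NIR band."""
    dist = np.abs(nir - nir_ref) + eps   # larger NIR deviation ~ glint/cloud proxy
    w = 1.0 / dist**power                # paper's recommended power is 4
    w = w[..., None]                     # broadcast weights over spectral bands
    return (stack * w).sum(axis=0) / w.sum(axis=0)

stack = np.random.rand(16, 64, 64, 4)    # 16 acquisitions, 4 bands
composite = miwc_composite(stack, stack[..., 3])
print(composite.shape)                   # (64, 64, 4)
```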
Citations: 0
Common-feature-track-matching approach for multi-epoch UAV photogrammetry co-registration
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-11-14 | DOI: 10.1016/j.isprsjprs.2024.10.025
Xinlong Li, Mingtao Ding, Zhenhong Li, Peng Cui
Automatic co-registration of multi-epoch Unmanned Aerial Vehicle (UAV) image sets remains challenging due to the radiometric differences in complex dynamic scenes. Specifically, illumination changes and vegetation variations usually lead to insufficient and spatially unevenly distributed common tie points (CTPs), resulting in under-fitting of co-registration near areas without CTPs. In this paper, we propose a novel Common-Feature-Track-Matching (CFTM) approach for UAV image set co-registration, to alleviate the shortage of CTPs in complex dynamic scenes. Instead of matching features between multi-epoch images, we first search for correspondences between multi-epoch feature tracks (i.e., groups of features corresponding to the same 3D points), which avoids the removal of matches due to unreliable estimation of the relative pose between inter-epoch image pairs. Then, the CTPs are triangulated from the successfully matched track pairs. Since an even distribution of CTPs is crucial for robust co-registration, a block-based strategy is designed, which also enables parallel computation. Finally, an iterative optimization algorithm is developed to gradually select the best CTPs to refine the poses of multi-epoch images. We assess the performance of our method on two challenging datasets. The results show that CFTM can automatically acquire adequate and evenly distributed CTPs in complex dynamic scenes, achieving a co-registration accuracy approximately four times higher than the state of the art in challenging scenarios. Our code is available at https://github.com/lixinlong1998/CoSfM.
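The core idea of matching tracks instead of individual features can be sketched as follows: summarize each track by the mean of its member descriptors and keep mutual nearest neighbors across epochs. This is a deliberately simplified stand-in for CFTM's matching rule; the function names and the plain L2/mutual-NN criterion are our assumptions.

```python
import numpy as np

def track_descriptors(desc: np.ndarray, track_ids: np.ndarray) -> np.ndarray:
    """Average per-feature descriptors (N, D) into one descriptor per track."""
    tracks = np.unique(track_ids)
    return np.stack([desc[track_ids == t].mean(axis=0) for t in tracks])

def mutual_nn_matches(a: np.ndarray, b: np.ndarray):
    """Keep only track pairs that are each other's nearest neighbor."""
    d = np.linalg.norm(a[:, None] - b[None, :], axis=2)   # (Na, Nb) L2 distances
    ab, ba = d.argmin(axis=1), d.argmin(axis=0)
    return [(i, j) for i, j in enumerate(ab) if ba[j] == i]

desc = np.random.rand(500, 64)               # per-feature descriptors, epoch 1
ids = np.random.randint(0, 80, 500)          # feature -> track assignment
t1 = track_descriptors(desc, ids)
t2 = t1 + 0.01 * np.random.randn(*t1.shape)  # epoch 2, slightly perturbed
print(len(mutual_nn_matches(t1, t2)))        # most tracks match back
```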
Citations: 0
B3-CDG: A pseudo-sample diffusion generator for bi-temporal building binary change detection
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-11-14 | DOI: 10.1016/j.isprsjprs.2024.10.021
Peng Chen, Peixian Li, Bing Wang, Sihai Zhao, Yongliang Zhang, Tao Zhang, Xingcheng Ding
Building change detection (CD) plays a crucial role in urban planning, land resource management, and disaster monitoring. Currently, deep learning has become a key approach in building CD, but challenges persist. Obtaining large-scale, accurately registered bi-temporal images is difficult, and annotation is time-consuming. Therefore, we propose B3-CDG, a bi-temporal building binary CD pseudo-sample generator based on the principle of latent diffusion. This generator treats building change processes as local semantic state transformations. It utilizes textual instructions and mask prompts to generate specific class changes in designated regions of single-temporal images, creating different temporal images with clear semantic transitions. B3-CDG is driven by large-scale pretrained models and utilizes external adapters to guide the model in learning remote sensing image distributions. To generate seamless building boundaries, B3-CDG adopts a simple and effective approach, dilation masks, to compel the model to learn boundary details. In addition, B3-CDG incorporates diffusion guidance and data augmentation to enhance image realism. In the generation experiments, B3-CDG achieved the best performance, with the lowest FID (26.40) and the highest IS (4.60), compared to previous baseline methods (such as Inpaint and IAug). This method effectively addresses challenges such as boundary continuity, shadow generation, and vegetation occlusion while ensuring that the generated building roof structures and colors are realistic and diverse. In the application experiments, B3-CDG improved the IoU of the validation model (SFFNet) by 6.34% and 7.10% on the LEVIR and WHUCD datasets, respectively. When the real data is extremely limited (using only 5% of the original data), the improvement reaches 33.68% and 32.40%. Moreover, B3-CDG can enhance the baseline performance of advanced CD models, such as SNUNet and ChangeFormer. Ablation studies further confirm the effectiveness of the B3-CDG design. This study introduces a novel research paradigm for building CD, potentially advancing the field. Source code and datasets will be available at https://github.com/ABCnutter/B3-CDG.
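The dilation-mask trick is simple enough to show directly: the annotated footprint is morphologically grown so the generator must re-synthesize a band around the building boundary, which is what pushes it to learn seamless edges. A small sketch with SciPy follows; the dilation radius is an assumed hyperparameter.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def dilated_edit_mask(building_mask: np.ndarray, iterations: int = 5) -> np.ndarray:
    """building_mask: (H, W) boolean footprint; returns the inpainting region,
    grown to include a boundary band around the building."""
    return binary_dilation(building_mask, iterations=iterations)

mask = np.zeros((128, 128), dtype=bool)
mask[40:80, 40:80] = True
edit_region = dilated_edit_mask(mask)
print(edit_region.sum() > mask.sum())  # True: boundary band is included
```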
Citations: 0
Mesh refinement method for multi-view stereo with unary operations
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-11-12 | DOI: 10.1016/j.isprsjprs.2024.10.023
Jianchen Liu, Shuang Han, Jin Li
3D reconstruction is an important part of the digital city, and high-accuracy 3D modeling methods have been widely studied as an important pathway to visualizing 3D city scenes. However, problems of image resolution, noise, and occlusion result in low quality and over-smoothed features in the mesh model. Therefore, the model needs to be refined to improve the mesh quality and enhance the visual effect. This paper proposes a mesh refinement algorithm that fine-tunes the vertices of the mesh and constrains their evolution direction to the normal vector, reducing their degrees of freedom to one. The evolution of vertices involves only one motion-distance parameter along the normal vector, simplifying the derivation of the energy function. Meanwhile, Gaussian curvature is used as a regularization term, which is anisotropic and preserves edge features during the reconstruction process. The mesh refinement algorithm with unary operations fully utilizes the original image information and effectively enriches the local detail features of the mesh model. This paper uses five public datasets to conduct comparative experiments, and the experimental results show that, for the same number of iterations, the proposed algorithm restores the detailed features of the model better and achieves a better refinement effect than the OpenMVS library refinement algorithm. At the same time, with fewer iterations, the algorithm in this paper achieves more desirable results.
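The unary parameterization reduces each vertex update to a single scalar along its normal, as the toy step below shows. The gradient input stands in for the derivative of the actual energy (photo-consistency plus the Gaussian-curvature regularizer), which is not reproduced here.

```python
import numpy as np

def refine_vertices(V: np.ndarray, N: np.ndarray, grad_t: np.ndarray,
                    step: float = 0.1) -> np.ndarray:
    """V: (n, 3) vertices, N: (n, 3) unit normals, grad_t: (n,) dE/dt per vertex."""
    t = -step * grad_t          # one degree of freedom per vertex: distance t
    return V + t[:, None] * N   # evolve strictly along the normal direction

V = np.random.rand(100, 3)
N = np.tile([0.0, 0.0, 1.0], (100, 1))     # placeholder normals
V_new = refine_vertices(V, N, np.random.randn(100))
print(np.abs(V_new[:, :2] - V[:, :2]).max())  # 0.0: motion is normal-only
```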
Citations: 0
Fast and accurate SAR geocoding with a plane approximation
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-11-11 | DOI: 10.1016/j.isprsjprs.2024.10.031
Shaokun Guo, Jie Dong, Yian Wang, Mingsheng Liao
Geocoding is the procedure of finding the mapping between the Synthetic Aperture Radar (SAR) image and the imaged scene. The inverse form of the Range-Doppler (RD) model has been adopted to approximate the geocoding results. However, with advances in SAR imaging geodesy, its imprecise nature becomes more perceptible. The forward RD model gives reliable solutions but is time-consuming and unable to detect geometric distortions. This study proposes a highly optimized forward geocoding method to find the precise ground position of each image sample with a Digital Elevation Model (DEM). By following the intersection of the terrain and the so-called solution surface of an azimuth line, which can be locally approximated by a plane, it produces geolocation results almost identical to the analytical solutions of the RD model. At the same time, the non-unique geocoding solutions and the geometric distortions are determined. Deviations from the employed approximations are assessed, showing that they are highly predictable and lead to negligible range/azimuth residuals. The general robustness is verified by experiments on SAR images of different resolutions covering diversified terrain in the native or zero-Doppler geometry. Comparisons with other forward algorithms demonstrate that its accuracy and efficiency are comparable to theirs, while it additionally detects geometric distortions. For a Sentinel-1 IW burst with high topographic relief, the algorithm finishes in 3 s using 16 parallel cores, with an average residual smaller than one millimeter. Its blend of efficiency, accuracy, and geometric distortion detection capabilities makes it well suited for large-scale remote sensing applications.
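Forward geocoding ultimately comes down to intersecting a viewing locus with the terrain. The sketch below shows the generic iterative height-matching scheme for intersecting a locally linearized line of sight with a DEM, which conveys the flavor of the intersection step only; the paper's actual solution-surface construction and its optimizations are not reproduced, and the toy DEM here is flat.

```python
import numpy as np

def intersect_line_with_dem(origin, direction, dem_height, iters=50, tol=1e-3):
    """Iteratively match the height of a line with the terrain under it.

    origin, direction: 3-vectors; dem_height(x, y) -> terrain height.
    Assumes direction[2] != 0, as in side-looking SAR geometry.
    """
    h = dem_height(origin[0], origin[1])
    p = origin
    for _ in range(iters):
        t = (h - origin[2]) / direction[2]  # parameter where line reaches height h
        p = origin + t * direction
        h_new = dem_height(p[0], p[1])
        if abs(h_new - h) < tol:            # line height agrees with terrain: done
            break
        h = h_new
    return p

flat_dem = lambda x, y: 100.0               # toy DEM: constant 100 m terrain
p = intersect_line_with_dem(np.array([0.0, 0.0, 700e3]),
                            np.array([0.6, 0.0, -0.8]), flat_dem)
print(p)                                    # lands on the 100 m surface
```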
Citations: 0
3D point cloud regularization method for uniform mesh generation of mining excavations
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-11-09 | DOI: 10.1016/j.isprsjprs.2024.10.024
Przemysław Dąbek, Jacek Wodecki, Paulina Kujawa, Adam Wróblewski, Arkadiusz Macek, Radosław Zimroz
Mine excavation systems are usually dozens of kilometers long, with varying geometry on a small scale (roughness and shape of the walls) and on a large scale (varying widths of the tunnels, turns, and crossings). In this article, the authors address the problem of analyzing laser scanning data from large mining structures for various purposes, with a focus on ventilation simulations. The quality of the measurement data (uneven point-cloud density, missing samples, holes induced by obstructions in the field of view, measurement noise) creates problems that require multi-stage processing of the obtained data. The authors propose a robust methodology to process a single segmented section of the mining system. The presented approach focuses on obtaining a point cloud ready for computational fluid dynamics (CFD) analysis of airflow, with minimal need for additional manual corrections on the generated mesh model. This requires the point cloud to have evenly distributed points and reduced noise (together with removal of objects inside) while keeping the unique geometric properties and shape of the scanned tunnels. The proposed methodology uses the trajectory of the excavation, either obtained during the measurements or by the skeletonization process explained in the article. Cross-sections obtained on planes perpendicular to the trajectory are processed to equalize the point distribution and to remove measurement noise, holes in the point cloud, and objects inside the excavation. The effects of the proposed algorithm are validated by comparing the processed cloud with the original cloud and by testing within the CFD environment. The algorithm proved highly effective in improving the skewness rate of the obtained mesh and the geometry-mapping accuracy (standard deviation below 5 cm in cloud-to-mesh comparison).
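A reduced version of the cross-section step illustrates the idea: points within a thin slab around a trajectory sample are projected onto the perpendicular plane, binned by polar angle, and each bin is collapsed to a robust (median) radius, which equalizes density and rejects interior objects and noise. The bin count, slab thickness, and median rule are our assumptions, and the tunnel axis is assumed not to be vertical.

```python
import numpy as np

def regularize_section(points: np.ndarray, center: np.ndarray, tangent: np.ndarray,
                       n_bins: int = 180, thickness: float = 0.2) -> np.ndarray:
    """Return one robust wall point per occupied angular bin of a cross-section."""
    tangent = tangent / np.linalg.norm(tangent)
    d = (points - center) @ tangent                    # signed distance along axis
    slab = points[np.abs(d) < thickness / 2] - center  # points of this cross-section
    # Orthonormal basis (u, v) of the cutting plane (tangent must not be vertical).
    u = np.cross(tangent, [0.0, 0.0, 1.0])
    u /= np.linalg.norm(u)
    v = np.cross(tangent, u)
    x, y = slab @ u, slab @ v
    ang, rad = np.arctan2(y, x), np.hypot(x, y)
    bins = np.floor((ang + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    out = []
    for b in range(n_bins):                            # one median radius per angle
        r = rad[bins == b]
        if r.size:
            a = (b + 0.5) / n_bins * 2 * np.pi - np.pi
            out.append(center + np.median(r) * (np.cos(a) * u + np.sin(a) * v))
    return np.array(out)

theta = np.random.rand(5000) * 2 * np.pi               # synthetic tunnel wall ring
pts = np.c_[np.cos(theta) * 2.0, np.random.rand(5000) * 0.1 - 0.05, np.sin(theta) * 2.0]
section = regularize_section(pts, np.zeros(3), np.array([0.0, 1.0, 0.0]))
print(section.shape)                                   # (<=180, 3), evenly spread
```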
Citations: 0
Generalization in deep learning-based aircraft classification for SAR imagery
IF 10.6 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2024-11-08 | DOI: 10.1016/j.isprsjprs.2024.10.030
Andrea Pulella, Francescopaolo Sica, Carlos Villamil Lopez, Harald Anglberger, Ronny Hänsch
Automatic Target Recognition (ATR) from Synthetic Aperture Radar (SAR) data covers a wide range of applications. SAR ATR helps to detect and track vehicles and other objects, e.g. in disaster relief and surveillance operations. Aircraft classification covers a significant part of this research area. It differs from other SAR-based ATR tasks, such as ship and ground vehicle detection and classification, in that aircraft are usually static targets, often remaining at the same location and in a given orientation for longer time frames. Today, there is a significant mismatch between the abundance of deep learning-based aircraft classification models and the availability of corresponding datasets. This mismatch has led to models with improved classification performance on specific datasets, but the challenge of generalizing to conditions not present in the training data (which are expected to occur in operational conditions) has not yet been satisfactorily analyzed. This paper aims to evaluate how the classification performance and generalization capabilities of deep learning models are influenced by the diversity of the training dataset. Our goal is to understand the model's competence and the conditions under which it can achieve proficiency in aircraft classification tasks for high-resolution SAR images while demonstrating generalization capabilities when confronted with novel data that include different geographic locations, environmental conditions, and geometric variations. We address this gap by using manually annotated high-resolution SAR data from TerraSAR-X and TanDEM-X and show how the classification performance changes for different application scenarios requiring different training and evaluation setups. We find that, as expected, the type of aircraft plays a crucial role in the classification problem, since it varies in shape and dimension. However, these aspects are secondary to how the SAR image is acquired, with the acquisition geometry playing the primary role. Therefore, we find that the characteristics of the acquisition are much more relevant for generalization than the complex geometry of the target. We show this for various models selected from among the standard classification algorithms.
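A practical takeaway is that generalization should be measured with acquisition-aware splits rather than random chip-level splits. A small sketch using scikit-learn's GroupShuffleSplit follows; the variable names and the notion of one group id per SAR acquisition are illustrative.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

chips = np.arange(1000)                        # indices of aircraft image chips
acq_id = np.random.randint(0, 25, size=1000)   # chips from one SAR scene share an id

splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, test_idx = next(splitter.split(chips, groups=acq_id))

# No acquisition appears on both sides, so test scores reflect unseen
# geometry/geography rather than memorized scene conditions.
assert set(acq_id[train_idx]).isdisjoint(acq_id[test_idx])
print(len(train_idx), len(test_idx))
```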
Citations: 0