
ISPRS Journal of Photogrammetry and Remote Sensing: Latest Publications

Accurate spaceborne waveform simulation in heterogeneous forests using small-footprint airborne LiDAR point clouds
IF 10.6 | Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-02-01 | DOI: 10.1016/j.isprsjprs.2024.11.020 | Vol. 220, pp. 246-263
Yi Li , Guangjian Yan , Weihua Li , Donghui Xie , Hailan Jiang , Linyuan Li , Jianbo Qi , Ronghai Hu , Xihan Mu , Xiao Chen , Shanshan Wei , Hao Tang
Spaceborne light detection and ranging (LiDAR) waveform sensors require accurate signal simulations to facilitate prelaunch calibration, postlaunch validation, and the development of land surface data products. However, accurately simulating spaceborne LiDAR waveforms over heterogeneous forests remains challenging because data-driven methods do not account for complicated pulse transport within heterogeneous canopies, whereas analytical radiative transfer models overly rely on assumptions about canopy structure and distribution. Thus, a comprehensive simulation method is needed to account for both the complexity of pulse transport within canopies and the structural heterogeneity of forests. In this study, we propose a framework for spaceborne LiDAR waveform simulation by integrating a new radiative transfer model – the canopy voxel radiative transfer (CVRT) model – with reconstructed three-dimensional (3D) voxel forest scenes from small-footprint airborne LiDAR (ALS) point clouds. The CVRT model describes the radiative transfer process within canopy voxels and uses fractional crown cover to account for within-voxel heterogeneity, minimizing the need for assumptions about canopy shape and distribution and significantly reducing the number of input parameters. All the parameters for scene construction and model inputs can be obtained from the ALS point clouds. The performance of the proposed framework was assessed by comparing the results to the simulated LiDAR waveforms from DART, Global Ecosystem Dynamics Investigation (GEDI) data over heterogeneous forest stands, and Land, Vegetation, and Ice Sensor (LVIS) data from the National Ecological Observatory Network (NEON) site. The results suggest that compared with existing models, the new framework with the CVRT model achieved improved agreement with both simulated and measured data, with an average R2 improvement of approximately 2% to 5% and an average RMSE reduction of approximately 0.5% to 3%. The proposed framework was also highly adaptive and robust to variations in model configurations, input data quality, and environmental attributes. In summary, this work extends current research on accurate and robust large-footprint LiDAR waveform simulations over heterogeneous forest canopies and could help refine product development for emerging spaceborne LiDAR missions.
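As a rough, self-contained illustration of large-footprint waveform formation (a toy model, not the CVRT model described above), the sketch below attenuates a transmitted pulse through stacked canopy voxel layers using fractional crown cover and a Beer-Lambert term, then convolves the resulting vertical energy profile with a Gaussian transmit pulse. The layer values, backscatter fraction, and ground reflectance are hypothetical.

```python
# Toy large-footprint waveform simulation: NOT the CVRT model, just an
# illustration of attenuating a pulse through voxel layers and convolving
# the returned-energy profile with a Gaussian transmit pulse.
import numpy as np

def simulate_waveform(voxel_cover, voxel_ext, dz=1.0, pulse_sigma_m=2.0, ground_rho=0.3):
    """voxel_cover: fractional crown cover per layer (top to bottom).
    voxel_ext: extinction coefficient per layer (1/m); dz: layer thickness (m)."""
    n = len(voxel_cover)
    transmitted = 1.0                      # downward pulse energy entering the canopy
    returns = np.zeros(n + 1)              # per-layer backscattered energy (+ ground)
    for i in range(n):
        # fraction of the footprint intercepted by crowns in this layer (Beer-Lambert)
        intercepted = transmitted * voxel_cover[i] * (1.0 - np.exp(-voxel_ext[i] * dz))
        returns[i] = 0.5 * intercepted     # assumed backscatter fraction (hypothetical)
        transmitted -= intercepted
    returns[n] = transmitted * ground_rho  # ground return from the remaining energy
    # convolve the vertical energy profile with a Gaussian transmit pulse
    z = np.arange(-3 * pulse_sigma_m, 3 * pulse_sigma_m + dz, dz)
    pulse = np.exp(-0.5 * (z / pulse_sigma_m) ** 2)
    return np.convolve(returns, pulse / pulse.sum(), mode="full")

cover = np.full(20, 0.6)                   # 20 m canopy, 60 % crown cover per layer
waveform = simulate_waveform(cover, np.full(20, 0.5))
```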
{"title":"Accurate spaceborne waveform simulation in heterogeneous forests using small-footprint airborne LiDAR point clouds","authors":"Yi Li ,&nbsp;Guangjian Yan ,&nbsp;Weihua Li ,&nbsp;Donghui Xie ,&nbsp;Hailan Jiang ,&nbsp;Linyuan Li ,&nbsp;Jianbo Qi ,&nbsp;Ronghai Hu ,&nbsp;Xihan Mu ,&nbsp;Xiao Chen ,&nbsp;Shanshan Wei ,&nbsp;Hao Tang","doi":"10.1016/j.isprsjprs.2024.11.020","DOIUrl":"10.1016/j.isprsjprs.2024.11.020","url":null,"abstract":"<div><div>Spaceborne light detection and ranging (LiDAR) waveform sensors require accurate signal simulations to facilitate prelaunch calibration, postlaunch validation, and the development of land surface data products. However, accurately simulating spaceborne LiDAR waveforms over heterogeneous forests remains challenging because data-driven methods do not account for complicated pulse transport within heterogeneous canopies, whereas analytical radiative transfer models overly rely on assumptions about canopy structure and distribution. Thus, a comprehensive simulation method is needed to account for both the complexity of pulse transport within canopies and the structural heterogeneity of forests. In this study, we propose a framework for spaceborne LiDAR waveform simulation by integrating a new radiative transfer model – the canopy voxel radiative transfer (CVRT) model – with reconstructed three-dimensional (3D) voxel forest scenes from small-footprint airborne LiDAR (ALS) point clouds. The CVRT model describes the radiative transfer process within canopy voxels and uses fractional crown cover to account for within-voxel heterogeneity, minimizing the need for assumptions about canopy shape and distribution and significantly reducing the number of input parameters. All the parameters for scene construction and model inputs can be obtained from the ALS point clouds. The performance of the proposed framework was assessed by comparing the results to the simulated LiDAR waveforms from DART, Global Ecosystem Dynamics Investigation (GEDI) data over heterogeneous forest stands, and Land, Vegetation, and Ice Sensor (LVIS) data from the National Ecological Observatory Network (NEON) site. The results suggest that compared with existing models, the new framework with the CVRT model achieved improved agreement with both simulated and measured data, with an average R<sup>2</sup> improvement of approximately 2% to 5% and an average RMSE reduction of approximately 0.5% to 3%. The proposed framework was also highly adaptive and robust to variations in model configurations, input data quality, and environmental attributes. In summary, this work extends current research on accurate and robust large-footprint LiDAR waveform simulations over heterogeneous forest canopies and could help refine product development for emerging spaceborne LiDAR missions.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 246-263"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142874574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Underwater image captioning: Challenges, models, and datasets
IF 10.6 | Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-02-01 | DOI: 10.1016/j.isprsjprs.2024.12.002 | Vol. 220, pp. 440-453
Huanyu Li , Hao Wang , Ying Zhang , Li Li , Peng Ren
We delve into the nascent field of underwater image captioning from three perspectives: challenges, models, and datasets. One challenge arises from the disparities between natural images and underwater images, which hinder the use of the former to train models for the latter. Another challenge exists in the limited feature extraction capabilities of current image captioning models, impeding the generation of accurate underwater image captions. The final challenge, albeit not the least significant, revolves around the insufficiency of data available for underwater image captioning. This insufficiency not only complicates the training of models but also poses challenges for evaluating their performance effectively. To address these challenges, we make three novel contributions. First, we employ a physics-based degradation technique to transform natural images into degraded images that closely resemble realistic underwater images. Based on the degraded images, we develop a meta-learning strategy specifically tailored for underwater tasks. Second, we develop an underwater image captioning model based on scene-object feature fusion. It fuses underwater scene features extracted by ResNeXt and object features localized by YOLOv8, yielding comprehensive features for underwater image captioning. Last but not least, we construct an underwater image captioning dataset covering various underwater scenes, with each underwater image annotated with five accurate captions for the purpose of comprehensive training and validation. Experimental results on the new dataset validate the effectiveness of our novel models. The code and datasets are released at https://gitee.com/LHY-CODE/UICM-SOFF.
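The paper's first contribution is a physics-based degradation of natural images into underwater-like images. As a hedged sketch of that idea (a simplified image-formation model, not the authors' exact technique), the code below applies I = J·t + B·(1 − t) with assumed wavelength-dependent attenuation and veiling-light colour.

```python
# Simplified underwater degradation: per-channel transmission from assumed
# attenuation coefficients, blended with an assumed veiling-light colour.
import numpy as np

def degrade_underwater(image, depth_m=5.0,
                       beta=(0.35, 0.10, 0.04),        # attenuation per metre for R, G, B (assumed)
                       background=(0.05, 0.35, 0.45)): # veiling-light colour (assumed)
    """image: float array in [0, 1] with shape (H, W, 3)."""
    t = np.exp(-np.asarray(beta) * depth_m)            # per-channel transmission
    return image * t + np.asarray(background) * (1.0 - t)

rgb = np.random.rand(64, 64, 3)            # stand-in for a natural training image
underwater_like = degrade_underwater(rgb, depth_m=8.0)
```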
{"title":"Underwater image captioning: Challenges, models, and datasets","authors":"Huanyu Li ,&nbsp;Hao Wang ,&nbsp;Ying Zhang ,&nbsp;Li Li ,&nbsp;Peng Ren","doi":"10.1016/j.isprsjprs.2024.12.002","DOIUrl":"10.1016/j.isprsjprs.2024.12.002","url":null,"abstract":"<div><div>We delve into the nascent field of underwater image captioning from three perspectives: challenges, models, and datasets. One challenge arises from the disparities between natural images and underwater images, which hinder the use of the former to train models for the latter. Another challenge exists in the limited feature extraction capabilities of current image captioning models, impeding the generation of accurate underwater image captions. The final challenge, albeit not the least significant, revolves around the insufficiency of data available for underwater image captioning. This insufficiency not only complicates the training of models but also poses challenges for evaluating their performance effectively. To address these challenges, we make three novel contributions. First, we employ a physics-based degradation technique to transform natural images into degraded images that closely resemble realistic underwater images. Based on the degraded images, we develop a meta-learning strategy specifically tailored for underwater tasks. Second, we develop an underwater image captioning model based on scene-object feature fusion. It fuses underwater scene features extracted by ResNeXt and object features localized by YOLOv8, yielding comprehensive features for underwater image captioning. Last but not least, we construct an underwater image captioning dataset covering various underwater scenes, with each underwater image annotated with five accurate captions for the purpose of comprehensive training and validation. Experimental results on the new dataset validate the effectiveness of our novel models. The code and datasets are released at <span><span>https://gitee.com/LHY-CODE/UICM-SOFF</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 440-453"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142925268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A novel airborne TomoSAR 3-D focusing method for accurate ice thickness and glacier volume estimation
IF 10.6 | Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-02-01 | DOI: 10.1016/j.isprsjprs.2025.01.011 | Vol. 220, pp. 593-607
Ke Wang , Yue Wu , Xiaolan Qiu , Jinbiao Zhu , Donghai Zheng , Songtao Shangguan , Jie Pan , Yuquan Liu , Liming Jiang , Xin Li
High-altitude mountain glaciers are highly responsive to environmental changes. However, their remote locations limit the applicability of traditional mapping methods, such as probing and Ground Penetrating Radar (GPR), in tracking changes in ice thickness and glacier volume. Over the past two decades, airborne Tomographic Synthetic Aperture Radar (TomoSAR) has shown promise for mapping the internal structures of mountain glaciers. Yet, its 3D mapping capabilities are limited by the radar signal’s relatively shallow penetration depth, with bedrock echoes rarely detected beyond 60 meters. Additionally, most TomoSAR studies ignored the air-ice refraction during the image-focusing step, reducing the 3D focusing accuracy for deeper subsurface targets. In this study, we developed a novel algorithm that integrates refraction path calculations into SAR image focusing. We also introduced a new method to construct the 3D TomoSAR cube by stacking InSAR phase coherence images, enabling the retrieval of deep bedrock signals even at low signal-to-noise ratios.
We tested our algorithms on 14 P-band SAR images acquired on April 8, 2023, over Bayi Glacier in the Qilian Mountains, located on the Qinghai-Tibet Plateau. For the first time, we successfully mapped the ice thickness across an entire mountain glacier using the airborne TomoSAR technique, detecting bedrock signals at depths reaching up to 120 m. Our ice thickness estimates showed strong agreement with in situ measurements from three GPR transects totaling 3.8 km in length, with root-mean-square errors (RMSE) ranging from 3.18 to 4.66 m. For comparison, we applied the state-of-the-art 3D focusing algorithm used in the AlpTomoSAR campaign for ice thickness estimation, which resulted in RMSE values between 5.67 and 5.81 m. Our proposed method reduced the RMSE by 18% to 44% relative to the AlpTomoSAR algorithm. Based on these measurements, we calculated a total ice volume of 0.121 km³, reflecting a decline of approximately 20.92% since the last reported volume in 2009, which was estimated from sparse GPR data. These results demonstrate that the proposed algorithm can effectively map ice thickness, providing a cost-efficient solution for large-scale glacier surveys in high-mountain regions.
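To make the air-ice refraction that the focusing accounts for concrete, the following sketch (a simplified two-media geometry, not the paper's focusing algorithm) uses Fermat's principle to locate the surface crossing point and the two-way delay to a subsurface target. The antenna height, target depth, and ice refractive index (n ≈ 1.78, a typical value for ice) are illustrative assumptions.

```python
# Refracted air-ice propagation path via Fermat's principle (illustrative geometry only).
import numpy as np
from scipy.optimize import minimize_scalar

C = 299_792_458.0   # speed of light in vacuum, m/s

def two_way_delay(h, d, x, n_ice=1.78):
    """Two-way travel time for an antenna at height h (m) above the surface and a
    target at depth d (m) with horizontal offset x (m)."""
    def one_way_time(xs):        # xs: horizontal position of the surface crossing point
        air = np.hypot(xs, h) / C
        ice = np.hypot(x - xs, d) / (C / n_ice)
        return air + ice
    res = minimize_scalar(one_way_time, bounds=(0.0, x), method="bounded")
    return 2.0 * res.fun, res.x  # two-way delay (s) and refraction point (m)

delay, crossing = two_way_delay(h=3000.0, d=100.0, x=50.0)
straight = 2.0 * np.hypot(50.0, 3100.0) / C   # straight-line, vacuum-speed delay, for comparison
```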
{"title":"A novel airborne TomoSAR 3-D focusing method for accurate ice thickness and glacier volume estimation","authors":"Ke Wang ,&nbsp;Yue Wu ,&nbsp;Xiaolan Qiu ,&nbsp;Jinbiao Zhu ,&nbsp;Donghai Zheng ,&nbsp;Songtao Shangguan ,&nbsp;Jie Pan ,&nbsp;Yuquan Liu ,&nbsp;Liming Jiang ,&nbsp;Xin Li","doi":"10.1016/j.isprsjprs.2025.01.011","DOIUrl":"10.1016/j.isprsjprs.2025.01.011","url":null,"abstract":"<div><div>High-altitude mountain glaciers are highly responsive to environmental changes. However, their remote locations limit the applicability of traditional mapping methods, such as probing and Ground Penetrating Radar (GPR), in tracking changes in ice thickness and glacier volume. Over the past two decades, airborne Tomographic Synthetic Aperture Radar (TomoSAR) has shown promise for mapping the internal structures of mountain glaciers. Yet, its 3D mapping capabilities are limited by the radar signal’s relatively shallow penetration depth, with bedrock echoes rarely detected beyond 60 meters. Additionally, most TomoSAR studies ignored the air-ice refraction during the image-focusing step, reducing the 3D focusing accuracy for deeper subsurface targets. In this study, we developed a novel algorithm that integrates refraction path calculations into SAR image focusing. We also introduced a new method to construct the 3D TomoSAR cube by stacking InSAR phase coherence images, enabling the retrieval of deep bedrock signals even at low signal-to-noise ratios.</div><div>We tested our algorithms on 14 P-band SAR images acquired on April 8, 2023, over Bayi Glacier in the Qilian Mountains, located on the Qinghai-Tibet Plateau. For the first time, we successfully mapped the ice thickness across an entire mountain glacier using the airborne TomoSAR technique, detecting bedrock signals at depths reaching up to 120 m. Our ice thickness estimates showed strong agreement with in situ measurements from three GPR transects totaling 3.8 km in length, with root-mean-square errors (RMSE) ranging from 3.18 to 4.66 m. For comparison, we applied the state-of-the-art 3D focusing algorithm used in the AlpTomoSAR campaign for ice thickness estimation, which resulted in RMSE values between 5.67 and 5.81 m. Our proposed method reduced the RMSE by 18% to 44% relative to the AlpTomoSAR algorithm. Based on these measurements, we calculated a total ice volume of 0.121 km<span><math><msup><mrow></mrow><mrow><mn>3</mn></mrow></msup></math></span>, reflecting a decline of approximately 20.92% since the last reported volume in 2009, which was estimated from sparse GPR data. These results demonstrate that the proposed algorithm can effectively map ice thickness, providing a cost-efficient solution for large-scale glacier surveys in high-mountain regions.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 593-607"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142989646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An interactive fusion attention-guided network for ground surface hot spring fluids segmentation in dual-spectrum UAV images
IF 10.6 | Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-02-01 | DOI: 10.1016/j.isprsjprs.2025.01.022 | Vol. 220, pp. 661-691
Shi Yi , Mengting Chen , Xuesong Yuan , Si Guo , Jiashuai Wang
Investigating the distribution of ground surface hot spring fluids is crucial for the exploitation and utilization of geothermal resources. The detailed information provided by dual-spectrum images captured by unmanned aerial vehicles (UAVs) flown at low altitudes is beneficial for accurately segmenting ground surface hot spring fluids. However, existing image segmentation methods face significant challenges in hot spring fluid segmentation due to the frequent and irregular variations in fluid boundaries, while the presence of substances within such fluids leads to segmentation uncertainties. In addition, there is currently no benchmark dataset dedicated to ground surface hot spring fluid segmentation in dual-spectrum UAV images. To this end, in this study, a benchmark dataset called the dual-spectrum hot spring fluid segmentation (DHFS) dataset was constructed for segmenting ground surface hot spring fluids in dual-spectrum UAV images. Additionally, a novel interactive fusion attention-guided RGB-Thermal (RGB-T) semantic segmentation network named IFAGNet was proposed in this study for accurately segmenting ground surface hot spring fluids in dual-spectrum UAV images. The proposed IFAGNet consists of two sub-networks that leverage two feature fusion architectures, and a two-stage feature fusion module is designed to achieve optimal intermediate feature fusion. Furthermore, IFAGNet utilizes an interactive fusion attention-guided architecture to guide the two sub-networks in further processing the extracted features through complementary information exchange, resulting in a significant boost in hot spring fluid segmentation accuracy. Additionally, two down-up full scale feature pyramid network (FPN) decoders are developed for each sub-network to fully utilize multi-stage fused features and improve the preservation of detailed information during hot spring fluid segmentation. Moreover, a hybrid consistency learning strategy is implemented to train the IFAGNet, which combines fully supervised learning with consistency learning between each sub-network and their fusion results to further optimize the segmentation accuracy of hot spring fluid in RGB-T UAV images. The optimal model of the IFAGNet was tested on the proposed DHFS dataset, and the experimental results demonstrated that IFAGNet outperforms existing image segmentation frameworks in segmentation accuracy for hot spring fluids in dual-spectrum UAV images, achieving a Pixel Accuracy (PA) of 96.1%, a Precision of 93.2%, a Recall of 85.9%, an Intersection over Union (IoU) of 78.3%, and an F1-score (F1) of 89.4%, and it overcomes segmentation uncertainties to a great extent while maintaining competitive computational efficiency. The ablation studies confirmed the effectiveness of each main innovation in IFAGNet for improving the accuracy of hot spring fluid segmentation. Therefore, the proposed DHFS dataset and IFAGNet lay the foundation for segmentation of ground surface hot spring fluids in dual-spectrum UAV images.
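As a hedged illustration of the hybrid consistency learning mentioned above (supervised losses on each sub-network plus consistency with the fused prediction), the PyTorch sketch below shows one plausible formulation. The loss weighting, the use of the fused softmax as the consistency anchor, and the tensor shapes are assumptions, not IFAGNet's actual configuration.

```python
# One plausible hybrid consistency objective: supervised CE on each sub-network and
# the fused output, plus an MSE consistency term towards the (detached) fused softmax.
import torch
import torch.nn.functional as F

def hybrid_consistency_loss(logits_a, logits_b, logits_fused, target, w_cons=0.5):
    """logits_*: (B, C, H, W) predictions; target: (B, H, W) integer class labels."""
    supervised = (F.cross_entropy(logits_a, target)
                  + F.cross_entropy(logits_b, target)
                  + F.cross_entropy(logits_fused, target))
    fused_prob = F.softmax(logits_fused.detach(), dim=1)   # fused output as a soft anchor
    consistency = (F.mse_loss(F.softmax(logits_a, dim=1), fused_prob)
                   + F.mse_loss(F.softmax(logits_b, dim=1), fused_prob))
    return supervised + w_cons * consistency

# usage with random tensors (2 classes: background / hot spring fluid)
a, b, f = (torch.randn(2, 2, 64, 64) for _ in range(3))
y = torch.randint(0, 2, (2, 64, 64))
loss = hybrid_consistency_loss(a, b, f, y)
```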
{"title":"An interactive fusion attention-guided network for ground surface hot spring fluids segmentation in dual-spectrum UAV images","authors":"Shi Yi ,&nbsp;Mengting Chen ,&nbsp;Xuesong Yuan ,&nbsp;Si Guo ,&nbsp;Jiashuai Wang","doi":"10.1016/j.isprsjprs.2025.01.022","DOIUrl":"10.1016/j.isprsjprs.2025.01.022","url":null,"abstract":"&lt;div&gt;&lt;div&gt;Investigating the distribution of ground surface hot spring fluids is crucial for the exploitation and utilization of geothermal resources. The detailed information provided by dual-spectrum images captured by unmanned aerial vehicles (UAVs) flew at low altitudes is beneficial to accurately segment ground surface hot spring fluids. However, existing image segmentation methods face significant challenges of hot spring fluids segmentation due to the frequent and irregular variations in fluid boundaries, meanwhile the presence of substances within such fluids lead to segmentation uncertainties. In addition, there is currently no benchmark dataset dedicated to ground surface hot spring fluid segmentation in dual-spectrum UAV images. To this end, in this study, a benchmark dataset called the dual-spectrum hot spring fluid segmentation (DHFS) dataset was constructed for segmenting ground surface hot spring fluids in dual-spectrum UAV images. Additionally, a novel interactive fusion attention-guided RGB-Thermal (RGB-T) semantic segmentation network named IFAGNet was proposed in this study for accurately segmenting ground surface hot spring fluids in dual-spectrum UAV images. The proposed IFAGNet consists of two sub-networks that leverage two feature fusion architectures and the two-stage feature fusion module is designed to achieve optimal intermediate feature fusion. Furthermore, IFAGNet utilizes an interactive fusion attention-guided architecture to guide the two sub-networks further process the extracted features through complementary information exchange, resulting in a significant boost in hot spring fluid segmentation accuracy. Additionally, two down-up full scale feature pyramid network (FPN) decoders are developed for each sub-network to fully utilize multi-stage fused features and improve the preservation of detailed information during hot spring fluid segmentation. Moreover, a hybrid consistency learning strategy is implemented to train the IFAGNet, which combines fully supervised learning with consistency learning between each sub-network and their fusion results to further optimize the segmentation accuracy of hot spring fluid in RGB-T UAV images. The optimal model of the IFAGNet was tested on the proposed DHFS dataset, and the experimental results demonstrated that the IFAGNet outperforms existing image segmentation frameworks in terms of segmentation accuracy for hot spring fluids segmentation in dual-spectrum UAV images which achieved Pixel Accuracy (PA) of 96.1%, Precision of 93.2%, Recall of 85.9%, Intersection over Union (IoU) of 78.3%, and F1-score (F1) of 89.4%, respectively. And overcomes segmentation uncertainties to a great extent, while maintaining competitive computational efficiency. The ablation studies have confirmed the effectiveness of each main innovation in IFAGNet for improving the accuracy of hot spring fluid segmentation. 
Therefore, the proposed DHFS dataset and IFAGNet lay the foundation for segmentation of ","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 661-691"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143035286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Plug-and-play DISep: Separating dense instances for scene-to-pixel weakly-supervised change detection in high-resolution remote sensing images
IF 10.6 | Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-02-01 | DOI: 10.1016/j.isprsjprs.2025.01.007 | Vol. 220, pp. 770-782
Zhenghui Zhao , Chen Wu , Lixiang Ru , Di Wang , Hongruixuan Chen , Cuiqun Chen
Change Detection (CD) focuses on identifying specific pixel-level landscape changes in multi-temporal remote sensing images. The process of obtaining pixel-level annotations for CD is generally both time-consuming and labor-intensive. Faced with this annotation challenge, there has been a growing interest in research on Weakly-Supervised Change Detection (WSCD). WSCD aims to detect pixel-level changes using only scene-level (i.e., image-level) change labels, thereby offering a more cost-effective approach. Despite considerable efforts to precisely locate changed regions, existing WSCD methods often encounter the problem of “instance lumping” under scene-level supervision, particularly in scenarios with a dense distribution of changed instances (i.e., changed objects). In these scenarios, unchanged pixels between changed instances are also mistakenly identified as changed, causing multiple changes to be mistakenly viewed as one. In practical applications, this issue prevents the accurate quantification of the number of changes. To address this issue, we propose a Dense Instance Separation (DISep) method as a plug-and-play solution, refining pixel features from a unified instance perspective under scene-level supervision. Specifically, our DISep comprises a three-step iterative training process: (1) Instance Localization: We locate instance candidate regions for changed pixels using high-pass class activation maps. (2) Instance Retrieval: We identify and group these changed pixels into different instance IDs through connectivity searching. Then, based on the assigned instance IDs, we extract corresponding pixel-level features on a per-instance basis. (3) Instance Separation: We introduce a separation loss to enforce intra-instance pixel consistency in the embedding space, thereby ensuring separable instance feature representations. The proposed DISep adds only minimal training cost and no inference cost. It can be seamlessly integrated to enhance existing WSCD methods. We achieve state-of-the-art performance by enhancing three Transformer-based and four ConvNet-based methods on the LEVIR-CD, WHU-CD, DSIFN-CD, SYSU-CD, and CDD datasets. Additionally, our DISep can be used to improve fully-supervised change detection methods. Code is available at https://github.com/zhenghuizhao/Plug-and-Play-DISep-for-Change-Detection.
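The sketch below walks through the three DISep steps on a single image in simplified form: thresholding a class-activation map to localize candidate changed pixels, connected-component labelling to assign instance IDs, and an intra-instance consistency loss that pulls each instance's pixel embeddings towards their mean. The threshold value and feature dimensions are illustrative, and this is not the authors' exact implementation.

```python
# Simplified DISep-style pipeline: (1) localize, (2) retrieve instances, (3) separation loss.
import torch
from scipy.ndimage import label

def disep_style_loss(cam, embeddings, high_pass=0.6):
    """cam: (H, W) class-activation map in [0, 1]; embeddings: (D, H, W) pixel features."""
    # (1) instance localization: keep confidently changed pixels
    candidate = (cam > high_pass).cpu().numpy()
    # (2) instance retrieval: group changed pixels into instance IDs by connectivity
    instance_ids, n_instances = label(candidate)
    ids = torch.from_numpy(instance_ids).to(embeddings.device).reshape(-1)
    d, h, w = embeddings.shape
    flat = embeddings.reshape(d, h * w).t()          # (H*W, D) pixel embeddings
    # (3) instance separation: intra-instance pixel consistency in embedding space
    loss = embeddings.new_zeros(())
    for k in range(1, n_instances + 1):
        pix = flat[ids == k]
        if len(pix) > 1:
            loss = loss + ((pix - pix.mean(dim=0)) ** 2).mean()
    return loss / max(n_instances, 1)

cam = torch.rand(64, 64)                             # stand-in CAM for the "changed" class
emb = torch.randn(32, 64, 64, requires_grad=True)    # stand-in pixel embeddings
loss = disep_style_loss(cam, emb)
```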
{"title":"Plug-and-play DISep: Separating dense instances for scene-to-pixel weakly-supervised change detection in high-resolution remote sensing images","authors":"Zhenghui Zhao ,&nbsp;Chen Wu ,&nbsp;Lixiang Ru ,&nbsp;Di Wang ,&nbsp;Hongruixuan Chen ,&nbsp;Cuiqun Chen","doi":"10.1016/j.isprsjprs.2025.01.007","DOIUrl":"10.1016/j.isprsjprs.2025.01.007","url":null,"abstract":"<div><div>Change Detection (CD) focuses on identifying specific pixel-level landscape changes in multi-temporal remote sensing images. The process of obtaining pixel-level annotations for CD is generally both time-consuming and labor-intensive. Faced with this annotation challenge, there has been a growing interest in research on Weakly-Supervised Change Detection (WSCD). WSCD aims to detect pixel-level changes using only scene-level (i.e., image-level) change labels, thereby offering a more cost-effective approach. Despite considerable efforts to precisely locate changed regions, existing WSCD methods often encounter the problem of “instance lumping” under scene-level supervision, particularly in scenarios with a dense distribution of changed instances (i.e., changed objects). In these scenarios, unchanged pixels between changed instances are also mistakenly identified as changed, causing multiple changes to be mistakenly viewed as one. In practical applications, this issue prevents the accurate quantification of the number of changes. To address this issue, we propose a Dense Instance Separation (DISep) method as a plug-and-play solution, refining pixel features from a unified instance perspective under scene-level supervision. Specifically, our DISep comprises a three-step iterative training process: (1) Instance Localization: We locate instance candidate regions for changed pixels using high-pass class activation maps. (2) Instance Retrieval: We identify and group these changed pixels into different instance IDs through connectivity searching. Then, based on the assigned instance IDs, we extract corresponding pixel-level features on a per-instance basis. (3) Instance Separation: We introduce a separation loss to enforce intra-instance pixel consistency in the embedding space, thereby ensuring separable instance feature representations. The proposed DISep adds only minimal training cost and no inference cost. It can be seamlessly integrated to enhance existing WSCD methods. We achieve state-of-the-art performance by enhancing three Transformer-based and four ConvNet-based methods on the LEVIR-CD, WHU-CD, DSIFN-CD, SYSU-CD, and CDD datasets. Additionally, our DISep can be used to improve fully-supervised change detection methods. Code is available at <span><span>https://github.com/zhenghuizhao/Plug-and-Play-DISep-for-Change-Detection</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 770-782"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143072523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FO-Net: An advanced deep learning network for individual tree identification using UAV high-resolution images
IF 10.6 | Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-02-01 | DOI: 10.1016/j.isprsjprs.2024.12.020 | Vol. 220, pp. 323-338
Jian Zeng, Xin Shen, Kai Zhou, Lin Cao
The identification of individual trees can reveal the competitive and symbiotic relationships among trees within forest stands, which is fundamental to understanding biodiversity and forest ecosystems. Highly precise identification of individual trees can significantly improve the efficiency of forest resource inventory, and is valuable for biomass measurement and forest carbon storage assessment. In previous studies that used deep learning approaches to identify individual trees, feature extraction was usually difficult to adapt to variation in tree crown architecture, and the loss of feature information in the multi-scale fusion process was also a marked challenge for extracting trees from remote sensing images. Based on the one-stage deep learning network structure, this study improves and optimizes the three stages of feature extraction, feature fusion and feature identification in deep learning methods, and constructs a novel feature-oriented individual tree identification network (FO-Net) suitable for UAV high-resolution images. Firstly, an adaptive feature extraction algorithm based on variable position drift convolution was proposed, which improved feature extraction for individual trees with various crown sizes and shapes in UAV images. Secondly, to enhance the network’s ability to fuse multiscale forest features, a feature fusion algorithm based on the “gather-and-distribute” mechanism is proposed in the feature pyramid network, which realizes the lossless cross-layer transmission of feature map information. Finally, in the stage of individual tree identification, a unified self-attention identification head is introduced to enhance FO-Net’s ability to identify trees with small crown diameters. FO-Net achieved the best performance in quantitative analysis experiments on self-constructed datasets, with mAP50, F1-score, Precision, and Recall of 90.7%, 0.85, 85.8%, and 82.8%, respectively, realizing relatively high accuracy for individual tree identification compared to traditional deep learning methods. The proposed feature extraction and fusion algorithms improved the accuracy of individual tree identification by 1.1% and 2.7%, respectively. The qualitative experiments based on Grad-CAM heat maps also demonstrate that FO-Net can focus more on the contours of an individual tree in high-resolution images, and reduce the influence of background factors during feature extraction and individual tree identification. The FO-Net deep learning network improves the accuracy of individual tree identification in UAV high-resolution images without significantly increasing the parameters of the network, which provides a reliable method to support various tasks in fine-scale precision forestry.
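To illustrate the “gather-and-distribute” fusion idea referenced in the abstract, the PyTorch sketch below gathers pyramid features at one reference resolution, fuses them with a 1×1 convolution, and distributes the fused map back to every level as a residual. The channel counts and the concrete fusion operator are assumptions rather than FO-Net's exact design.

```python
# A gather-and-distribute style multi-scale fusion sketch (illustrative, not FO-Net's module).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatherDistribute(nn.Module):
    def __init__(self, channels=64, levels=3):
        super().__init__()
        self.fuse = nn.Conv2d(channels * levels, channels, kernel_size=1)

    def forward(self, feats):
        """feats: list of (B, C, Hi, Wi) maps from coarse to fine pyramid levels."""
        ref = feats[0].shape[-2:]                      # gather at the coarsest resolution
        gathered = torch.cat(
            [F.interpolate(f, size=ref, mode="bilinear", align_corners=False) for f in feats],
            dim=1)
        fused = self.fuse(gathered)
        # distribute: inject the fused map back into every level as a residual
        return [f + F.interpolate(fused, size=f.shape[-2:], mode="bilinear", align_corners=False)
                for f in feats]

feats = [torch.randn(1, 64, s, s) for s in (16, 32, 64)]
fused_feats = GatherDistribute()(feats)
```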
{"title":"FO-Net: An advanced deep learning network for individual tree identification using UAV high-resolution images","authors":"Jian Zeng,&nbsp;Xin Shen,&nbsp;Kai Zhou,&nbsp;Lin Cao","doi":"10.1016/j.isprsjprs.2024.12.020","DOIUrl":"10.1016/j.isprsjprs.2024.12.020","url":null,"abstract":"<div><div>The identification of individual trees can reveal the competitive and symbiotic relationships among trees within forest stands, which is fundamental understand biodiversity and forest ecosystems. Highly precise identification of individual trees can significantly improve the efficiency of forest resource inventory, and is valuable for biomass measurement and forest carbon storage assessment. In previous studies through deep learning approaches for identifying individual tree, feature extraction is usually difficult to adapt to the variation of tree crown architecture, and the loss of feature information in the multi-scale fusion process is also a marked challenge for extracting trees by remote sensing images. Based on the one-stage deep learning network structure, this study improves and optimizes the three stages of feature extraction, feature fusion and feature identification in deep learning methods, and constructs a novel feature-oriented individual tree identification network (FO-Net) suitable for UAV high-resolution images. Firstly, an adaptive feature extraction algorithm based on variable position drift convolution was proposed, which improved the feature extraction ability for the individual tree with various crown size and shape in UAV images. Secondly, to enhance the network’s ability to fuse multiscale forest features, a feature fusion algorithm based on the “gather-and-distribute” mechanism is proposed in the feature pyramid network, which realizes the lossless cross-layer transmission of feature map information. Finally, in the stage of individual tree identification, a unified self-attention identification head is introduced to enhanced FO-Net’s perception ability to identify the trees with small crown diameters. FO-Net achieved the best performance in quantitative analysis experiments on self-constructed datasets, with mAP50, F1-score, Precision, and Recall of 90.7%, 0.85, 85.8%, and 82.8%, respectively, realizing a relatively high accuracy for individual tree identification compared to the traditional deep learning methods. The proposed feature extraction and fusion algorithms have improved the accuracy of individual tree identification by 1.1% and 2.7% respectively. The qualitative experiments based on Grad-CAM heat maps also demonstrate that FO-Net can focus more on the contours of an individual tree in high-resolution images, and reduce the influence of background factors during feature extraction and individual tree identification. 
FO-Net deep learning network improves the accuracy of individual trees identification in UAV high-resolution images without significantly increasing the parameters of the network, which provides a reliable method to support various tasks in fine-scale precision forestry.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 323-338"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142889390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A deep data fusion-based reconstruction of water index time series for intermittent rivers and ephemeral streams monitoring
IF 10.6 | Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-02-01 | DOI: 10.1016/j.isprsjprs.2024.12.015 | Vol. 220, pp. 339-353
Junyuan Fei , Xuan Zhang , Chong Li , Fanghua Hao , Yahui Guo , Yongshuo Fu
Intermittent Rivers and Ephemeral Streams (IRES) are the major sources of flowing water on Earth. Yet, their dynamics are challenging for optical and radar satellites to monitor due to the heavy cloud cover and narrow water surfaces. The significant backscattering mechanism change and image mismatch further hinder the joint use of optical-SAR images in IRES monitoring. Here, a Deep data fusion-based Reconstruction of the widely accepted Modified Normalized Difference Water Index (MNDWI) time series is conducted for IRES Monitoring (DRIM). The study utilizes 3 categories of explanatory variables, i.e., cross-orbit Sentinel-1 SAR for continuous IRES observation, anchor data for implicit co-registration, and auxiliary data that reflects the dynamics of IRES. A tight-coupled CNN-RNN architecture is designed to achieve pixel-level SAR-to-optical reconstruction under significant backscattering mechanism changes. The 10 m MNDWI time series with a 12-day interval is effectively regressed, R2 > 0.80, on the experimental catchment. The comparison with the RF, RNN, and CNN methods affirms the advantage of the tight-coupled CNN-RNN system in the SAR-to-optical regression, with the R2 increasing by at least 0.68. The ablation test highlights the contributions of Sentinel-1 to the precise MNDWI time series reconstruction, and of the anchor and auxiliary data to the effective multi-source data fusion, respectively. The reconstructions highly match the observations of IRES with river widths ranging from 2 m to 300 m. Furthermore, the DRIM method shows excellent applicability (average R2 of 0.77) to IRES under polar, temperate, tropical, and arid climates. In conclusion, the proposed method is powerful in reconstructing the MNDWI time series of sub-pixel to multi-pixel scale IRES under the problem of backscattering mechanism change and image mismatch. The reconstructed MNDWI time series are essential for exploring the hydrological processes of IRES dynamics and optimizing water resource management at the basin scale.
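For reference, the water index being reconstructed is itself straightforward to compute when cloud-free optical observations are available. The sketch below derives MNDWI from the Sentinel-2 green (B3) and shortwave-infrared (B11) bands; the arrays are placeholders for co-registered, resampled reflectance rasters, and the zero threshold is only a common default.

```python
# MNDWI = (green - SWIR) / (green + SWIR), here from stand-in Sentinel-2 B3/B11 rasters.
import numpy as np

def mndwi(green, swir, eps=1e-6):
    """Modified Normalized Difference Water Index."""
    green = green.astype(np.float32)
    swir = swir.astype(np.float32)
    return (green - swir) / (green + swir + eps)

b3 = np.random.rand(256, 256).astype(np.float32)    # stand-in for Sentinel-2 B3 (green, 10 m)
b11 = np.random.rand(256, 256).astype(np.float32)   # stand-in for B11 (SWIR, resampled to 10 m)
water_index = mndwi(b3, b11)
water_mask = water_index > 0.0                       # threshold-dependent water mask
```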
{"title":"A deep data fusion-based reconstruction of water index time series for intermittent rivers and ephemeral streams monitoring","authors":"Junyuan Fei ,&nbsp;Xuan Zhang ,&nbsp;Chong Li ,&nbsp;Fanghua Hao ,&nbsp;Yahui Guo ,&nbsp;Yongshuo Fu","doi":"10.1016/j.isprsjprs.2024.12.015","DOIUrl":"10.1016/j.isprsjprs.2024.12.015","url":null,"abstract":"<div><div>Intermittent Rivers and Ephemeral Streams (IRES) are the major sources of flowing water on Earth. Yet, their dynamics are challenging for optical and radar satellites to monitor due to the heavy cloud cover and narrow water surfaces. The significant backscattering mechanism change and image mismatch further hinder the joint use of optical-SAR images in IRES monitoring. Here, a <strong>D</strong>eep data fusion-based <strong>R</strong>econstruction of the wide-accepted Modified Normalized Difference Water Index (MNDWI) time series is conducted for <strong>I</strong>RES <strong>M</strong>onitoring (DRIM). The study utilizes 3 categories of explanatory variables, i.e., the cross-orbits Sentinel-1 SAR for the continuous IRES observation, anchor data for the implicit co-registration, and auxiliary data that reflects the dynamics of IRES. A tight-coupled CNN-RNN architecture is designed to achieve pixel-level SAR-to-optical reconstruction under significant backscattering mechanism changes. The 10 m MNDWI time series with a 12-day interval is effectively regressed, <span><math><mrow><msup><mrow><mi>R</mi></mrow><mn>2</mn></msup></mrow></math></span> &gt; 0.80, on the experimental catchment. The comparison with the RF, RNN, and CNN methods affirms the advantage of the tight-coupled CNN-RNN system in the SAR-to-optical regression with the <span><math><mrow><msup><mrow><mi>R</mi></mrow><mn>2</mn></msup></mrow></math></span> increasing by 0.68 at least. The ablation test highlights the contributions of the Sentinel-1 to the precise MNDWI time series reconstruction, and the anchor and auxiliary data to the effective multi-source data fusion, respectively. The reconstructions highly match the observations of IRES with river widths ranging from 2 m to 300 m. Furthermore, the DRIM method shows excellent applicability, i.e., average <span><math><mrow><msup><mrow><mi>R</mi></mrow><mn>2</mn></msup></mrow></math></span> of 0.77, in IRES under polar, temperate, tropical, and arid climates. In conclusion, the proposed method is powerful in reconstructing the MNDWI time series of sub-pixel to multi-pixel scale IRES under the problem of backscattering mechanism change and image mismatch. The reconstructed MNDWI time series are essential for exploring the hydrological processes of IRES dynamics and optimizing water resource management at the basin scale.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 339-353"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142901851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Corrigendum to “Comparison of detectability of ship wake components between C-Band and X-Band synthetic aperture radar sensors operating under different slant ranges” [ISPRS J. Photogramm. Remote Sens. 196 (2023) 306-324]
IF 10.6 | Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-02-01 | DOI: 10.1016/j.isprsjprs.2025.01.026 | Vol. 220, p. 740
Björn Tings, Andrey Pleskachevsky, Stefan Wiehle
{"title":"Corrigendum to “Comparison of detectability of ship wake components between C-Band and X-Band synthetic aperture radar sensors operating under different slant ranges” [ISPRS J. Photogramm. Remote Sens. 196 (2023) 306-324]","authors":"Björn Tings,&nbsp;Andrey Pleskachevsky,&nbsp;Stefan Wiehle","doi":"10.1016/j.isprsjprs.2025.01.026","DOIUrl":"10.1016/j.isprsjprs.2025.01.026","url":null,"abstract":"","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Page 740"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143072526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Target-aware attentional network for rare class segmentation in large-scale LiDAR point clouds
IF 10.6 | Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-02-01 | DOI: 10.1016/j.isprsjprs.2024.11.012 | Vol. 220, pp. 32-50
Xinlong Zhang , Dong Lin , Uwe Soergel
Semantic interpretation of 3D scenes poses a formidable challenge in point cloud processing, which also stands as a requisite undertaking across various fields of application involving point clouds. Although a number of point cloud segmentation methods have achieved leading performance, 3D rare class segmentation continues to be a challenge owing to the imbalanced distribution of fine-grained classes and the complexity of large scenes. In this paper, we present the target-aware attentional network (TaaNet), a novel mask-constrained attention framework to address 3D semantic segmentation of imbalanced classes in large-scale point clouds. Adapting the self-attention mechanism, a hierarchical aggregation strategy is first applied to enhance the learning of point-wise features across various scales, which leverages both global and local perspectives to guarantee the presence of fine-grained patterns in the case of scenes with high complexity. Subsequently, rare target masks are imposed by a contextual module on the hierarchical features. Specifically, a target-aware aggregator is proposed to boost discriminative features of rare classes, which constrains hierarchical features with learnable adaptive weights and simultaneously embeds confidence constraints of rare classes. Furthermore, a target pseudo-labeling strategy based on strong contour cues of rare classes is designed, which effectively delivers instance-level supervisory signals restricted to rare targets only. We conducted thorough experiments on four multi-platform LiDAR benchmarks, i.e., airborne, mobile and terrestrial platforms, to assess the performance of our framework. Results demonstrate that compared to other commonly used advanced segmentation methods, our method can obtain not only high segmentation accuracy but also remarkable F1-scores in rare classes. In a submission to the official ranking page of the Hessigheim 3D benchmark, our approach achieves a state-of-the-art mean F1-score of 83.84% and an outstanding overall accuracy (OA) of 90.45%. In particular, the F1-scores of the rare classes, namely vehicles and chimneys, exceed the average of other published methods by a wide margin, improving by 32.00% and 32.46%, respectively. Additionally, extensive experimental analysis on benchmarks collected from multiple platforms, Paris-Lille-3D, Semantic3D and WHU-Urban3D, validates the robustness and effectiveness of the proposed method.
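Because rare-class F1-scores are the headline metric here, the short sketch below shows how per-class F1 can be computed from flattened segmentation labels; the class count and array sizes are illustrative.

```python
# Per-class F1 from flattened ground-truth and predicted labels.
import numpy as np

def per_class_f1(y_true, y_pred, n_classes):
    f1 = np.zeros(n_classes)
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1[c] = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return f1

y_true = np.random.randint(0, 5, 10_000)      # stand-in ground truth for 5 classes
y_pred = np.random.randint(0, 5, 10_000)      # stand-in predictions
print(per_class_f1(y_true, y_pred, n_classes=5))
```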
{"title":"Target-aware attentional network for rare class segmentation in large-scale LiDAR point clouds","authors":"Xinlong Zhang ,&nbsp;Dong Lin ,&nbsp;Uwe Soergel","doi":"10.1016/j.isprsjprs.2024.11.012","DOIUrl":"10.1016/j.isprsjprs.2024.11.012","url":null,"abstract":"<div><div>Semantic interpretation of 3D scenes poses a formidable challenge in point cloud processing, which also stands as a requisite undertaking across various fields of application involving point clouds. Although a number of point cloud segmentation methods have achieved leading performance, 3D rare class segmentation continues to be a challenge owing to the imbalanced distribution of fine-grained classes and the complexity of large scenes. In this paper, we present target-aware attentional network (TaaNet), a novel mask-constrained attention framework to address 3D semantic segmentation of imbalanced classes in large-scale point clouds. Adapting the self-attention mechanism, a hierarchical aggregation strategy is first applied to enhance the learning of point-wise features across various scales, which leverages both global and local perspectives to guarantee presence of fine-grained patterns in the case of scenes with high complexity. Subsequently, rare target masks are imposed by a contextual module on the hierarchical features. Specifically, a target-aware aggregator is proposed to boost discriminative features of rare classes, which constrains hierarchical features with learnable adaptive weights and simultaneously embeds confidence constraints of rare classes. Furthermore, a target pseudo-labeling strategy based on strong contour cues of rare classes is designed, which effectively delivers instance-level supervisory signals restricted to rare targets only. We conducted thorough experiments on four multi-platform LiDAR benchmarks, i.e., airborne, mobile and terrestrial platforms, to assess the performance of our framework. Results demonstrate that compared to other commonly used advanced segmentation methods, our method can obtain not only high segmentation accuracy but also remarkable F1-scores in rare classes. In a submission to the official ranking page of Hessigheim 3D benchmark, our approach achieves a state-of-the-art mean F1-score of 83.84% and an outstanding overall accuracy (OA) of 90.45%. In particular, the F1-scores of rare classes namely vehicles and chimneys notably exceed the average of other published methods by a wide margin, boosting by 32.00% and 32.46%, respectively. Additionally, extensive experimental analysis on benchmarks collected from multiple platforms, Paris-Lille-3D, Semantic3D and WHU-Urban3D, validates the robustness and effectiveness of the proposed method.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 32-50"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142789962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
National scale sub-meter mangrove mapping using an augmented border training sample method
IF 10.6 | Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-02-01 | DOI: 10.1016/j.isprsjprs.2024.12.009 | Vol. 220, pp. 156-171
Jinyan Tian , Le Wang , Chunyuan Diao , Yameng Zhang , Mingming Jia , Lin Zhu , Meng Xu , Xiaojuan Li , Huili Gong
This study presents the development of China’s first national-scale sub-meter mangrove map, addressing the need for high-resolution mapping to accurately delineate mangrove boundaries and identify fragmented patches. To overcome the current limitation of 10-m resolution, we developed a novel Semi-automatic Sub-meter Mapping Method (SSMM). The SSMM enhances the spectral separability of mangroves from other land covers by selecting nine critical features from both Sentinel-2 and Google Earth imagery. We also developed an innovative automated sample collection method to ensure ample and precise training samples, increasing sample density in areas susceptible to misclassification and reducing it in uniform regions. This method surpasses traditional uniform sampling in representing the national-scale study area. The classification is performed using a random forest classifier and is manually refined, culminating in the production of the pioneering Large-scale Sub-meter Mangrove Map (LSMM).
Our study showcases the LSMM’s superior performance over the established High-resolution Global Mangrove Forest (HGMF) map. The LSMM demonstrates enhanced classification accuracy, improved spatial delineation, and more precise area calculations, along with a robust framework of spatial analysis. Notably, compared to the HGMF, the LSMM achieves a 22.0 % increase in overall accuracy and a 0.27 improvement in the F1 score. In terms of mangrove coverage within China, the LSMM estimates a reduction of 4,345 ha (15.4 %), decreasing from 32,598 ha in the HGMF to 28,253 ha. This reduction is further underscored by a significant 61.7 % discrepancy in spatial distribution areas when compared to the HGMF, indicative of both commission and omission errors associated with the 10-m HGMF. Additionally, the LSMM identifies a fivefold increase in the number of mangrove patches, totaling 40,035, compared to the HGMF’s 7,784. These findings underscore the substantial improvements offered by sub-meter resolution products over those with a 10-m resolution. The LSMM and its automated mapping methodology establish new benchmarks for comprehensive, long-term mangrove mapping at sub-meter scales, as well as for the detailed mapping of extensive land cover types. Our study is expected to catalyze a shift toward high-resolution mangrove mapping on a large scale.
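A minimal sketch of the supervised-classification step described above: a random forest trained on per-pixel feature vectors (for example, the nine selected spectral features) and applied to new pixels. The augmented border sampling itself is not reproduced here, and all arrays are synthetic placeholders.

```python
# Random-forest pixel classification sketch (scikit-learn); features/labels are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_train = rng.random((5_000, 9))              # 9 features per training pixel (placeholder)
y_train = rng.integers(0, 2, 5_000)           # 1 = mangrove, 0 = other land cover

clf = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
clf.fit(X_train, y_train)

X_scene = rng.random((100_000, 9))            # flattened pixels of a new image tile
mangrove_prob = clf.predict_proba(X_scene)[:, 1]
mangrove_mask = mangrove_prob > 0.5
```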
{"title":"National scale sub-meter mangrove mapping using an augmented border training sample method","authors":"Jinyan Tian ,&nbsp;Le Wang ,&nbsp;Chunyuan Diao ,&nbsp;Yameng Zhang ,&nbsp;Mingming Jia ,&nbsp;Lin Zhu ,&nbsp;Meng Xu ,&nbsp;Xiaojuan Li ,&nbsp;Huili Gong","doi":"10.1016/j.isprsjprs.2024.12.009","DOIUrl":"10.1016/j.isprsjprs.2024.12.009","url":null,"abstract":"<div><div>This study presents the development of China’s first national-scale sub-meter mangrove map, addressing the need for high-resolution mapping to accurately delineate mangrove boundaries and identify fragmented patches. To overcome the current limitation of 10-m resolution, we developed a novel Semi-automatic Sub-meter Mapping Method (SSMM). The SSMM enhances the spectral separability of mangroves from other land covers by selecting nine critical features from both Sentinel-2 and Google Earth imagery. We also developed an innovative automated sample collection method to ensure ample and precise training samples, increasing sample density in areas susceptible to misclassification and reducing it in uniform regions. This method surpasses traditional uniform sampling in representing the national-scale study area. The classification is performed using a random forest classifier and is manually refined, culminating in the production of the pioneering Large-scale Sub-meter Mangrove Map (LSMM).</div><div>Our study showcases the LSMM’s superior performance over the established High-resolution Global Mangrove Forest (HGMF) map. The LSMM demonstrates enhanced classification accuracy, improved spatial delineation, and more precise area calculations, along with a robust framework of spatial analysis. Notably, compared to the HGMF, the LSMM achieves a 22.0 % increase in overall accuracy and a 0.27 improvement in the F1 score. In terms of mangrove coverage within China, the LSMM estimates a reduction of 4,345 ha (15.4 %), decreasing from 32,598 ha in the HGMF to 28,253 ha. This reduction is further underscored by a significant 61.7 % discrepancy in spatial distribution areas when compared to the HGMF, indicative of both commission and omission errors associated with the 10-m HGMF. Additionally, the LSMM identifies a fivefold increase in the number of mangrove patches, totaling 40,035, compared to the HGMF’s 7,784. These findings underscore the substantial improvements offered by sub-meter resolution products over those with a 10-m resolution. The LSMM and its automated mapping methodology establish new benchmarks for comprehensive, long-term mangrove mapping at sub-meter scales, as well as for the detailed mapping of extensive land cover types. Our study is expected to catalyze a shift toward high-resolution mangrove mapping on a large scale.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"220 ","pages":"Pages 156-171"},"PeriodicalIF":10.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142823149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0