
Latest Publications: ISPRS Journal of Photogrammetry and Remote Sensing

Explainable spatiotemporal deep learning for subseasonal super-resolution forecasting of Arctic sea ice concentration during the melting season
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-12-01 | DOI: 10.1016/j.isprsjprs.2025.11.027
Jianxin He , Yuxin Zhao , Shuo Yang , Woping Wu , Jian Wang , Xiong Deng
Accurate, high-resolution forecasting of Arctic sea-ice concentration (SIC) during the melting season is crucial for climate monitoring and polar navigation, yet remains hindered by the system’s complex, multi-scale, and cross-sphere dynamics. We present MSS-STFormer, an explainable multi-scale spatiotemporal Transformer designed for subseasonal SIC super-resolution forecasting. The model integrates 14 environmental factors spanning the ice, ocean, and atmosphere, and incorporates four specialized modules to enhance spatiotemporal representation and physical consistency. Trained with OSTIA satellite observations and ERA5 reanalysis data, MSS-STFormer achieves high forecasting skill over a 60-day horizon, yielding an RMSE of 0.049, a correlation of 0.9951, an SSIM of 0.9603, and a BACC of 0.9656. Post-hoc explainability methods (Gradient SHAP and LIME) reveal that the model captures a temporally evolving prediction mechanism: early forecasts are dominated by persistence of initial conditions, mid-term phases are governed by atmospheric dynamics such as wind and pressure, and later stages transition to a coupled influence of radiative and dynamic processes. This progression aligns closely with established thermodynamic and dynamic theories of sea-ice evolution, underscoring the model’s ability to identify physically meaningful drivers. The framework demonstrates strong potential for advancing explainable GeoAI in Earth observation, combining predictive accuracy with physical explainability for operational Arctic SIC monitoring and climate applications.
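The skill scores quoted above are standard field-comparison metrics. As a minimal sketch (not the authors' code), the snippet below computes them for a single forecast field, assuming SIC arrays scaled to [0, 1] and taking BACC as binary ice/no-ice accuracy at the common 15% concentration threshold, an assumption the abstract does not spell out.

```python
import numpy as np
from skimage.metrics import structural_similarity

def sic_metrics(pred, obs, ice_threshold=0.15):
    """RMSE, Pearson correlation, SSIM, and binary accuracy for one SIC field."""
    rmse = float(np.sqrt(np.mean((pred - obs) ** 2)))
    corr = float(np.corrcoef(pred.ravel(), obs.ravel())[0, 1])
    ssim = float(structural_similarity(pred, obs, data_range=1.0))
    # BACC here: agreement of ice/no-ice masks at the assumed 15% threshold
    bacc = float(np.mean((pred >= ice_threshold) == (obs >= ice_threshold)))
    return {"RMSE": rmse, "CORR": corr, "SSIM": ssim, "BACC": bacc}
```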
{"title":"Explainable spatiotemporal deep learning for subseasonal super-resolution forecasting of Arctic sea ice concentration during the melting season","authors":"Jianxin He ,&nbsp;Yuxin Zhao ,&nbsp;Shuo Yang ,&nbsp;Woping Wu ,&nbsp;Jian Wang ,&nbsp;Xiong Deng","doi":"10.1016/j.isprsjprs.2025.11.027","DOIUrl":"10.1016/j.isprsjprs.2025.11.027","url":null,"abstract":"<div><div>Accurate, high-resolution forecasting of Arctic sea-ice concentration (SIC) during the melting season is crucial for climate monitoring and polar navigation, yet remains hindered by the system’s complex, multi-scale, and cross-sphere dynamics. We present MSS-STFormer, an explainable multi-scale spatiotemporal Transformer designed for subseasonal SIC super-resolution forecasting. The model integrates 14 environmental factors spanning the ice, ocean, and atmosphere, and incorporates four specialized modules to enhance spatiotemporal representation and physical consistency. Trained with OSTIA satellite observations and ERA5 reanalysis data, MSS-STFormer achieves high forecasting skill over a 60-day horizon, yielding an RMSE of 0.049, a correlation of 0.9951, an SSIM of 0.9603, and a BACC of 0.9656. Post-hoc explainability methods, Gradient SHAP and LIME—reveal that the model captures a temporally evolving prediction mechanism: early forecasts are dominated by persistence of initial conditions, mid-term phases are governed by atmospheric dynamics such as wind and pressure, and later stages transition to a coupled influence of radiative and dynamic processes. This progression aligns closely with established thermodynamic and dynamic theories of sea-ice evolution, underscoring the model’s ability to identify physically meaningful drivers. The framework demonstrates strong potential for advancing explainable GeoAI in Earth observation, combining predictive accuracy with physical explainability for operational Arctic SIC monitoring and climate applications.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 1-17"},"PeriodicalIF":12.2,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145625097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
National mapping of wetland vegetation leaf area index in China using hybrid model with Sentinel-2 and Landsat-8 data
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-12-01 | DOI: 10.1016/j.isprsjprs.2025.11.031
Jianing Zhen , Dehua Mao , Yeqiao Wang , Junjie Wang , Chenwei Nie , Shiqi Huo , Hengxing Xiang , Yongxing Ren , Ling Luo , Zongming Wang
Leaf area index (LAI) of wetland vegetation provides vital information on its growth condition, structure, and functioning. Accurately mapping LAI at a broad scale is essential for the conservation and rehabilitation of wetland ecosystems. However, owing to the spatial complexity and periodic inundation of wetland vegetation, retrieving wetland LAI remains a challenging task with significant uncertainty. Here, using 865 in-situ measurements across different wetland biomes in China during 2013–2023, we proposed a hybrid strategy that incorporated an active learning (AL) technique, the physically-based PROSAIL-5B model, and the Random Forest machine learning algorithm to map LAI of wetland biomes across China from Sentinel-2 and Landsat-8 imagery. The validation results showed that the hybrid approach outperformed physically-based and empirically-based methods and achieved higher accuracy (R² increased by 0.15–0.40, RMSE decreased by 0.02–0.27, and RRMSE decreased by 3.37–12.78%). Additionally, three newly developed indices (TBVI5, TBVI3, and TBVI1) exhibited superior potential for LAI inversion across different types of wetland vegetation. Our maps exhibited fine spatial detail and consistency, and the Sentinel-2-based results matched in-situ observations more closely than those from Landsat-8 and other MODIS-based products. In this study, we developed the first national-scale mapping of wetland vegetation LAI in China. The findings offer insights into accurate retrieval of wetland vegetation LAI, providing valuable support for science-based wetland restoration and for assessing wetland responses to climate change.
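To make the hybrid idea concrete, here is a minimal sketch: train a Random Forest on radiative-transfer simulations, then invert LAI from reflectance. The `toy_prosail` function is a crude stand-in for PROSAIL-5B (swap in a real simulator in practice), and the authors' active-learning selection of training samples is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def toy_prosail(lai, n_bands=10):
    """Toy forward model: reflectance saturates with LAI (NOT the real RTM)."""
    asymptote = np.linspace(0.05, 0.45, n_bands)            # band-wise maxima
    refl = asymptote * (1.0 - np.exp(-0.6 * lai[:, None]))  # saturating response
    return refl + rng.normal(0.0, 0.005, refl.shape)        # sensor noise

lai_train = rng.uniform(0.0, 7.0, 5000)     # sampled canopy parameter: LAI
x_train = toy_prosail(lai_train)            # simulated band reflectances
rf = RandomForestRegressor(n_estimators=200, n_jobs=-1).fit(x_train, lai_train)

lai_test = rng.uniform(0.0, 7.0, 200)
print("toy R2:", rf.score(toy_prosail(lai_test), lai_test))
```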
{"title":"National mapping of wetland vegetation leaf area index in China using hybrid model with Sentinel-2 and Landsat-8 data","authors":"Jianing Zhen ,&nbsp;Dehua Mao ,&nbsp;Yeqiao Wang ,&nbsp;Junjie Wang ,&nbsp;Chenwei Nie ,&nbsp;Shiqi Huo ,&nbsp;Hengxing Xiang ,&nbsp;Yongxing Ren ,&nbsp;Ling Luo ,&nbsp;Zongming Wang","doi":"10.1016/j.isprsjprs.2025.11.031","DOIUrl":"10.1016/j.isprsjprs.2025.11.031","url":null,"abstract":"<div><div>Leaf area index (LAI) of wetland vegetation provides vital information for its growth condition, structure and functioning. Accurately mapping LAI at a broad scale is essential for conservation and rehabilitation of wetland ecosystem. However, owing to the spatial complexity and periodic inundation characteristics of wetland vegetation, retrieving LAI of wetlands remains a challenging task with significant uncertainty. Here, with 865 in-situ measurements across different wetland biomes in China during 2013–2023, we proposed a hybrid strategy that incorporated active learning (AL) technique, physically-based PROSAIL-5B model, and Random Forest machine learning algorithm to map wetland biomes LAI across China from Sentinel-2 and Landsat-8 imagery. The validation results showed that the hybrid approach outperformed physically-based and empirically-based methods and achieved higher accuracy (R<sup>2</sup> increased by 0.15–0.40, RMSE decreased by 0.02–0.27, and RRMSE reduced by 3.37–12.78 %). Additionally, three indices that we newly-developed (TBVI5, TBVI3 and TBVI1) exhibited superior potential for LAI inversion across different types of wetland vegetation. Our mapping results exhibited spatial details and consistency, and matched with in-situ observations from Sentinel-2 compared to Landsat-8 and the other MODIS-based products. In this study, we developed the first national-scale mapping of wetland vegetation LAI in China. The findings offer insights into accurate retrieval of LAI in wetland vegetation, providing valuable support for the scientific restoration of wetlands and assessing their responses to climate change.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 18-33"},"PeriodicalIF":12.2,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145658151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Self-supervised despeckling based solely on SAR intensity images: A general strategy
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-12-01 | DOI: 10.1016/j.isprsjprs.2025.11.025
Liang Chen , Yifei Yin , Hao Shi , Jingfei He , Wei Li
Speckle noise is inherent to the SAR imaging mechanism and degrades the quality of SAR images, making interpretation difficult. Hence, despeckling is an indispensable step in SAR pre-processing. Supervised learning (SL) has proven to be a powerful approach to SAR image despeckling. However, SL methods require both original SAR images and their speckle-free counterparts during training, whilst speckle-free SAR images do not exist in the real world. Even though there are several substitutes for speckle-free images, the domain gap leads to poor performance and adaptability. Self-supervision provides an approach to training without clean references. However, most self-supervised methods introduce additional requirements on speckle modeling or specific data, posing challenges in real-world applications. To address these challenges, we propose a general Self-supervised Despeckling Strategy for SAR images (SDS-SAR) that relies solely on speckled intensity data for training. Firstly, the theoretical feasibility of SAR image despeckling without speckle-free images is established, and a self-supervised despeckling criterion suitable for diverse SAR images is proposed. Subsequently, a Random-Aware sub-SAMpler with Projection correLation Estimation (RA-SAMPLE) is put forth, from which mutually independent training pairs can be derived from actual SAR intensity images. Furthermore, a multi-feature loss function is introduced, consisting of a despeckling term, a regularization term, and a perception term, which balances speckle suppression against texture preservation. Experiments reveal that the proposed method performs comparably to supervised approaches on synthetic data and outperforms them on actual data. Both visual and quantitative evaluations confirm its superiority over state-of-the-art despeckling techniques. Moreover, the results demonstrate that SDS-SAR provides a novel solution for noise suppression in other multiplicative coherent systems. The trained model and dataset will be available at https://github.com/YYF121/SDS-SAR.
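The core trick of training despeckling networks on speckled data alone is to derive two views of the same scene whose speckle realizations are approximately independent. A minimal sketch in that spirit follows, using random 2x2 sub-sampling (in the vein of Neighbor2Neighbor); the paper's RA-SAMPLE, with projection correlation estimation, is more elaborate. It assumes even image dimensions and spatially uncorrelated speckle.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_pair_subsample(intensity):
    """Split one speckled image into two half-resolution training views."""
    h, w = intensity.shape
    cells = intensity.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    cells = cells.reshape(h // 2, w // 2, 4)           # 4 pixels per 2x2 cell
    idx1 = rng.integers(0, 4, size=(h // 2, w // 2))   # first pick per cell
    shift = rng.integers(1, 4, size=idx1.shape)        # guarantees a different pick
    idx2 = (idx1 + shift) % 4
    rows, cols = np.indices(idx1.shape)
    return cells[rows, cols, idx1], cells[rows, cols, idx2]

noisy = rng.gamma(1.0, 1.0, (256, 256))          # single-look speckle stand-in
view_a, view_b = random_pair_subsample(noisy)    # train so denoise(view_a) ≈ view_b
```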
{"title":"Self-supervised despeckling based solely on SAR intensity images: A general strategy","authors":"Liang Chen ,&nbsp;Yifei Yin ,&nbsp;Hao Shi ,&nbsp;Jingfei He ,&nbsp;Wei Li","doi":"10.1016/j.isprsjprs.2025.11.025","DOIUrl":"10.1016/j.isprsjprs.2025.11.025","url":null,"abstract":"<div><div>Speckle noise is generated along with the SAR imaging mechanism and degrades the quality of SAR images, leading to difficult interpretation. Hence, despeckling is an indispensable step in SAR pre-processing. Fortunately, supervised learning (SL) has proven to be a progressive method for SAR image despeckling. SL methods necessitate the availability of both original SAR images and their speckle-free counterparts during training, whilst speckle-free SAR images do not exist in the real world. Even though there are several substitutes for speckle-free images, the domain gap leads to poor performance and adaptability. Self-supervision provides an approach to training without clean reference. However, most self-supervised methods introduce additional requirements on speckle modeling or specific data, posing challenges in real-world applications. To address these challenges, we propose a general Self-supervised Despeckling Strategy for SAR images (SDS-SAR) that relies solely on speckled intensity data for training. Firstly, the theoretical feasibility of SAR image despeckling without speckle-free images is established. A self-supervised despeckling criteria suitable for diverse SAR images is proposed. Subsequently, a Random-Aware sub-SAMpler with Projection correLation Estimation (RA-SAMPLE) is put forth. Mutually independent training pairs can be derived from actual SAR intensity images. Furthermore, a multi-feature loss function is introduced, consisting of a despeckling term, a regularization term, and a perception term. The performance of speckle suppression and texture preservation is well-balanced. Experiments reveal that the proposed method performs comparably to supervised approaches on synthetic data and outperforms them on actual data. Both visual and quantitative evaluations confirm its superiority over state-of-the-art despeckling techniques. Moreover, the results demonstrates that SDS-SAR provides a novel solution for noise suppression in other multiplicative coherent systems. The trained model and dataset will be available at <span><span>https://github.com/YYF121/SDS-SAR</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"231 ","pages":"Pages 854-873"},"PeriodicalIF":12.2,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145657555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Reducing semantic ambiguity in open-vocabulary remote sensing image segmentation via knowledge graph-enhanced class representations
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-11-30 | DOI: 10.1016/j.isprsjprs.2025.11.029
Wubiao Huang , Huchen Li , Shuai Zhang , Fei Deng
The open-vocabulary semantic segmentation (OVSS) task presents a new challenge for remote sensing image understanding by requiring the recognition of previously unseen or novel classes during inference. However, existing OVSS methods often suffer from severe semantic ambiguity in land cover classification due to inconsistent naming conventions, hierarchical dependencies, and insufficient semantic proximity in the embedding space. To address these issues, we propose KG-OVRSeg, a novel framework that mitigates semantic ambiguity by aggregating structured knowledge from a knowledge graph. This approach significantly enhances intra-class compactness and inter-class separability in the embedding space, thereby fundamentally improving class representations. We design a knowledge graph-enhanced class encoder (KGCE) that generates enriched class embeddings by querying hypernym–hyponym and synonym relationships within a localized knowledge graph. These enhanced embeddings are further utilized by a class attention gradual decoder (CAGD), which leverages a class-aware attention mechanism and guidance refinement to steer feature decoding. Extensive experiments on seven publicly available datasets demonstrated that KG-OVRSeg achieves state-of-the-art performance, with a mean mF1 of 51.65% and a mean mIoU of 39.18%, surpassing previous methods by 8.06% mF1 and 6.52% mIoU. Comprehensive ablation and visual analyses confirmed that KGCE significantly improves intra-class semantic compactness and inter-class separability in the embedding space, playing a crucial role in mitigating semantic inconsistency. Our work offers a robust and scalable solution for ambiguity-aware open-vocabulary tasks in remote sensing. The code is publicly available at https://github.com/HuangWBill/KG-OVRSeg.
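A minimal sketch of the KGCE idea, under stated assumptions: a class embedding is fused with the mean embedding of its synonyms and hypernyms from a small localized knowledge graph. The `encode_text` function is a deterministic toy stand-in for a real text encoder (e.g., CLIP's), and the tiny graph and equal-weight fusion are illustrative choices, not the paper's.

```python
import hashlib
import numpy as np

def encode_text(name, dim=64):
    """Toy deterministic text embedding; replace with a CLIP text encoder."""
    seed = int(hashlib.md5(name.encode()).hexdigest()[:8], 16)
    v = np.random.default_rng(seed).normal(size=dim)
    return v / np.linalg.norm(v)

KG = {  # localized knowledge graph: class -> related terms (illustrative)
    "lawn": {"synonyms": ["grass", "meadow"], "hypernyms": ["vegetation"]},
    "road": {"synonyms": ["street", "highway"], "hypernyms": ["impervious surface"]},
}

def kg_enhanced_embedding(cls, w_self=0.5, w_rel=0.5):
    """Fuse a class embedding with the mean of its KG neighbours."""
    rel = KG.get(cls, {})
    neighbours = rel.get("synonyms", []) + rel.get("hypernyms", [])
    e_self = encode_text(cls)
    if not neighbours:
        return e_self
    e_rel = np.mean([encode_text(t) for t in neighbours], axis=0)
    fused = w_self * e_self + w_rel * e_rel
    return fused / np.linalg.norm(fused)

print(kg_enhanced_embedding("lawn")[:4])
```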
{"title":"Reducing semantic ambiguity in open-vocabulary remote sensing image segmentation via knowledge graph-enhanced class representations","authors":"Wubiao Huang ,&nbsp;Huchen Li ,&nbsp;Shuai Zhang ,&nbsp;Fei Deng","doi":"10.1016/j.isprsjprs.2025.11.029","DOIUrl":"10.1016/j.isprsjprs.2025.11.029","url":null,"abstract":"<div><div>Open-vocabulary semantic segmentation (OVSS) task presents a new challenge for remote sensing image understanding by requiring the recognition of previously unseen or novel classes during inference. However, existing OVSS methods often suffer from severe semantic ambiguity in land cover classification due to inconsistent naming conventions, hierarchical dependency, and insufficient semantic proximity in the embedding space. To address these issues, we propose KG-OVRSeg, a novel framework that mitigates semantic ambiguity by aggregating structured knowledge from a knowledge graph. This approach significantly enhances intra-class compactness and inter-class separability in the embedding space, thereby fundamentally enhancing class representations. We design a knowledge graph-enhanced class encoder (KGCE) that generates enriched class embeddings by querying hypernym–hyponym and synonym relationships within a localized knowledge graph. These enhanced embeddings are further utilized by a class attention gradual decoder (CAGD), which leverages a class-aware attention mechanism and guidance refinement to guide feature decoding. Extensive experiments on seven publicly available datasets demonstrated that KG-OVRSeg achieves state-of-the-art performance, with a mean mF1 of 51.65% and a mean mIoU of 39.18%, surpassing previous methods by 8.06% mF1 and 6.52% mIoU. Comprehensive ablation and visual analyses confirmed that KGCE significantly improves intra-class semantic compactness and inter-class separability in the embedding space, playing a crucial role in mitigating semantic inconsistency. Our work offers a robust and scalable solution for ambiguity-aware open-vocabulary tasks in remote sensing. The code is publicly available at <span><span>https://github.com/HuangWBill/KG-OVRSeg</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"231 ","pages":"Pages 837-853"},"PeriodicalIF":12.2,"publicationDate":"2025-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145619489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
M3FNet: Multi-modal multi-temporal multi-scale data fusion network for tree species composition mapping
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-11-29 | DOI: 10.1016/j.isprsjprs.2025.11.026
Yuwei Cao , Nicholas C. Coops , Brent A. Murray , Ian Sinclair , Robere-McGugan Geordie
Accurate estimation and mapping of tree species composition (TSC) is crucial for sustainable forest management. Recent advances in Light Detection and Ranging (lidar) technology and the availability of moderate-spatial-resolution, surface reflectance time series passive optical imagery offer scalable and efficient approaches for automated TSC estimation. In this research we develop a novel deep learning framework, M3F-Net (Multi-modal, Multi-temporal, and Multi-scale Fusion Network), that integrates multi-temporal Sentinel-2 (S2) imagery and single photon lidar (SPL) data to estimate TSC for nine common species across the 630,000-hectare Romeo Malette Forest in Ontario, Canada. A dual-level alignment strategy combines (i) superpixel-based spatial aggregation to reconcile mismatched resolutions between high-resolution SPL point clouds (>25 pts/m²) and coarser S2 imagery (20 m), and (ii) a grid-based feature alignment that transforms unordered 3D point cloud features into structured 2D representations, enabling seamless integration of spectral and structural information. Within this aligned space, a multi-level Mamba-Fusion module jointly models multi-scale spatial patterns and seasonal dynamics through selective state-space modelling, efficiently capturing long-range dependencies while filtering redundant information. The framework achieves an R² score of 0.676, outperforming existing point cloud-based methods by 6% in TSC estimation. For leading species classification, our results are 6% better in terms of weighted F1, using either the TSC-based method or the standalone leading species classification method. Adding seasonal S2 imagery yielded a further 10% R² gain over the SPL-only mode. These results underscore the potential of fusing multi-modal and multi-temporal data with deep learning for scalable, highly accurate TSC estimation, offering a robust tool for large-scale management applications.
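A minimal sketch of grid-based feature alignment as described in (ii): per-point features are mean-pooled into 2D cells matching the 20 m S2 grid so that structural and spectral channels can be stacked. The coordinates, cell size, and feature dimension below are illustrative assumptions.

```python
import numpy as np

def points_to_grid(xy, feats, cell=20.0, grid_hw=(64, 64)):
    """Mean-pool per-point features (N, C) into an (H, W, C) raster."""
    h, w = grid_hw
    cols = np.clip((xy[:, 0] / cell).astype(int), 0, w - 1)
    rows = np.clip((xy[:, 1] / cell).astype(int), 0, h - 1)
    grid = np.zeros((h, w, feats.shape[1]))
    count = np.zeros((h, w, 1))
    np.add.at(grid, (rows, cols), feats)     # unbuffered scatter-add per cell
    np.add.at(count, (rows, cols), 1.0)
    return grid / np.maximum(count, 1.0)     # empty cells stay zero

rng = np.random.default_rng(0)
xy = rng.uniform(0, 64 * 20.0, (10000, 2))  # point positions in metres
feats = rng.normal(size=(10000, 8))         # e.g. height/intensity descriptors
raster = points_to_grid(xy, feats)          # ready to stack with S2 bands
```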
{"title":"M3FNet: Multi-modal multi-temporal multi-scale data fusion network for tree species composition mapping","authors":"Yuwei Cao ,&nbsp;Nicholas C. Coops ,&nbsp;Brent A. Murray ,&nbsp;Ian Sinclair ,&nbsp;Robere-McGugan Geordie","doi":"10.1016/j.isprsjprs.2025.11.026","DOIUrl":"10.1016/j.isprsjprs.2025.11.026","url":null,"abstract":"<div><div>Accurate estimation and mapping of <strong>t</strong>ree <strong>s</strong>pecies <strong>c</strong>omposition (TSC) is crucial for sustainable forest management. Recent advances in Light Detection and Ranging (lidar) technology and the availability of moderate spatial resolution, surface reflectance time series passive optical imagery offer scalable and efficient approaches for automated TSC estimation. In this research we develop a novel deep learning framework, M3F-Net (Multi-modal, Multi-temporal, and Multi-scale Fusion Network), that integrates multi-temporal Sentinel-2 (S2) imagery and single photon lidar (SPL) data to estimate TSC for nine common species across the 630,000-hectare Romeo Malette Forest in Ontario, Canada. A dual-level alignment strategy combines (i) superpixel-based spatial aggregation to reconcile mismatched resolutions between high-resolution SPL point clouds (&gt;25 pts/m<sup>2</sup>) and coarser S2 imagery (20 m), and (ii) a grid-based feature alignment that transforms unordered 3D point cloud features into structured 2D representations, enabling seamless integration of spectral and structural information. Within this aligned space, a multi-level Mamba-Fusion module jointly models multi-scale spatial patterns and seasonal dynamics through selective state-space modelling, efficiently capturing long-range dependencies while filtering redundant information. The framework achieves an R<sup>2</sup> score of 0.676, outperforming existing point cloud-based methods by 6% in TSC estimation. For leading species classification, our results are 6% better in terms of weighted F1, using either the TSC-based method or the standalone leading species classification method. Addition of seasonal S2 imagery added a 10% R<sup>2</sup> gain compared to the SPL-only mode. These results underscore the potential of fusing multi-modal and multi-temporal data with deep learning for scalable, high-accurate TSC estimation, offering a robust tool for large-scale management applications.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"231 ","pages":"Pages 797-814"},"PeriodicalIF":12.2,"publicationDate":"2025-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145613693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A survey of publicly available multi-temporal point cloud datasets
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-11-29 | DOI: 10.1016/j.isprsjprs.2025.11.003
Ole Wegen , Willy Scheibel , Rico Richter , Jürgen Döllner
Multi-temporal point clouds, which capture the same acquisition area at different points in time, enable change analysis and forecasting across various disciplines. Publicly available datasets play an important role in the development and evaluation of such approaches by enhancing comparability and reducing the effort required for data acquisition and preparation. However, identifying suitable datasets, assessing their characteristics, and comparing them with similar ones remain challenging and tedious due to the lack of a centralized distribution and documentation platform. In this paper, we provide a comprehensive overview of publicly available multi-temporal point cloud datasets. We evaluate each dataset across 30 different characteristics, grouped into six categories, and highlight current gaps and future challenges. Our analysis shows that, although many datasets are accompanied by extensive documentation, unclear usage terms and unreliable data hosting can limit their accessibility and adoption. In addition to clear correlations between application domains, acquisition methods, and captured scene types, there is also some overlap in point cloud requirements across domains. However, inconsistencies in file formats, data representations, and labeling practices hinder cross-domain and cross-application reuse. In the context of machine learning, we observe a positive trend towards more labeled datasets. Nevertheless, gaps remain due to limited coverage of natural environments and poor geographic diversity. Although there are already many positive examples of accessible datasets, future dataset publications would benefit from standardized review processes and a stronger focus on accessibility and usability across application areas.
{"title":"A survey of publicly available multi-temporal point cloud datasets","authors":"Ole Wegen ,&nbsp;Willy Scheibel ,&nbsp;Rico Richter ,&nbsp;Jürgen Döllner","doi":"10.1016/j.isprsjprs.2025.11.003","DOIUrl":"10.1016/j.isprsjprs.2025.11.003","url":null,"abstract":"<div><div>Multi-temporal point clouds, which capture the same acquisition area at different points in time, enable change analysis and forecasting across various disciplines. Publicly available datasets play an important role in the development and evaluation of such approaches by enhancing comparability and reducing the effort required for data acquisition and preparation. However, identifying suitable datasets, assessing their characteristics, and comparing them with similar ones remains challenging and tedious due to the lack of a centralized distribution and documentation platform. In this paper, we provide a comprehensive overview of publicly available multi-temporal point cloud datasets. We evaluate each dataset across 30 different characteristics, grouped into six categories, and highlight current gaps and future challenges. Our analysis shows that, although many datasets are accompanied by extensive documentation, unclear usage terms and unreliable data hosting can limit their accessibility and adoption. In addition to clear correlations between application domains, acquisition methods, and captured scene types, there is also some overlap in point cloud requirements across domains. However, inconsistencies in file formats, data representations, and labeling practices hinder cross-domain and cross-application reuse. In the context of machine learning, we observe a positive trend towards more labeled datasets. Nevertheless, gaps remain due to limited coverage of natural environments and poor geographic diversity. Although there are already many positive examples of accessible datasets, future dataset publications would benefit from standardized review processes and a stronger focus on accessibility and usability across application areas.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"231 ","pages":"Pages 815-836"},"PeriodicalIF":12.2,"publicationDate":"2025-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145613697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Beyond static imaging: A dynamic decision paradigm for robust array-SAR in diverse sensing scenarios
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-11-27 | DOI: 10.1016/j.isprsjprs.2025.11.023
Xiangdong Ma , Xu Zhan , Xiaoling Zhang , Yaping Wang , Jun Shi , Shunjun Wei , Tianjiao Zeng
Array synthetic aperture radar (array-SAR) is a popular radar imaging technique for 3D scene sensing, especially in urban areas. Recently, deep learning imaging methods have achieved significant advancements, showing promise for large-scale spatial sensing. However, current methods struggle to generalize because their imaging pipelines are static: key parameters are fixed after training, so performance degrades across varying noise levels, measurement models, and scene distributions. This critical gap remains insufficiently addressed. We address it by recasting array-SAR imaging as a dynamic Markov decision process and introduce a state-sequence-decision framework: a sequence of state transitions in which each state triggers learnable, decision-driven actions that adapt the step size, regularization threshold, and stopping criterion based on the evolving state. We have conducted extensive experiments across a wide range of noise conditions (0–10 dB), measurement models (from ground-based to airborne systems, with 10%–50% sampling ratios), and scene distributions in both near-field and far-field sensing scenarios. Across all these settings, the proposed method consistently outperforms representative baselines, achieving average gains of 5.1 dB in PSNR and 0.35 in SSIM, demonstrating strong robustness across diverse sensing environments.
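To illustrate the paradigm, here is a minimal sketch of an iterative-thresholding reconstruction whose step size, threshold, and stopping are adapted from the evolving state. The decision rule below is a hand-crafted heuristic standing in for the learned decisions in the paper, and `A` is a generic measurement matrix standing in for the array-SAR operator.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def dynamic_ista(A, y, n_iter=200, tol=1e-5):
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1 / squared spectral norm
    thresh = 0.1
    for k in range(n_iter):
        r = A @ x - y                           # state: current residual
        x_new = soft(x - step * (A.T @ r), thresh)
        if np.linalg.norm(A @ x_new - y) > np.linalg.norm(r):
            step *= 0.5                         # decision: backtrack on divergence
        thresh *= 0.98                          # decision: anneal the threshold
        if np.linalg.norm(x_new - x) < tol:     # decision: adaptive stopping
            return x_new, k
        x = x_new
    return x, n_iter

rng = np.random.default_rng(0)
A = rng.normal(size=(128, 256)) / np.sqrt(128)
x_true = np.zeros(256)
x_true[rng.choice(256, 10, replace=False)] = 1.0
x_hat, iters = dynamic_ista(A, A @ x_true)
print(iters, np.linalg.norm(x_hat - x_true))
```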
{"title":"Beyond static imaging: A dynamic decision paradigm for robust array-SAR in diverse sensing scenarios","authors":"Xiangdong Ma ,&nbsp;Xu Zhan ,&nbsp;Xiaoling Zhang ,&nbsp;Yaping Wang ,&nbsp;Jun Shi ,&nbsp;Shunjun Wei ,&nbsp;Tianjiao Zeng","doi":"10.1016/j.isprsjprs.2025.11.023","DOIUrl":"10.1016/j.isprsjprs.2025.11.023","url":null,"abstract":"<div><div>Array synthetic aperture radar (array-SAR) is a popular radar imaging technique for 3D scene sensing, especially for urban area. Recently, deep learning imaging methods have achieved significant advancements, showing promise for large-scale spatial sensing. However current methods struggle with generalization because their imaging pipelines are static—key parameters are fixed after training—so performance degrades across varying noise levels, measurement models, and scene distributions—a critical gap that remains insufficiently addressed. We address this by recasting array-SAR imaging as a dynamic Markov decision process. And we introduce a state–sequence–decision framework: a sequence of state transitions, where each state triggers learnable actions determined by decision that adapt step size, regularization threshold, and stopping based on the evolving state. We have conducted extensive experiments across a wide range of noise conditions (0–10 dB), measurement models (from ground-based to airborne systems, with 10%–50% sampling ratios), and scene distributions in both near-field and far-field sensing scenarios. Across all these settings, the proposed method consistently outperforms representative baselines, achieving average gains of 5.1 dB in PSNR and 0.35 in SSIM, demonstrating strong robustness across diverse sensing environments.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"231 ","pages":"Pages 778-796"},"PeriodicalIF":12.2,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145611833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
TasselNetV4: A vision foundation model for cross-scene, cross-scale, and cross-species plant counting
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-11-27 | DOI: 10.1016/j.isprsjprs.2025.11.017
Xiaonan Hu , Xuebing Li , Jinyu Xu , Abdulkadir Duran Adan , Letian Zhou , Xuhui Zhu , Yanan Li , Wei Guo , Shouyang Liu , Wenzhong Liu , Hao Lu
Accurate plant counting provides valuable information for agriculture, such as crop yield prediction, plant density assessment, and phenotype quantification. Vision-based approaches are currently the mainstream solution. Prior art typically uses a detection or a regression model to count a specific plant species. However, plants are biodiverse, and new cultivars are bred each year; it is practically impossible to build species-dependent counting models for them all. Inspired by class-agnostic counting (CAC) in computer vision, we argue that it is time to rethink the problem formulation of plant counting, from what plants to count to how to count plants. In contrast to most daily objects with spatial and temporal invariance, plants are dynamic, changing with time and space. Their non-rigid structure often leads to worse performance than counting rigid instances such as heads and cars, so current CAC and open-world detection models are suboptimal for counting plants. In this work, we extend the TasselNet line of plant counting models with TasselNetV4, shifting from species-specific counting to cross-species counting. TasselNetV4 marries the local counting idea of TasselNet with the extract-and-match paradigm of CAC. It builds upon a plain vision transformer and incorporates novel multi-branch box-aware local counters used to enhance cross-scale robustness. In particular, two challenging datasets, PAC-105 and PAC-Somalia, are collected. PAC-105 features 105 plant- and organ-level categories from 64 plant species, spanning various scenes. PAC-Somalia, built for out-of-distribution validation, features 32 unique plant species in Somalia. Extensive experiments against state-of-the-art CAC models show that TasselNetV4 achieves not only superior counting performance but also high efficiency, with a mean absolute error of 16.04, an R² of 0.92, and up to 121 FPS inference speed on images of 384 × 384 resolution. Our results indicate that TasselNetV4 emerges as a vision foundation model for cross-scene, cross-scale, and cross-species plant counting. To facilitate future plant counting research, we plan to release all the data, annotations, code, and pretrained models at https://github.com/tiny-smart/tasselnetv4.
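For readers unfamiliar with the local counting idea TasselNetV4 inherits, the sketch below shows its core accounting step under assumed settings: a model predicts one count per sliding local window, and the image total is recovered by normalizing for window overlap (the window size, stride, and the density-sum "predictor" are illustrative, not the paper's).

```python
import numpy as np

def total_from_local_counts(local_counts, win, stride):
    """Each entry counts objects in one win x win window sampled every `stride`
    pixels; dividing by the interior coverage (win/stride)^2 undoes overlap."""
    return local_counts.sum() / (win / stride) ** 2

# toy check: 25 point objects, local counts from summing a density map
rng = np.random.default_rng(0)
density = np.zeros((256, 256))
ys, xs = rng.integers(32, 224, 25), rng.integers(32, 224, 25)
np.add.at(density, (ys, xs), 1.0)
win, stride = 32, 8
local = np.array([[density[i:i + win, j:j + win].sum()
                   for j in range(0, 256 - win + 1, stride)]
                  for i in range(0, 256 - win + 1, stride)])
print(total_from_local_counts(local, win, stride))  # -> 25.0 (away from borders)
```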
{"title":"TasselNetV4: A vision foundation model for cross-scene, cross-scale, and cross-species plant counting","authors":"Xiaonan Hu ,&nbsp;Xuebing Li ,&nbsp;Jinyu Xu ,&nbsp;Abdulkadir Duran Adan ,&nbsp;Letian Zhou ,&nbsp;Xuhui Zhu ,&nbsp;Yanan Li ,&nbsp;Wei Guo ,&nbsp;Shouyang Liu ,&nbsp;Wenzhong Liu ,&nbsp;Hao Lu","doi":"10.1016/j.isprsjprs.2025.11.017","DOIUrl":"10.1016/j.isprsjprs.2025.11.017","url":null,"abstract":"<div><div>Accurate plant counting provides valuable information for agriculture such as crop yield prediction, plant density assessment, and phenotype quantification. Vision-based approaches are currently the mainstream solution. Prior art typically uses a detection or a regression model to count a specific plant. However, plants have biodiversity, and new cultivars are increasingly bred each year. It is almost impossible to exhaust and build all species-dependent counting models. Inspired by class-agnostic counting (CAC) in computer vision, we argue that it is time to rethink the problem formulation of plant counting, from what plants to count to how to count plants. In contrast to most daily objects with spatial and temporal invariance, plants are dynamic, changing with time and space. Their non-rigid structure often leads to worse performance than counting rigid instances like heads and cars such that current CAC and open-world detection models are suboptimal to count plants. In this work, we inherit the vein of the TasselNet plant counting model and introduce a new extension, TasselNetV4, shifting from species-specific counting to cross-species counting. TasselNetV4 marries the local counting idea of TasselNet with the extract-and-match paradigm in CAC. It builds upon a plain vision transformer and incorporates novel multi-branch box-aware local counters used to enhance cross-scale robustness. In particular, two challenging datasets, PAC-105 and PAC-Somalia, are harvested. PAC-105 features 105 plant- and organ-level categories from 64 plant species, spanning various scenes. PAC-Somalia, specific to out-of-distribution validation, features 32 unique plant species in Somalia. Extensive experiments against state-of-the-art CAC models show that TasselNetV4 achieves not only superior counting performance but also high efficiency, with a mean absolute error of 16.04, an <span><math><msup><mrow><mi>R</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span> of 0.92, and up to 121 FPS inference speed on images of 384 × 384 resolution. Our results indicate that TasselNetV4 emerges to be a vision foundation model for cross-scene, cross-scale, and cross-species plant counting. To facilitate future plant counting research, we plan to release all the data, annotations, code, and pretrained models at <span><span>https://github.com/tiny-smart/tasselnetv4</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"231 ","pages":"Pages 745-760"},"PeriodicalIF":12.2,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145609525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Spherical target eccentricity correction in photogrammetric applications
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-11-26 | DOI: 10.1016/j.isprsjprs.2025.11.022
F. Liebold, H.-G. Maas
The perspective projection of a sphere appears as an ellipse in the image, and the ellipse center differs from the projection of the sphere’s center. This eccentricity leads to systematic errors in photogrammetric measurements. For a sphere of 40 mm diameter on a plate, at 33 cm distance from the camera’s projection center and 15 cm from the nadir point, with a principal distance of 12 mm, the eccentricity in the image can exceed 20 μm. This publication deals with eccentricity correction terms that can be applied either to the measured image coordinates or through a model adaptation. An overview of existing correction terms in image space is provided, and a new extension of the pinhole camera model for spheres is proposed that can also be used simultaneously for sphere parameter determination. Furthermore, procedures are presented for estimating initial values of the sphere radius, the principal distance, and the principal point from the ellipse measurements in a single image. In experiments, the proposed methods reduced the reprojection error by a factor of three and achieved a relative scale accuracy of 0.2% to 0.3% when using known radii.
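Where the quoted 20 μm figure comes from can be reconstructed with elementary projective geometry; a minimal derivation sketch follows, reading the 33 cm as the slant distance L from the projection center to the sphere center and the optical axis as nadir-pointing, so that sin θ = 150/330 for the off-axis angle θ (this reconstruction is ours, not necessarily the paper's exact formulation).

```latex
% In the plane through the optical axis and the sphere centre, the two
% contour rays make the half-angle beta with the direction to the centre:
\[
  \beta = \arcsin\frac{r}{L}, \qquad x_{\pm} = c \tan(\theta \pm \beta).
\]
% The sphere outline is an ellipse whose axis along the radial direction has
% endpoints x_- and x_+, so the ellipse centre is their midpoint, while the
% sphere centre itself projects to c*tan(theta). The eccentricity is
\[
  e = \frac{x_{+} + x_{-}}{2} - c\tan\theta .
\]
```

With r = 20 mm, L = 330 mm, c = 12 mm, and sin θ = 150/330, this gives β ≈ 3.5°, θ ≈ 27.0°, and e ≈ 0.028 mm, consistent with the "more than 20 μm" stated above.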
{"title":"Spherical target eccentricity correction in photogrammetric applications","authors":"F. Liebold,&nbsp;H.-G. Maas","doi":"10.1016/j.isprsjprs.2025.11.022","DOIUrl":"10.1016/j.isprsjprs.2025.11.022","url":null,"abstract":"<div><div>The perspective projection of a sphere appears as an ellipse in the image where the ellipse center differs from the projection of the sphere’s center. This eccentricity leads to systematic errors in photogrammetric measurements. For a sphere of 40 mm diameter on a plate with 33 cm distance from the camera’s projection center, 15 cm distance to the nadir point and a principal distance of 12 mm, the eccentricity can reach more than 20<!--> <span><math><mi>μ</mi></math></span>m in the image. The publication at hand deals with eccentricity correction terms that can be applied either to the measured image coordinates or through a model adaption. An overview of existing correction terms in image space is provided and a new extension of the pinhole camera model for spheres is proposed which also can be used simultaneously for the sphere parameter determination. Furthermore, estimation procedures for the initial values of the sphere radius as well as the principal distance and the principal point from the ellipse measurements in a single image are presented. In experiments, the proposed methods reduced the reprojection error by a factor of three and achieved a relative scale accuracy of 0.2% to 0.3% when using known radii.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"231 ","pages":"Pages 761-777"},"PeriodicalIF":12.2,"publicationDate":"2025-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145598738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
ITS-Net: A platform and sensor agnostic 3D deep learning model for individual tree segmentation using aerial LiDAR data
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2025-11-26 | DOI: 10.1016/j.isprsjprs.2025.11.019
Bowen Li , Yong Pang , Daniel Kükenbrink , Luo Wang , Dan Kong , Mauro Marty
Recent advances in aerial Light Detection and Ranging (LiDAR) technologies have revolutionized the capability to characterize individual tree structure, enabling detailed ecological analyses at the tree level. A critical prerequisite for such analysis is accurate individual tree segmentation. However, this task remains challenging due to the complexity of forest environments and the varying quality of point clouds collected by diverse aerial sensors and platforms. Existing methods are mostly designed for a single aerial platform or sensor and struggle in complex forest environments. To address these limitations, we propose ITS-Net, a platform- and sensor-agnostic deep learning model for individual tree segmentation, which integrates three modules designed to enhance its learning capability in complex forest environments. To facilitate and evaluate its platform- and sensor-agnostic capabilities, we constructed AerialTrees, a comprehensive individual tree segmentation dataset that includes aerial LiDAR data collected at point densities ranging from 50 to 10,000 pts/m² using different sensors from ALS and ULS platforms over four climate zones, together with 2,903 manually labeled individual trees. ITS-Net outperformed state-of-the-art individual tree segmentation methods on AerialTrees, achieving the highest average performance with a detection rate of 94.8% and an F1-score of 90.9%. It achieved an F1-score of 88.1% when tested on the publicly available FOR-instance dataset. ITS-Net also performed better than the state-of-the-art ForAINet method for multi-layered canopy segmentation, outperforming the latter by 12.3% in detecting understory vegetation. When directly transferred to the five study sites of the FOR-instance dataset as well as study sites in Switzerland and Russia, ITS-Net produced accuracies reasonably close to those of several other algorithms trained on those study sites. These results were achieved without explicit data preprocessing to address differences in LiDAR data characteristics or fine-tuning of the deep learning model, demonstrating ITS-Net’s robustness for segmenting aerial LiDAR point clouds acquired with different sensors from different aerial platforms. As a sensor- and platform-agnostic method, ITS-Net may provide an end-to-end solution for bringing rapidly evolving aerial LiDAR technology into various forestry applications. The AerialTrees dataset developed through this study is a significant contribution to the very few publicly available labeled LiDAR datasets that are crucial for calibrating, testing, and benchmarking individual tree segmentation algorithms. Our code and data are available at: https://github.com/A8366233/AerialTrees.
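The detection rate and F1-score above presuppose a matching protocol between predicted and reference trees. As a hedged sketch of one common protocol (the paper's exact rules may differ), predictions are matched one-to-one to reference tree positions within a distance threshold, and precision, recall (detection rate), and F1 follow from the match count. Non-empty position arrays are assumed.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def tree_detection_scores(pred_xy, ref_xy, max_dist=2.0):
    """Precision, recall (detection rate), and F1 from optimal 1-to-1 matching
    of predicted to reference tree positions within `max_dist` metres."""
    d = np.linalg.norm(pred_xy[:, None, :] - ref_xy[None, :, :], axis=-1)
    row, col = linear_sum_assignment(d)          # minimum-cost assignment
    tp = int(np.sum(d[row, col] <= max_dist))    # matches within the threshold
    precision = tp / len(pred_xy)
    recall = tp / len(ref_xy)
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1
```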
{"title":"ITS-Net: A platform and sensor agnostic 3D deep learning model for individual tree segmentation using aerial LiDAR data","authors":"Bowen Li ,&nbsp;Yong Pang ,&nbsp;Daniel Kükenbrink ,&nbsp;Luo Wang ,&nbsp;Dan Kong ,&nbsp;Mauro Marty","doi":"10.1016/j.isprsjprs.2025.11.019","DOIUrl":"10.1016/j.isprsjprs.2025.11.019","url":null,"abstract":"<div><div>Recent advances in aerial Light Detection and Ranging (LiDAR) technologies have revolutionized the capability to characterize individual tree structure, enabling detailed ecological analyses at the tree level. A critical prerequisite for such analysis is an accurate individual tree segmentation. However, this task remains challenging due to the complexity of forest environments and varying quality of point clouds collected by diverse aerial sensors and platforms. Existing methods are mostly designed for a single aerial platform or sensor and struggle with complex forest environments. To address these limitations, we propose ITS-Net, an aerial platform and sensor-agnostic deep learning model for individual tree segmentation, which integrates three modules designed to enhance its learning capability under complex forest environments. To facilitate and evaluate its platform and sensor-agnostic capabilities, we constructed AerialTrees, a comprehensive individual tree segmentation dataset that included aerial LiDAR data collected with point densities ranging from 50 to 10,000 pts/m<sup>2</sup> using different sensors from ALS and ULS platforms over four climate zones. This dataset also included 2,903 individual trees that had been labeled manually. ITS-Net outperformed state-of-the-art individual tree segmentation methods on AerialTrees, achieving the highest average performance with a detection rate of 94.8 % and an F1-score of 90.9 %. It achieved an F1-score of 88.1 % when tested on the publicly available FOR-instance dataset. ITS-Net also performed better than the state-of-the-art ForAINet method for multi-layered canopy segmentation, outperforming the latter by 12.3 % in detecting understory vegetation. When directly transferred to the five study sites of the FOR-instance dataset as well as the study sites in Switzerland and Russia, ITS-Net produced accuracies that were reasonably close to those produced by several other algorithms trained over those study sites. These results were achieved without requiring efforts to address differences in LiDAR data characteristics through explicit data preprocessing or to fine tune the parameters of the deep learning model, demonstrating ITS-Net’s robustness for segmenting various aerial LiDAR point clouds acquired using different sensors from different aerial platforms. As a sensor and platform agnostic method, ITS-Net may provide an end-to-end solution needed to facilitate the use of rapidly evolving aerial LiDAR technology in various forestry applications. The AerialTrees dataset developed through this study is a significant contribution to the very few publicly available labeled LiDAR datasets that are crucial for calibrating, testing, and benchmarking individual tree segmentation algorithms. 
Our code and data are available at: <span><span>https://github.com/A8366233/AerialTrees</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"231 ","pages":"Pages 719-744"},"PeriodicalIF":12.2,"publicationDate":"2025-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145598737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0