
Latest articles from ISPRS Journal of Photogrammetry and Remote Sensing

Monitoring global power outages induced by tropical cyclones using nighttime light data
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-03-01 | Epub Date: 2026-02-04 | DOI: 10.1016/j.isprsjprs.2026.01.042
Liujun Zhu , Yaqian Li , Shanshui Yuan , Shi Shi , Fang Ji
Tropical cyclones (TCs) are among the most destructive natural hazards, frequently causing widespread power outages (POs) in coastal urban areas that disrupt economic activity and social stability. Quantifying TC-induced POs remains challenging due to limited outage data availability. This study presents the first global detection and quantification of TC-induced POs using NASA’s Black Marble nighttime lights (NTL) data. The proposed method analyzes changes in NTL brightness within urban agglomerations by establishing pre-TC baselines and applying statistical outlier detection to identify outages. The algorithm detected a total of 1,239 POs among 19,999 agglomeration-TC events between 2012 and 2023, and the corresponding outage duration and severity were also estimated. Validation against media reports showed an overall accuracy of 0.78, with accuracy improving with TC intensity. Case studies demonstrated robust performance in regions with vulnerable infrastructure and high-quality NTL observations, such as North America, while performance declined in areas affected by frequent data gaps or rapid restoration, notably East Asia and India. Although only 50% of agglomeration–TC events could be evaluated due to missing NTL data, this work offers a scalable, near-real-time approach to global TC-induced PO monitoring, providing critical insights for urban resilience planning, disaster response, and power system management.
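The baseline-and-outlier logic the abstract describes can be sketched as follows. The data, the z-score threshold, and the severity definition here are illustrative assumptions for a single urban agglomeration, not the paper's actual Black Marble processing chain:

```python
import numpy as np

def detect_outage(pre_tc_ntl, post_tc_ntl, z_thresh=-2.0):
    """Flag a power outage when post-TC nighttime-light radiance drops
    significantly below the pre-TC baseline distribution (hypothetical
    threshold; one value per usable night for one agglomeration)."""
    mu, sigma = np.mean(pre_tc_ntl), np.std(pre_tc_ntl)
    z = (post_tc_ntl - mu) / sigma        # standardized anomaly per night
    outage_nights = z < z_thresh          # statistical outlier test
    duration = int(outage_nights.sum())   # nights below threshold
    severity = float(-z[outage_nights].mean()) if duration else 0.0
    return duration, severity

# Synthetic example: a stable baseline, then a three-night brightness drop
baseline = np.array([50., 52., 49., 51., 50., 48., 51.])
post = np.array([20., 22., 30., 49., 50.])
duration, severity = detect_outage(baseline, post)
```

Running this on the synthetic series flags the first three post-TC nights as an outage; in practice cloud gaps would leave some nights unevaluated, which is the 50%-coverage limitation the abstract notes.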
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 233, Pages 437–451.
Citations: 0
SuperMapNet for long-range and high-accuracy vectorized HD map construction
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-03-01 | Epub Date: 2026-01-20 | DOI: 10.1016/j.isprsjprs.2026.01.023
Ruqin Zhou , Chenguang Dai , Wanshou Jiang , Yongsheng Zhang , Zhenchao Zhang , San Jiang
Vectorized high-definition (HD) map construction is formulated as the task of classifying and localizing typical map elements based on features in a bird’s-eye view (BEV). This is essential for autonomous driving systems, providing interpretable environmental structured representations for decision-making and planning. Remarkable progress has been made in recent years, but several major issues remain: (1) in the generation of BEV features, single-modality methods suffer from limited perception capability and range, while existing multi-modal fusion approaches underutilize cross-modal synergies and fail to resolve spatial disparities between modalities, resulting in misaligned BEV features with holes; (2) in the classification and localization of map elements, existing methods rely heavily on point-level modeling information while neglecting the information between elements and between point and element, leading to low accuracy with erroneous shapes and element entanglement. To address these limitations, we propose SuperMapNet, a multi-modal framework designed for long-range and high-accuracy vectorized HD map construction. This framework uses both camera images and LiDAR point clouds as input. It first tightly couples semantic information from camera images and geometric information from LiDAR point clouds via a cross-attention-based synergy enhancement module and a flow-based disparity alignment module for long-range BEV feature generation. Subsequently, local information acquired by point queries and global information acquired by element queries are tightly coupled through three-level interactions for high-accuracy classification and localization, where Point2Point interaction captures local geometric consistency between points of the same element, Element2Element interaction learns global semantic relationships between elements, and Point2Element interaction complements element information for its constituent points.
Experiments on the nuScenes and Argoverse2 datasets demonstrate high accuracy, surpassing previous state-of-the-art methods (SOTAs) by 14.9%/8.8% and 18.5%/3.1% mAP under the hard/easy settings, respectively, even over doubled perception ranges (up to 120 m in the X-axis and 60 m in the Y-axis). The code is made publicly available at https://github.com/zhouruqin/SuperMapNet.
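The cross-modal coupling idea, one modality's BEV queries attending to the other's features, can be illustrated with a minimal sketch. The shapes, the absence of learned projections, and the residual fusion below are simplifying assumptions, not SuperMapNet's actual modules:

```python
import numpy as np

def cross_attention(query, key_value):
    """Minimal scaled dot-product cross-attention: each query row forms a
    convex combination of key_value rows (no learned Q/K/V projections)."""
    d = query.shape[-1]
    scores = query @ key_value.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)            # rows sum to 1
    return w @ key_value

rng = np.random.default_rng(0)
cam_bev = rng.normal(size=(6, 8))    # semantic BEV features from camera images
lidar_bev = rng.normal(size=(6, 8))  # geometric BEV features from point clouds
fused = cam_bev + cross_attention(cam_bev, lidar_bev)  # residual fusion
```

Each fused cell thus mixes camera semantics with geometrically attended LiDAR evidence; the paper's synergy enhancement and disparity alignment modules build considerably more machinery on top of this primitive.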
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 233, Pages 89–103.
Citations: 0
Estimation of global riverine total phosphorus concentration based on multi-source data and stacked ensemble learning
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-03-01 | Epub Date: 2026-02-11 | DOI: 10.1016/j.isprsjprs.2026.01.041
Qi Li , Lan Zhang , Xi Chen , Chen Zhang , Jingyi Tian , Xianghan Sun , Liqiao Tian
Quantifying riverine total phosphorus (TP) concentration at the global scale using remote sensing remains challenging because TP is not optically active and its spatial variability is strongly regulated by hydrological and environmental processes. In this study, a global-scale dataset comprising 25,060 in situ TP measurements from 75 major river basins was used to examine how satellite-derived reflectance, river morphology, hydrological conditions, topography, and climate jointly constrain TP variability. The results demonstrate that integrating spectral and environmental predictors substantially improves the stability and transferability of TP estimation across heterogeneous river systems. Further improvements are achieved through stacked ensemble learning (R2 = 0.80, RMSE = 0.5204, MAE = 0.3692), which effectively leverages the complementary strengths of different learning algorithms in processing both optical and environmental information. The resulting global riverine TP distribution patterns exhibit coherent latitudinal and regional gradients associated with river size, climatic regimes, and anthropogenic pressure, supporting the physical consistency of the estimates. Model interpretation indicates that environmental factors such as elevation, river width, and discharge play key regulatory roles alongside spectral information. These findings demonstrate that integrating multi-source data and employing ensemble modeling approaches provides a viable pathway for large-scale estimation of non-optically active water quality parameters.
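The stacking idea, out-of-fold base predictions feeding a meta-learner, can be sketched on synthetic data. The predictor groups, the linear base models, and the fold scheme below are illustrative assumptions, not the study's actual learner configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in: sites with spectral and environmental predictors
# whose combination drives a TP-like response.
n = 200
spectral = rng.normal(size=(n, 3))   # e.g. band reflectances
environ = rng.normal(size=(n, 2))    # e.g. elevation, discharge
tp = spectral @ [0.5, -0.2, 0.1] + environ @ [0.3, 0.4] + rng.normal(0, 0.1, n)

def fit_lin(X, y):
    """Least-squares linear model with intercept."""
    Xb = np.c_[X, np.ones(len(X))]
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict_lin(w, X):
    return np.c_[X, np.ones(len(X))] @ w

# Level 0: base learners on complementary predictor groups; out-of-fold
# predictions keep the meta-learner from fitting leaked training outputs.
k = 5
folds = np.arange(n) % k
oof = np.zeros((n, 2))
for f in range(k):
    tr, te = folds != f, folds == f
    oof[te, 0] = predict_lin(fit_lin(spectral[tr], tp[tr]), spectral[te])
    oof[te, 1] = predict_lin(fit_lin(environ[tr], tp[tr]), environ[te])

# Level 1: the meta-learner stacks the base predictions.
stacked = predict_lin(fit_lin(oof, tp), oof)

def r2(y, yhat):
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)
```

Because the meta-learner can always reproduce either base model's predictions, the stacked R2 is never worse than the best single base learner here, which is the complementary-strengths argument the abstract makes.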
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 233, Pages 588–608.
Citations: 0
Knowledge distillation with spatial semantic enhancement for remote sensing object detection
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-03-01 | Epub Date: 2026-01-22 | DOI: 10.1016/j.isprsjprs.2026.01.017
Kai Hu , Jiaxin Li , Nan Ji , Xueshang Xiang , Kai Jiang , Xieping Gao
Knowledge distillation is extensively utilized in remote sensing object detection within resource-constrained environments. Among knowledge distillation methods, prediction imitation has garnered significant attention due to its ease of deployment. However, prevailing prediction imitation paradigms, which rely on an isolated, point-wise alignment of prediction scores, neglect the crucial spatial semantic information. This oversight is particularly detrimental in remote sensing images due to the abundance of objects with weak feature responses. To this end, we propose a novel Spatial Semantic Enhanced Knowledge Distillation framework, called S2EKD, for remote sensing object detection. Through two complementary modules, S2EKD shifts the focus of prediction imitation from matching isolated values to learning structured spatial semantic information. First, for classification distillation, we introduce a Weak-feature Response Enhancement Module, which models the structured spatial relationships between objects and their background to establish an initial perception of objects with weak feature responses. Second, to further capture more refined spatial information, we propose a Teacher Boundary Refinement Module for localization distillation. It provides robust boundary guidance by constructing a regression target enriched with more comprehensive spatial information. Furthermore, we introduce a Feature Mapping mechanism to ensure this spatial semantic knowledge is effectively utilized. Through extensive experiments on the DIOR and DOTA-v1.0 datasets, our method’s superiority is consistently demonstrated across diverse architectures, including both single-stage and two-stage detectors. The results show that our S2EKD achieves state-of-the-art results and, in some cases, even surpasses the performance of its teacher model. The code will be available soon.
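The point-wise prediction-imitation baseline that the abstract argues neglects spatial structure can be sketched as a temperature-scaled KL loss applied independently at each location; the logits and temperature below are illustrative, and S2EKD's spatial modules are deliberately not reproduced:

```python
import numpy as np

def softmax(z, t=1.0):
    z = z / t
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pointwise_kd_loss(teacher_logits, student_logits, t=2.0):
    """Point-wise prediction imitation: KL(teacher || student) computed
    at each location independently, then averaged -- note that no term
    couples different spatial positions."""
    p = softmax(teacher_logits, t)
    q = softmax(student_logits, t)
    kl = np.sum(p * (np.log(p) - np.log(q)), axis=-1)
    return float(t * t * kl.mean())

# Two spatial locations, three classes (hypothetical logits)
teacher = np.array([[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]])
loss_mismatch = pointwise_kd_loss(teacher, np.zeros_like(teacher))
loss_match = pointwise_kd_loss(teacher, teacher)
```

The loss vanishes when the student reproduces the teacher at every location, but it is blind to relationships between locations, which is exactly the gap the paper's spatial semantic enhancement targets.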
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 233, Pages 144–157.
Citations: 0
Roadside lidar-based scene understanding toward intelligent traffic perception: A comprehensive review
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-03-01 | Epub Date: 2026-01-20 | DOI: 10.1016/j.isprsjprs.2026.01.012
Jiaxing Zhang , Chengjun Ge , Wen Xiao , Miao Tang , Jon Mills , Benjamin Coifman , Nengcheng Chen
Urban transportation systems are undergoing a paradigm shift with the integration of high-precision sensing technologies and intelligent perception frameworks. Roadside lidar, as a key enabler of infrastructure-based sensing, offers robust and precise 3D spatial understanding of dynamic urban scenes. This paper presents a comprehensive review of roadside lidar-based traffic perception, structured around five key modules: sensor placement strategies; multi-lidar point cloud fusion; dynamic traffic information extraction; subsequent applications including trajectory prediction, collision risk assessment, and behavioral analysis; and representative roadside perception benchmark datasets. Despite notable progress, challenges remain in deployment optimization, robust registration under occlusion and dynamic conditions, generalizable object detection and tracking, and effective utilization of heterogeneous multi-modal data. Emerging trends point toward perception-driven infrastructure design, edge-cloud-terminal collaboration, and generalizable models enabled by domain adaptation, self-supervised learning, and foundation-scale datasets. This review aims to serve as a technical reference for researchers and practitioners, providing insights into current advances, open problems, and future directions in roadside lidar-based traffic perception and digital twin applications.
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 233, Pages 69–88.
Citations: 0
A geometric Cross-Propagation-Calibration method for SAR constellation based on the graph theory
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-03-01 | Epub Date: 2026-01-31 | DOI: 10.1016/j.isprsjprs.2026.01.007
Yitong Luo , Xiaolan Qiu , Bei Lin , Zekun Jiao , Wei Wang , Chibiao Ding
The networking capability of SAR constellations can effectively reduce the average revisit period, which has become a new trend in SAR Earth observation. However, the system electronic delay of several or even dozens of SAR satellites in a constellation must be calibrated and monitored over long periods to ensure high geometric accuracy of the product. In this paper, a geometric cross-propagation-calibration method for SAR constellations is proposed, which can calibrate the slant ranges of the SAR satellites in a constellation without any calibrators. The proposed method constructs a graph from all reference and uncalibrated SAR images involved in a cross-calibration task. For each uncalibrated image, the cumulative calibration error along paths originating from the reference images is estimated, enabling the identification of a path that minimizes this error. Cross-calibration is then performed sequentially along this optimal path. A closed-form expression is derived to estimate the cumulative calibration error along any path, which also reveals the underlying mechanism of error propagation in cross-calibration. Experiments based on real data show that the proposed method enables two Chinese microsatellites, Qilu-1 and Xingrui-9, to achieve geometric accuracy of better than 5 m after calibration.
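The path-selection step can be sketched with Dijkstra's algorithm, assuming (purely for illustration) that independent pairwise calibration-error variances add along a path; the paper's actual closed-form error expression is not reproduced here, and the graph below is hypothetical:

```python
import heapq

def min_error_paths(edges, refs):
    """Dijkstra over a cross-calibration graph. Nodes are SAR images, edge
    weight is the variance of the pairwise calibration error (assumed
    independent, so variances add along a path); refs are calibrated
    reference images with zero accumulated error."""
    graph = {}
    for u, v, var in edges:
        graph.setdefault(u, []).append((v, var))
        graph.setdefault(v, []).append((u, var))
    dist = {r: 0.0 for r in refs}
    prev = {}
    heap = [(0.0, r) for r in refs]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, var in graph.get(u, []):
            nd = d + var
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev

# Hypothetical four-image constellation; "A" is the calibrated reference
edges = [("A", "B", 0.4), ("A", "C", 1.0), ("B", "C", 0.3), ("C", "D", 0.2)]
dist, prev = min_error_paths(edges, refs=["A"])
```

Here image C is calibrated via A→B→C (accumulated variance 0.7) rather than directly from A (1.0), matching the sequential-propagation idea in the abstract.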
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 233, Pages 346–359.
Citations: 0
Adaptive image zoom-in with bounding box transformation for UAV object detection
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-03-01 | Epub Date: 2026-02-05 | DOI: 10.1016/j.isprsjprs.2026.01.036
Tao Wang , Chenyu Lin , Chenwei Tang , Jizhe Zhou , Deng Xiong , Jianan Li , Jian Zhao , Jiancheng Lv
Detecting objects from UAV-captured images is challenging due to the small object size. In this work, a simple and efficient adaptive zoom-in framework is explored for object detection on UAV images. The main motivation is that the foreground objects are generally smaller and sparser than those in common scene images, which hinders the optimization of effective object detectors. We thus aim to zoom in adaptively on the objects to better capture object features for the detection task. To achieve this goal, two core designs are required: (i) How to conduct non-uniform zooming on each image efficiently? (ii) How to enable object detection training and inference in the zoomed image space? Correspondingly, a lightweight offset prediction scheme coupled with a novel box-based zooming objective is introduced to learn non-uniform zooming of the input image. Based on the learned zooming transformation, a corner-aligned bounding box transformation method is proposed. The method warps the ground-truth bounding boxes to the zoomed space to learn object detection, and warps the predicted bounding boxes back to the original space during inference. We conduct extensive experiments on three representative UAV object detection datasets: VisDrone, UAVDT, and SeaDronesSee. The proposed ZoomDet is architecture-independent and can be applied to an arbitrary object detection architecture. Remarkably, on the SeaDronesSee dataset, ZoomDet delivers an absolute mAP gain of more than 8.4 points with a Faster R-CNN model, at only about 3 ms of additional latency. The code is available at https://github.com/twangnh/zoomdet_code.
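The corner-aligned box transformation can be sketched for a separable, piecewise-linear zoom map: warping a box means mapping its corners through the per-axis zoom, and an axis-aligned box stays axis-aligned. The control points below are hypothetical, and the learned offset prediction is not modeled:

```python
import numpy as np

# Hypothetical 1-D monotone zoom maps per axis (piecewise linear): the
# region around small objects is magnified, the rest compressed.
src_x = np.array([0., 40., 60., 100.])  # control points in the original image
dst_x = np.array([0., 20., 80., 100.])  # their positions in the zoomed image
src_y, dst_y = src_x, dst_x             # same map on both axes for brevity

def warp_boxes(boxes, forward=True):
    """Corner-aligned box transform for boxes given as (x1, y1, x2, y2):
    each corner coordinate is mapped through the monotone per-axis zoom
    (forward = original -> zoomed, inverse maps predictions back)."""
    sx, dx = (src_x, dst_x) if forward else (dst_x, src_x)
    sy, dy = (src_y, dst_y) if forward else (dst_y, src_y)
    out = boxes.astype(float).copy()
    out[:, [0, 2]] = np.interp(boxes[:, [0, 2]], sx, dx)  # x corners
    out[:, [1, 3]] = np.interp(boxes[:, [1, 3]], sy, dy)  # y corners
    return out

gt = np.array([[45., 45., 55., 55.]])         # small object in the magnified region
zoomed = warp_boxes(gt, forward=True)          # supervise the detector in zoomed space
restored = warp_boxes(zoomed, forward=False)   # map predictions back at inference
```

Under this map the 10-pixel-wide object becomes 30 pixels wide in the zoomed space, and the inverse warp recovers the original box exactly, mirroring the train-in-zoomed-space, predict-in-original-space loop the abstract describes.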
(ISPRS Journal of Photogrammetry and Remote Sensing, Volume 233, Pages 452–466.)
A weakly supervised approach for large-scale agricultural parcel extraction from VHR imagery via foundation models and adaptive noise correction
IF 12.2 | CAS Tier 1 (Earth Science) | JCR Q1 (Geography, Physical) | Pub Date: 2026-03-01 | Epub Date: 2026-01-23 | DOI: 10.1016/j.isprsjprs.2026.01.030
Wenpeng Zhao , Shanchuan Guo , Xueliang Zhang , Pengfei Tang , Xiaoquan Pan , Haowei Mu , Chenghan Yang , Zilong Xia , Zheng Wang , Jun Du , Peijun Du
Large-scale and fine-grained extraction of agricultural parcels from very-high-resolution (VHR) imagery is essential for precision agriculture. However, traditional parcel segmentation methods and fully supervised deep learning approaches typically face scalability constraints due to costly manual annotations, while extraction accuracy is generally limited by the inadequate capacity of segmentation architectures to represent complex agricultural scenes. To address these challenges, this study proposes a Weakly Supervised approach for agricultural Parcel Extraction (WSPE), which leverages publicly available 10 m resolution images and labels to guide the delineation of 0.5 m agricultural parcels. The WSPE framework integrates the tabular (Tabular Prior-data Fitted Network, TabPFN) and the vision foundation model (Segment Anything Model 2, SAM2) to initially generate pseudo-labels with high geometric precision. These pseudo-labels are further refined for semantic accuracy through an adaptive noisy label correction module based on curriculum learning. The refined knowledge is distilled into the proposed Triple-branch Kolmogorov-Arnold enhanced Boundary-aware Network (TKBNet), a prompt-free end-to-end architecture enabling rapid inference and scalable deployment, with outputs vectorized through post-processing. The effectiveness of WSPE was evaluated on a self-constructed dataset from nine agricultural zones in China, the public AI4Boundaries and FGFD datasets, and three large-scale regions: Zhoukou, Hengshui, and Fengcheng. Results demonstrate that WSPE and its integrated TKBNet achieve robust performance across datasets with diverse agricultural scenes, validated by extensive comparative and ablation experiments. The weakly supervised approach achieves 97.7 % of fully supervised performance, and large-scale deployment verifies its scalability and generalization, offering a practical solution for fine-grained, large-scale agricultural parcel mapping. 
Code is available at https://github.com/zhaowenpeng/WSPE.
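The curriculum-based adaptive noisy-label correction mentioned above can be sketched in miniature. This is a hedged illustration, not WSPE's actual module: the linear keep-fraction schedule and the function name are assumptions. The idea is to keep only the lowest-loss pseudo-labels early in training and gradually admit harder samples:

```python
import numpy as np

def curriculum_mask(losses, epoch, total_epochs, start_keep=0.5, end_keep=0.9):
    # Keep the lowest-loss fraction of pseudo-labeled samples; the kept
    # fraction grows linearly over training (easy samples first).
    frac = start_keep + (end_keep - start_keep) * epoch / max(total_epochs - 1, 1)
    threshold = np.quantile(losses, frac)
    return np.asarray(losses) <= threshold
```

Low-loss samples are the ones the current model already agrees with, so trusting them first is the standard small-loss heuristic for noisy labels; the widening schedule then lets harder (possibly noisier) pseudo-labels back in once the model has stabilized.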
(ISPRS Journal of Photogrammetry and Remote Sensing, Volume 233, Pages 180–208.)
Multispectral airborne laser scanning for tree species classification: A benchmark of machine learning and deep learning algorithms
IF 12.2 | CAS Tier 1 (Earth Science) | JCR Q1 (Geography, Physical) | Pub Date: 2026-03-01 | Epub Date: 2026-01-27 | DOI: 10.1016/j.isprsjprs.2026.01.031
Josef Taher , Eric Hyyppä , Matti Hyyppä , Klaara Salolahti , Xiaowei Yu , Leena Matikainen , Antero Kukko , Matti Lehtomäki , Harri Kaartinen , Sopitta Thurachen , Paula Litkey , Ville Luoma , Markus Holopainen , Gefei Kong , Hongchao Fan , Petri Rönnholm , Matti Vaaja , Antti Polvivaara , Samuli Junttila , Mikko Vastaranta , Juha Hyyppä
Climate-smart and biodiversity-preserving forestry demands precise information on forest resources, extending to the individual tree level. Multispectral airborne laser scanning (ALS) has shown promise in automated point cloud processing, but challenges remain in leveraging deep learning techniques and identifying rare tree species in class-imbalanced datasets. This study addresses these gaps by conducting a comprehensive benchmark of deep learning and traditional shallow machine learning methods for tree species classification. For the study, we collected high-density multispectral ALS data (>1000 pts/m²) at three wavelengths using the FGI-developed HeliALS system, complemented by existing Optech Titan data (35 pts/m²), to evaluate the species classification accuracy of various algorithms in a peri-urban study area located in southern Finland. We established a field reference dataset of 6326 segments across nine species using a newly developed browser-based crowdsourcing tool, which facilitated efficient data annotation. The ALS data, including a training dataset of 1065 segments, was shared with the scientific community to foster collaborative research and diverse algorithmic contributions. Based on 5261 test segments, our findings demonstrate that point-based deep learning methods, particularly a point transformer model, outperformed traditional machine learning and image-based deep learning approaches on high-density multispectral point clouds. For the high-density ALS dataset, a point transformer model provided the best performance, reaching an overall (macro-average) accuracy of 87.9% (74.5%) with a training set of 1065 segments and 92.0% (85.1%) with a larger training set of 5000 segments. With 1065 training segments, the best image-based deep learning method, DetailView, reached an overall (macro-average) accuracy of 84.3% (63.9%), whereas a shallow random forest (RF) classifier achieved an overall (macro-average) accuracy of 83.2% (61.3%). For the sparser ALS dataset, an RF model topped the list with an overall (macro-average) accuracy of 79.9% (57.6%), closely followed by the point transformer at 79.6% (56.0%). Importantly, the overall classification accuracy of the point transformer model on the HeliALS data increased from 73.0% with no spectral information to 84.7% with single-channel reflectance, and to 87.9% with spectral information from all three channels. Furthermore, we studied the scaling of the classification accuracy as a function of point density and training set size using 5-fold cross-validation of our dataset. Based on our findings, multispectral information is especially beneficial for sparse point clouds with 1–50 pts/m². Moreover, we observed that the classification error follows a power law as a function of the training set size, with the error of the point transformer decreasing significantly faster than that of the RF as the training set grows.
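The power-law relationship between classification error and training set size reported in this abstract can be checked with a simple least-squares fit in log-log space. This is a generic sketch, not the authors' analysis code, and the function name is an assumption:

```python
import numpy as np

def fit_power_law(m, err):
    # Fit err ≈ a * m**(-b) by linear least squares in log-log space:
    # log(err) = log(a) - b * log(m).
    m = np.asarray(m, dtype=float)
    A = np.column_stack([np.ones_like(m), -np.log(m)])
    (log_a, b), *_ = np.linalg.lstsq(A, np.log(err), rcond=None)
    return np.exp(log_a), b
```

A larger fitted exponent b means the model's error falls faster as training data grows, which is the sense in which the abstract says the point transformer benefits more from additional segments than the RF.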
(ISPRS Journal of Photogrammetry and Remote Sensing, Volume 233, Pages 278–309.)
Knowledge-data-model-driven multimodal few-shot learning for hyperspectral fine classification: Generalization across sensor, category and scene
IF 12.2 | CAS Tier 1 (Earth Science) | JCR Q1 (Geography, Physical) | Pub Date: 2026-03-01 | Epub Date: 2026-02-11 | DOI: 10.1016/j.isprsjprs.2026.02.001
Qiqi Zhu, Mingzhen Xu, Rui Ma, Longli Ran, Jiayao Xue, Qingfeng Guan
Fine-grained land-cover mapping is crucial for accurately assessing environmental degradation and monitoring socioeconomic dynamics. Few-shot learning of hyperspectral images offers a promising solution in cases where sample collection is limited. However, previous studies, such as tree species mapping, typically use 1% or 0.5% of samples per class, yielding thousands of samples for common species but struggling to identify unseen or rare species (only one sample/shot) in real-world scenarios. Furthermore, inevitable cross-sensor, cross-category, and cross-scene variations significantly increase the occurrence of unseen or rare classes and the spectral heterogeneity within common land-cover types. To this end, we propose Knowing-Net, a knowledge-data-model-driven multimodal few-shot learning network, to bridge the application gap in fine-grained mapping of unseen or rare classes. In Knowing-Net, prior knowledge of the sensor, i.e., its spectral parameters, is leveraged to reconstruct cross-sensor hyperspectral images, mitigating heterogeneity in spectral responses across datasets and enabling cross-domain transfer across different sensors, scenes, and land-cover types. To bridge the gap in recognizing unseen classes, multimodal data, including textual descriptions and natural images of unseen classes, is embedded into the network to construct shared side information through modality-specific feature learning. By designing a cross-alignment mechanism for hyperspectral and multimodal information in a shared semantic space, distinct encoders are guided to produce consistent distributions for the same class across different modalities, reducing sample dependency and facilitating the identification of unseen or rare classes. Finally, inspired by the first law of geography, a sliding discriminant window is designed to incorporate spatial context, enhancing geographic interpretability and robustness to noise.
We evaluate Knowing-Net on five challenging airborne hyperspectral datasets with a fine-grained classification system, covering crop type, tree species, and similar urban land covers with varying materials. Extensive experiments on five datasets consistently demonstrate Knowing-net’s superiority over state-of-the-art methods in both mapping performance and cross-domain generalization. Notably, the unified framework achieves state-of-the-art results in one-shot learning and establishes a new paradigm in zero-shot classification for fine-grained land cover tasks. To the best of our knowledge, this is the first comprehensive generalization of FSL across sensor, category, and scene for hyperspectral image-based fine mapping.
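A sliding window that votes over a spatial neighborhood is one plausible reading of the spatial-context idea above. The sketch below is hypothetical, not Knowing-Net's actual discriminant module: it applies a simple majority filter to a per-pixel label map, which suppresses isolated misclassifications while preserving class boundaries:

```python
import numpy as np

def spatial_vote(labels, win=3):
    # Majority vote over a win x win neighborhood: nearby pixels tend to
    # share a class (Tobler's first law), so isolated noise is suppressed.
    assert win % 2 == 1, "window size must be odd"
    r = win // 2
    padded = np.pad(labels, r, mode="edge")
    out = np.empty_like(labels)
    h, w = labels.shape
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + win, j:j + win].ravel()
            out[i, j] = np.bincount(patch).argmax()
    return out
```

Edge padding keeps the output the same size as the input; a straight boundary between two classes survives the vote because each side still holds the local majority.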
(ISPRS Journal of Photogrammetry and Remote Sensing, Volume 233, Pages 623–650.)