
ISPRS Journal of Photogrammetry and Remote Sensing: Latest Articles

A weakly supervised approach for large-scale agricultural parcel extraction from VHR imagery via foundation models and adaptive noise correction
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-23 | DOI: 10.1016/j.isprsjprs.2026.01.030
Wenpeng Zhao, Shanchuan Guo, Xueliang Zhang, Pengfei Tang, Xiaoquan Pan, Haowei Mu, Chenghan Yang, Zilong Xia, Zheng Wang, Jun Du, Peijun Du
Large-scale and fine-grained extraction of agricultural parcels from very-high-resolution (VHR) imagery is essential for precision agriculture. However, traditional parcel segmentation methods and fully supervised deep learning approaches typically face scalability constraints due to costly manual annotations, while extraction accuracy is generally limited by the inadequate capacity of segmentation architectures to represent complex agricultural scenes. To address these challenges, this study proposes a Weakly Supervised approach for agricultural Parcel Extraction (WSPE), which leverages publicly available 10 m resolution images and labels to guide the delineation of 0.5 m agricultural parcels. The WSPE framework integrates a tabular foundation model (Tabular Prior-data Fitted Network, TabPFN) and a vision foundation model (Segment Anything Model 2, SAM2) to initially generate pseudo-labels with high geometric precision. These pseudo-labels are further refined for semantic accuracy through an adaptive noisy-label correction module based on curriculum learning. The refined knowledge is distilled into the proposed Triple-branch Kolmogorov-Arnold enhanced Boundary-aware Network (TKBNet), a prompt-free end-to-end architecture enabling rapid inference and scalable deployment, with outputs vectorized through post-processing. The effectiveness of WSPE was evaluated on a self-constructed dataset from nine agricultural zones in China, the public AI4Boundaries and FGFD datasets, and three large-scale regions: Zhoukou, Hengshui, and Fengcheng. Results demonstrate that WSPE and its integrated TKBNet achieve robust performance across datasets with diverse agricultural scenes, validated by extensive comparative and ablation experiments. The weakly supervised approach achieves 97.7% of fully supervised performance, and large-scale deployment verifies its scalability and generalization, offering a practical solution for fine-grained, large-scale agricultural parcel mapping. Code is available at https://github.com/zhaowenpeng/WSPE.
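The adaptive noisy-label correction idea can be pictured as curriculum-style small-loss selection: trust the pixels whose loss against the pseudo-labels is low, and widen that trusted set as training progresses. Below is a minimal PyTorch sketch of that concept; the function name, keep-rate schedule, and its endpoints are illustrative assumptions, not WSPE's actual module.

```python
import torch

def curriculum_keep_mask(per_pixel_loss, epoch, total_epochs,
                         start_keep=0.5, end_keep=0.9):
    # Small-loss selection: pixels whose loss against the pseudo-label falls
    # below a quantile threshold are trusted; the kept fraction grows from
    # start_keep to end_keep as pseudo-labels are progressively refined.
    # (Hypothetical schedule -- the paper's module adapts this differently.)
    keep = start_keep + (end_keep - start_keep) * min(epoch / total_epochs, 1.0)
    threshold = torch.quantile(per_pixel_loss.flatten(), keep)
    return per_pixel_loss <= threshold

# Toy usage: mask out likely-noisy pixels before averaging the loss.
loss_map = torch.rand(256, 256)  # per-pixel loss vs. pseudo-labels
mask = curriculum_keep_mask(loss_map, epoch=3, total_epochs=30)
print(float((loss_map * mask).sum() / mask.sum()))
```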
{"title":"A weakly supervised approach for large-scale agricultural parcel extraction from VHR imagery via foundation models and adaptive noise correction","authors":"Wenpeng Zhao ,&nbsp;Shanchuan Guo ,&nbsp;Xueliang Zhang ,&nbsp;Pengfei Tang ,&nbsp;Xiaoquan Pan ,&nbsp;Haowei Mu ,&nbsp;Chenghan Yang ,&nbsp;Zilong Xia ,&nbsp;Zheng Wang ,&nbsp;Jun Du ,&nbsp;Peijun Du","doi":"10.1016/j.isprsjprs.2026.01.030","DOIUrl":"10.1016/j.isprsjprs.2026.01.030","url":null,"abstract":"<div><div>Large-scale and fine-grained extraction of agricultural parcels from very-high-resolution (VHR) imagery is essential for precision agriculture. However, traditional parcel segmentation methods and fully supervised deep learning approaches typically face scalability constraints due to costly manual annotations, while extraction accuracy is generally limited by the inadequate capacity of segmentation architectures to represent complex agricultural scenes. To address these challenges, this study proposes a Weakly Supervised approach for agricultural Parcel Extraction (WSPE), which leverages publicly available 10 m resolution images and labels to guide the delineation of 0.5 m agricultural parcels. The WSPE framework integrates the tabular (Tabular Prior-data Fitted Network, TabPFN) and the vision foundation model (Segment Anything Model 2, SAM2) to initially generate pseudo-labels with high geometric precision. These pseudo-labels are further refined for semantic accuracy through an adaptive noisy label correction module based on curriculum learning. The refined knowledge is distilled into the proposed Triple-branch Kolmogorov-Arnold enhanced Boundary-aware Network (TKBNet), a prompt-free end-to-end architecture enabling rapid inference and scalable deployment, with outputs vectorized through post-processing. The effectiveness of WSPE was evaluated on a self-constructed dataset from nine agricultural zones in China, the public AI4Boundaries and FGFD datasets, and three large-scale regions: Zhoukou, Hengshui, and Fengcheng. Results demonstrate that WSPE and its integrated TKBNet achieve robust performance across datasets with diverse agricultural scenes, validated by extensive comparative and ablation experiments. The weakly supervised approach achieves 97.7 % of fully supervised performance, and large-scale deployment verifies its scalability and generalization, offering a practical solution for fine-grained, large-scale agricultural parcel mapping. Code is available at <span><span>https://github.com/zhaowenpeng/WSPE</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 180-208"},"PeriodicalIF":12.2,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Varying sensitivities of RED-NIR-based vegetation indices to the input reflectance affect the detected long-term trends
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-23 | DOI: 10.1016/j.isprsjprs.2026.01.028
Qing Tian, Hongxiao Jin, Rasmus Fensholt, Torbern Tagesson, Luwei Feng, Feng Tian
Widespread vegetation changes have been evidenced by satellite-observed long-term trends over decades in vegetation indices (VIs). However, many issues can affect the derived VI trends, among which the inherent difference between VIs calculated from the same input reflectance has not been investigated. Here, we compared global long-term trends in six widely used RED-NIR (near-infrared)-based VIs calculated from the MODIS nadir bidirectional reflectance distribution function (BRDF)-adjusted product (MCD43A4) during 2000–2023, including the normalized difference vegetation index (NDVI), kernel NDVI (kNDVI), 2-band enhanced vegetation index (EVI2), near-infrared reflectance of vegetation (NIRv), difference vegetation index (DVI), and plant phenology index (PPI). We identified two distinct groups of VIs, i.e., (1) NDVI and kNDVI, and (2) EVI2, NIRv, DVI, and PPI, which shared similar trends within the group but showed significant directional differences between groups in 17.4% of the studied area. Only 20.5% of the global land surface showed consistent trends. Based on radiative transfer modeling and remote sensing observations, we demonstrated that the two groups of VIs differed in their sensitivities to RED and NIR reflectance. These differences lead to inconsistent long-term trends arising from variations in vegetation type, mixed-pixel effects, saturation, and asynchronous changes in vegetation chlorophyll content and structural attributes. Comparisons with ground-observed leaf area index (LAI), flux tower gross primary productivity (GPP), and PhenoCam green chromatic coordinate (GCC) further revealed that the EVI2, NIRv, DVI, and PPI trends corresponded more closely with LAI and GPP trends, whereas the NDVI and kNDVI trends were more strongly associated with GCC trends. Our results highlight that long-term vegetation trends derived from different RED-NIR-based VIs must be interpreted by considering their intrinsic sensitivities to biophysical properties, which is essential for reliable assessments of vegetation dynamics.
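All six indices are simple functions of RED and NIR reflectance, which is what makes their differing sensitivities tractable to analyze. The numpy sketch below computes five of them from their standard published formulas (for kNDVI, the common kernel length-scale choice σ = 0.5(NIR + RED) reduces it to tanh(NDVI²)); PPI is omitted here because it requires additional canopy and sun-angle parameters.

```python
import numpy as np

def red_nir_indices(red, nir):
    """Five RED-NIR-based vegetation indices from surface reflectance."""
    red, nir = np.asarray(red, dtype=float), np.asarray(nir, dtype=float)
    ndvi = (nir - red) / (nir + red)
    kndvi = np.tanh(ndvi ** 2)            # kNDVI with sigma = 0.5 * (NIR + RED)
    evi2 = 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)
    nirv = ndvi * nir
    dvi = nir - red
    return {"NDVI": ndvi, "kNDVI": kndvi, "EVI2": evi2, "NIRv": nirv, "DVI": dvi}

# A dense-canopy pixel vs. a sparse one (illustrative reflectance values):
print(red_nir_indices(red=0.04, nir=0.45))
print(red_nir_indices(red=0.10, nir=0.25))
```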
{"title":"Varying sensitivities of RED-NIR-based vegetation indices to the input reflectance affect the detected long-term trends","authors":"Qing Tian ,&nbsp;Hongxiao Jin ,&nbsp;Rasmus Fensholt ,&nbsp;Torbern Tagesson ,&nbsp;Luwei Feng ,&nbsp;Feng Tian","doi":"10.1016/j.isprsjprs.2026.01.028","DOIUrl":"10.1016/j.isprsjprs.2026.01.028","url":null,"abstract":"<div><div>Widespread vegetation changes have been evidenced by satellite-observed long-term trends over decades in vegetation indices (VIs). However, many issues can affect the derived VIs trends, among which the inherent difference between VIs calculated from the same input reflectance has not been investigated. Here, we compared global long-term trends in six widely used RED-NIR (near-infrared)-based VIs calculated from the MODIS nadir bidirectional reflectance distribution function (BRDF) adjusted product (MCD43A4) during 2000–2023, including normalized difference vegetation index (NDVI), kernel NDVI (kNDVI), 2-band enhanced vegetation index (EVI2), near-infrared reflectance of vegetation (NIRv), difference vegetation index (DVI), and plant phenology index (PPI). We identified two distinct groups of VIs, i.e., (1) NDVI and kNDVI, and (2) EVI2, NIRv, DVI, and PPI, which shared similar trends within the group but showed significant directional differences between groups in 17.4% of the studied area. Only 20.5% of the global land surface showed consistent trends. Based on the radiation transfer model and remote sensing observations, we demonstrated that the two groups of VIs differed in their sensitivities to RED and NIR reflectance. These differences lead to inconsistent long-term trends arising from variations in vegetation type, mixed pixel effects, saturation, and asynchronous changes in vegetation chlorophyll content and structural attributes. Comparisons with ground-observed leaf area index (LAI), flux tower gross primary productivity (GPP), and PhenoCam green chromatic coordinate (GCC) further revealed that the EVI2, NIRv, DVI, and PPI trends corresponded more closely with LAI and GPP trends, whereas the NDVI and kNDVI trends were more strongly associated with GCC trends. Our results highlight that long-term vegetation trends derived from different RED–NIR-based VIs must be interpreted by considering their intrinsic sensitivities to biophysical properties, which is essential for reliable assessments of vegetation dynamics.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 247-265"},"PeriodicalIF":12.2,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Weak supervision makes strong details: fine-grained object recognition in remote sensing images via regional diffusion with VLM
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-23 | DOI: 10.1016/j.isprsjprs.2026.01.024
Liuqian Wang, Jing Zhang, Guangming Mi, Li Zhuo
Fine-grained object recognition (FGOR) is gaining increasing attention in automated remote sensing analysis and interpretation (RSAI). However, the full potential of FGOR in remote sensing images (RSIs) is still constrained by several key issues: the reliance on high-quality labeled data, the difficulty of reconstructing fine details in low-resolution images, and the limited robustness of FGOR models in distinguishing similar object categories. In response, we propose an automatic fine-grained object recognition network (AutoFGOR) that follows a hierarchical dual-pipeline architecture for object analysis at global and regional levels. Specifically, Pipeline I is a region detection network, which leverages a geometric invariance module for weakly supervised learning to improve the detection accuracy of sparsely labeled RSIs and extract category-free regions. Building on that, Pipeline II is regional diffusion with a vision language model (RD-VLM), which pioneers the combination of stable diffusion XL (SDXL) and the large language and vision assistant (LLaVA) through a specially designed adaptive resolution adaptor (ARA) for object-region super-resolution reconstruction, fundamentally solving the difficulties of feature extraction from low-quality regions and fine-grained feature mining. In addition, we introduce a winner-takes-all (WTA) strategy that utilizes a voting mechanism to enhance the reliability of fine-grained classification in complex scenes. Experimental results on the FAIR1M-v2.0, VEDAI, and HRSC2016 datasets demonstrate our AutoFGOR achieving 31.72%, 80.25%, and 88.05% mAP, respectively, with highly competitive performance. In addition, the ×4 reconstruction results achieve scores of 0.5275 and 0.8173 on the MANIQA and CLIP-IQA indicators, respectively. The code will be available on GitHub: https://github.com/BJUT-AIVBD/AutoFGOR.
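The winner-takes-all strategy amounts to hard voting across several predictions for the same object region. The numpy sketch below illustrates one such vote; the tie-breaking rule and the (K, C) probability layout are assumptions for illustration, not AutoFGOR's exact mechanism.

```python
import numpy as np

def winner_takes_all(prob_stack):
    """Hard vote over K per-model class-probability rows of shape (K, C)."""
    votes = np.argmax(prob_stack, axis=1)                  # one vote per model
    counts = np.bincount(votes, minlength=prob_stack.shape[1])
    winners = np.flatnonzero(counts == counts.max())
    if len(winners) == 1:
        return int(winners[0])
    # Tie-break by summed probability mass (hypothetical rule).
    return int(winners[np.argmax(prob_stack[:, winners].sum(axis=0))])

probs = np.array([[0.60, 0.30, 0.10],
                  [0.20, 0.50, 0.30],
                  [0.55, 0.35, 0.10]])
print(winner_takes_all(probs))   # class 0 wins with 2 of 3 votes
```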
{"title":"Weak supervision makes strong details: fine-grained object recognition in remote sensing images via regional diffusion with VLM","authors":"Liuqian Wang,&nbsp;Jing Zhang,&nbsp;Guangming Mi,&nbsp;Li Zhuo","doi":"10.1016/j.isprsjprs.2026.01.024","DOIUrl":"10.1016/j.isprsjprs.2026.01.024","url":null,"abstract":"<div><div>Fine-grained object recognition (FGOR) is gaining increasing attention in automated remote sensing analysis and interpretation (RSAI). However, the full potential of FGOR in remote sensing images (RSIs) is still constrained by several key issues: the reliance on high-quality labeled data, the difficulty of reconstructing fine details in low-resolution images, and the limited robustness of FGOR model for distinguishing similar object categories. In response, we propose an automatic fine-grained object recognition network (AutoFGOR) that follows a hierarchical dual-pipeline architecture for object analysis at global and regional levels. Specifically, Pipeline I: region detection network, which leverages geometric invariance module for weakly-supervised learning to improve the detection accuracy of sparsely labeled RSIs and extract category-free regions; and on top of that, Pipeline II: regional diffusion with vision language model (RD-VLM), which pioneers the combination of stable diffusion XL (SDXL) and large language and vision assistant (LLaVA) through a specially designed adaptive resolution adaptor (ARA) for object region super-resolution reconstruction, fundamentally solving the difficulties of feature extraction from low-quality regions and fine-grained feature mining. In addition, we introduce a winner-takes-all (WTA) strategy that utilizes a voting mechanism to enhance the reliability of fine-grained classification in complex scenes. Experimental results on FAIR1M-v2.0, VEDAI, and HRSC2016 datasets demonstrate our AutoFGOR achieving 31.72%, 80.25%, and 88.05% mAP, respectively, with highly competitive performance. In addition, the × 4 reconstruction results achieve scores of 0.5275 and 0.8173 on the MANIQA and CLIP-IQA indicators, respectively. <u>The code will be available on GitHub:</u> <span><span><u>https://github.com/BJUT-AIVBD/AutoFGOR</u></span><svg><path></path></svg></span><u>.</u></div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 231-246"},"PeriodicalIF":12.2,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Knowledge distillation with spatial semantic enhancement for remote sensing object detection
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-22 | DOI: 10.1016/j.isprsjprs.2026.01.017
Kai Hu, Jiaxin Li, Nan Ji, Xueshang Xiang, Kai Jiang, Xieping Gao
Knowledge distillation is extensively utilized in remote sensing object detection within resource-constrained environments. Among knowledge distillation methods, prediction imitation has garnered significant attention due to its ease of deployment. However, prevailing prediction imitation paradigms, which rely on an isolated, point-wise alignment of prediction scores, neglect the crucial spatial semantic information. This oversight is particularly detrimental in remote sensing images due to the abundance of objects with weak feature responses. To this end, we propose a novel Spatial Semantic Enhanced Knowledge Distillation framework, called S2EKD, for remote sensing object detection. Through two complementary modules, S2EKD shifts the focus of prediction imitation from matching isolated values to learning structured spatial semantic information. First, for classification distillation, we introduce a Weak-feature Response Enhancement Module, which models the structured spatial relationships between objects and their background to establish an initial perception of objects with weak feature responses. Second, to further capture more refined spatial information, we propose a Teacher Boundary Refinement Module for localization distillation. It provides robust boundary guidance by constructing a regression target enriched with more comprehensive spatial information. Furthermore, we introduce a Feature Mapping mechanism to ensure this spatial semantic knowledge is effectively utilized. Through extensive experiments on the DIOR and DOTA-v1.0 datasets, our method’s superiority is consistently demonstrated across diverse architectures, including both single-stage and two-stage detectors. The results show that our S2EKD achieves state-of-the-art results and, in some cases, even surpasses the performance of its teacher model. The code will be available soon.
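For context, the point-wise prediction-imitation baseline that the paper argues against fits in a few lines: every spatial location's student scores are pulled toward the temperature-softened teacher scores independently, with no notion of inter-object or point-element structure. A minimal PyTorch sketch, assuming (N, C, H, W) classification score maps:

```python
import torch
import torch.nn.functional as F

def pointwise_kd_loss(student_logits, teacher_logits, T=2.0):
    # Per-location KL divergence between temperature-softened distributions;
    # the T*T factor restores the gradient scale (Hinton et al., 2015).
    s = F.log_softmax(student_logits / T, dim=1)
    t = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(s, t, reduction="batchmean") * T * T

student = torch.randn(2, 20, 64, 64)   # toy detector head outputs
teacher = torch.randn(2, 20, 64, 64)
print(pointwise_kd_loss(student, teacher).item())
```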
{"title":"Knowledge distillation with spatial semantic enhancement for remote sensing object detection","authors":"Kai Hu ,&nbsp;Jiaxin Li ,&nbsp;Nan Ji ,&nbsp;Xueshang Xiang ,&nbsp;Kai Jiang ,&nbsp;Xieping Gao","doi":"10.1016/j.isprsjprs.2026.01.017","DOIUrl":"10.1016/j.isprsjprs.2026.01.017","url":null,"abstract":"<div><div>Knowledge distillation is extensively utilized in remote sensing object detection within resource-constrained environments. Among knowledge distillation methods, prediction imitation has garnered significant attention due to its ease of deployment. However, prevailing prediction imitation paradigms, which rely on an isolated, point-wise alignment of prediction scores, neglect the crucial spatial semantic information. This oversight is particularly detrimental in remote sensing images due to the abundance of objects with weak feature responses. To this end, we propose a novel Spatial Semantic Enhanced Knowledge Distillation framework, called <span><math><msup><mrow><mi>S</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span><em>EKD</em>, for remote sensing object detection. Through two complementary modules, <span><math><msup><mrow><mi>S</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span><em>EKD</em> shifts the focus of prediction imitation from matching isolated values to learning structured spatial semantic information. First, for classification distillation, we introduce a Weak-feature Response Enhancement Module, which models the structured spatial relationships between objects and their background to establish an initial perception of objects with weak feature responses. Second, to further capture more refined spatial information, we propose a Teacher Boundary Refinement Module for localization distillation. It provides robust boundary guidance by constructing a regression target enriched with more comprehensive spatial information. Furthermore, we introduce a Feature Mapping mechanism to ensure this spatial semantic knowledge is effectively utilized. Through extensive experiments on the DIOR and DOTA-v1.0 datasets, our method’s superiority is consistently demonstrated across diverse architectures, including both single-stage and two-stage detectors. The results show that our <span><math><msup><mrow><mi>S</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span><em>EKD</em> achieves state-of-the-art results and, in some cases, even surpasses the performance of its teacher model. The code will be available soon.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 144-157"},"PeriodicalIF":12.2,"publicationDate":"2026-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Identifying green leaf and leaf phenology of large trees and forests by time series PlanetScope and Sentinel-2 images and the chlorophyll and green leaf indicator (CGLI)
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-22 | DOI: 10.1016/j.isprsjprs.2026.01.027
Baihong Pan, Xiangming Xiao, Li Pan, Andrew D Richardson, Yujie Liu, Yuan Yao, Cheng Meng, Yanhua Xie, Chenchen Zhang, Yuanwei Qin
Plant phenology serves as a vital indicator of plant responses to climate variation and change. To date, our knowledge and data products of plant leaf phenology at the scales of large trees and forest stands are very limited, in part due to the lack of time series image data at very high spatial resolution (VHSR, meters). Here, we investigated surface reflectance (BLUE, GREEN, RED) and vegetation indices over a large cottonwood tree, using images from PlanetScope (daily, 3-m) and Sentinel-2A/B (5-day, 10-m) in 2023 and in-situ field photos. At the leaf scale, a green leaf has a spectral signature of BLUE < GREEN > RED, as chlorophyll pigment absorbs more blue and red light than green light, which is named the chlorophyll and green leaf indicator (CGLI); a dead leaf has BLUE < GREEN < RED. At the tree scale, a tree with only branches and trunk (no green leaves) has BLUE < GREEN < RED, while a tree with green leaves has BLUE < GREEN > RED. We evaluated the start of season (SOS) and end of season (EOS) of the cottonwood tree, derived from (1) vegetation index (VI) data with three methods (VI-slope-, VI-ratio-, and VI-threshold-based methods) and (2) surface reflectance data with the CGLI-based method. To evaluate the broader applicability of the CGLI-based method, we applied the same workflow to five deciduous broadleaf forest sites within the National Ecological Observatory Network, equipped with PhenoCams. At these five sites, we compared phenology metrics (SOS, EOS) derived from the VI- and CGLI-based methods with reference dates derived from PhenoCam green chromatic coordinate (GCC) data. Results show that the CGLI-based method, which classifies each observation as either green leaf or non-green leaf/canopy (binary), is simple and effective in delineating leaf/canopy dynamics and phenology metrics. These findings provide a foundation for monitoring leaf phenology of large trees using satellite data.
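Because the CGLI decision is just the spectral ordering stated above, it reduces to two comparisons per observation. A minimal numpy sketch (the reflectance values in the example are made up for illustration):

```python
import numpy as np

def cgli_green_leaf(blue, green, red):
    """True where BLUE < GREEN > RED, i.e. a green-leaf observation."""
    blue, green, red = map(np.asarray, (blue, green, red))
    return (blue < green) & (green > red)

# Toy reflectance time series for one tree pixel across a season:
blue  = np.array([0.06, 0.05, 0.04, 0.06])
green = np.array([0.08, 0.10, 0.12, 0.08])
red   = np.array([0.10, 0.07, 0.05, 0.09])
flags = cgli_green_leaf(blue, green, red)
print(flags)                          # [False  True  True False]
# SOS/EOS fall at the first/last True in the binary green-leaf series.
print(np.flatnonzero(flags)[[0, -1]])
```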
{"title":"Identifying green leaf and leaf phenology of large trees and forests by time series PlanetScope and Sentinel-2 images and the chlorophyll and green leaf indicator (CGLI)","authors":"Baihong Pan ,&nbsp;Xiangming Xiao ,&nbsp;Li Pan ,&nbsp;Andrew D Richardson ,&nbsp;Yujie Liu ,&nbsp;Yuan Yao ,&nbsp;Cheng Meng ,&nbsp;Yanhua Xie ,&nbsp;Chenchen Zhang ,&nbsp;Yuanwei Qin","doi":"10.1016/j.isprsjprs.2026.01.027","DOIUrl":"10.1016/j.isprsjprs.2026.01.027","url":null,"abstract":"<div><div>Plant phenology serves as a vital indicator of plant’s response to climate variation and change. To date, our knowledge and data products of plant leaf phenology at the scales of large trees and forest stand are very limited, in part due to the lack of time series image data at very high spatial resolution (VHSR, meters). Here, we investigated surface reflectance (BLUE, GREEN, RED) and vegetation indices over a large cottonwood tree, using images from PlanetScope (daily, 3-m) and Sentinel-2A/B (5-day, 10-m) in 2023 and in-situ field photos. At the leaf scale, a green leaf has a spectral signature of BLUE &lt; GREEN &gt; RED, as chlorophyll pigment absorbs more blue and red light than green light, which is named as chlorophyll and green leaf indicator (CGLI); and a dead leaf has BLUE &lt; GREEN &lt; RED. At the tree scale, tree with only branches and trunk (no green leaves) has BLUE &lt; GREEN &lt; RED, while tree with green leaves has BLUE &lt; GREEN &gt; RED. We evaluated the start of season (SOS) and end of season (EOS) of the cottonwood tree, derived from (1) vegetation index (VI) data with three methods (VI-slope-, VI-ratio-, and VI-threshold-based methods) and (2) surface reflectance data with CGLI-based method. To evaluate broader applicability of the CGLI-based method, we applied the same workflow to five deciduous broadleaf forest sites within the National Ecological Observatory Network, equipped with PhenoCam. At these five sites, we compared phenology metrics (SOS, EOS) derived from VI- and CGLI-based methods with reference dates derived from PhenoCam Green Chromatic Coordinate (GCC) data. Results show that the CGLI-based method, which classifies each observation as either green leaf or non-green leaf/canopy (binary), is simple and effective in delineating leaf/canopy dynamics and phenology metrics. These findings provide a foundation for monitoring leaf phenology of large trees using satellite data.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 104-125"},"PeriodicalIF":12.2,"publicationDate":"2026-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
PANet: A multi-scale temporal decoupling network and its high-resolution benchmark dataset for detecting pseudo changes in cropland non-agriculturalization
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-22 | DOI: 10.1016/j.isprsjprs.2026.01.029
Songman Sui, Jixian Zhang, Haiyan Gu, Yue Chang
Cropland non-agriculturalization (CNA) refers to the conversion of cropland into non-agricultural land such as construction land or ponds, posing threats to food security and ecological balance. Remote sensing technology enables precise monitoring of this process, but bi-temporal methods are susceptible to errors caused by seasonal spectral fluctuations, weather interference, and imaging discrepancies, often leading to false detections. Existing methods, which lack support from temporal datasets, struggle to disentangle the spectral confusion of gradual non-agriculturalization and short-term disturbances, thereby limiting the accuracy of dynamic cropland resource monitoring. To address this issue, a novel phenology-aware temporal change detection network (PANet) is proposed to solve the misclassification challenges in CNA detection caused by the “same object with different spectra” and “different objects with similar spectra” issues. A phenology-aware module (PATM) is designed, leveraging a dual-driven decoupling model to dynamically weight phenology-sensitive periods and adaptively represent non-uniform temporal intervals. Through a time-aligned feature enhancement strategy and dual-driven (intra-annual/inter-annual) temporal decay functions, PANet simultaneously focuses on short-term anomalies and robustly models long-term trends. Additionally, a sample balance adjustment module (DFBL) is developed to mitigate the impact of sample imbalance by incorporating prior knowledge of changes and dynamic adjustment factors, enhancing the model’s sensitivity to non-agriculturalization changes. Furthermore, the first high-resolution CNA dataset based on actual production data is constructed, containing 1295 pairs of 512 × 512 masked images. Compared to existing datasets, this dataset offers extensive temporal coverage, capturing comprehensive seasonal periodic characteristics of cropland. Comparative experiments with several classical time-series methods and bi-temporal methods validate the effectiveness of PANet. Experimental results on the LHCD dataset demonstrate that PANet achieves the highest F1 scores (61.01% and 61.70%). PANet accurately captures CNA information, making it vital for the scientific management and sustainable utilization of limited cropland resources. The LHCD dataset can be downloaded from https://github.com/mss-s/LHCD.
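The dual-driven decay idea can be illustrated by weighting each past observation with a fast intra-annual term and a slow inter-annual term. The sketch below is a hypothetical parameterization of that concept only; PANet's actual decay functions and coefficients are not given in the abstract.

```python
import numpy as np

def dual_temporal_weights(dt_days, alpha_intra=0.02, alpha_inter=0.001):
    # Fast decay on the within-year offset, slow decay on elapsed years,
    # so recent same-season observations dominate while long-term context
    # is retained. (Illustrative coefficients, not PANet's.)
    dt_days = np.asarray(dt_days, dtype=float)
    years, intra = dt_days // 365, dt_days % 365
    return np.exp(-alpha_intra * intra) * np.exp(-alpha_inter * 365 * years)

# Observations 10 days, 180 days, and ~2 years before the target date:
print(dual_temporal_weights([10, 180, 730]))
```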
{"title":"PANet: A multi-scale temporal decoupling network and its high-resolution benchmark dataset for detecting pseudo changes in cropland non-agriculturalization","authors":"Songman Sui ,&nbsp;Jixian Zhang ,&nbsp;Haiyan Gu ,&nbsp;Yue Chang","doi":"10.1016/j.isprsjprs.2026.01.029","DOIUrl":"10.1016/j.isprsjprs.2026.01.029","url":null,"abstract":"<div><div>Cropland non-agriculturalization (CNA) refers to the conversion of cropland into non-agricultural land such as construction land or ponds, posing threats to food security and ecological balance. Remote sensing technology enables precise monitoring of this process, but bi-temporal methods are susceptible to errors caused by seasonal spectral fluctuations, weather interference, and imaging discrepancies, often leading to false detections. Existing methods, which lack support from temporal datasets, struggle to disentangle the spectral confusion of gradual non-agriculturalization and short-term disturbances, thereby limiting the accuracy of dynamic cropland resource monitoring. To address this issue, a novel phenology-aware temporal change detection network (PANet) is proposed to solve the misclassification challenges in CNA detection caused by “same object with different spectra” and “different objects with similar spectra” issues. A phenology-aware module (PATM) is designed, leveraging a dual-driven decoupling model to dynamically weight phenology-sensitive periods and adaptively represent non-uniform temporal intervals. Through a time-aligned feature enhancement strategy and dual-driven (intra-annual/inter-annual) temporal decay functions, PANet simultaneously focuses on short-term anomalies and robustly models long-term trends. Additionally, a sample balance adjustment module (DFBL) is developed to mitigate the impact of sample imbalance by incorporating prior knowledge of changes and dynamic adjustment factors, enhancing the model’s sensitivity to non-agriculturalization changes. Furthermore, the first high-resolution CNA dataset based on actual production data is constructed, containing 1295 pairs of 512 × 512 masked images. Compared to existing datasets, this dataset offers extensive temporal coverage, capturing comprehensive seasonal periodic characteristics of cropland. Comparative experiments with several classical time-series methods and bi-temporal methods validate the effectiveness of PANet. Experimental results on the LHCD dataset demonstrate that PANet achieves the highest F1 score, specifically, 61.01% and 61.70%. PANet accurately captures CNA information, making it vital for the scientific management and sustainable utilization of limited cropland resources. The LHCD can be downloaded from <span><span>https://github.com/mss-s/LHCD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 126-143"},"PeriodicalIF":12.2,"publicationDate":"2026-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Detecting global ocean subsurface density change with high-resolution via dual-task densely-former
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-22 | DOI: 10.1016/j.isprsjprs.2026.01.026
Hua Su, Weiqi Xie, Luping You, Sihui Li, Dian Lin, An Wang
High-resolution ocean subsurface density is crucial for studying dynamic processes and stratification within the ocean under recent global ocean warming. This study proposes a novel deep learning-based model, named DDFNet (Dual-task Densely-Former Network), to address the challenges of reconstructing high-resolution, high-reliability global ocean subsurface density. DDFNet employs multi-scale feature extraction, attention mechanisms, and a dual-label design, combining an encoder-decoder backbone network with a global spatial attention module to effectively capture the complex spatiotemporal relationships in ocean data. The model utilizes multisource surface remote sensing data as input and incorporates Argo profile data and ORAS5 reanalysis data as labels. An adaptive weighted loss function dynamically balances the contributions of the two label types, improving reconstruction accuracy and achieving a spatial resolution of 0.25° × 0.25°. By constructing dual tasks with in situ observations and reanalysis data for joint learning, the true state of the ocean and the consistency of physical processes are better captured, improving the model’s reconstruction accuracy and physical consistency. Experimental results demonstrate that DDFNet outperforms widely used LightGBM and CNN models, with the reconstructed DDFNet-SD dataset achieving an R² of 0.9863 and an RMSE of 0.2804 kg/m³. The dataset further reveals a declining trend in global ocean subsurface density at a rate of −4.47 × 10⁻⁴ kg/m³ per decade, particularly pronounced in the upper 0–700 m, which is likely associated with global ocean warming and salinity changes. The high-resolution dataset facilitates studies on mesoscale ocean dynamics, stratification variability, and climate change impacts.
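One standard way to realize an adaptive weighted loss over two label sources is homoscedastic-uncertainty weighting (Kendall et al., 2018), in which each source's weight is learned jointly with the network. The PyTorch sketch below illustrates that mechanism for Argo and ORAS5 supervision; DDFNet's actual weighting scheme may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualLabelLoss(nn.Module):
    """Learns one log-variance per label source; noisier sources are
    down-weighted, and the +log_var term prevents trivially inflating
    the variances."""
    def __init__(self):
        super().__init__()
        self.log_var = nn.Parameter(torch.zeros(2))

    def forward(self, pred, argo_label, oras5_label):
        losses = torch.stack([F.mse_loss(pred, argo_label),
                              F.mse_loss(pred, oras5_label)])
        return (torch.exp(-self.log_var) * losses + self.log_var).sum()

criterion = DualLabelLoss()
pred = torch.randn(4, 1, 32, 32)       # toy density field prediction
print(criterion(pred, torch.randn_like(pred), torch.randn_like(pred)).item())
```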
{"title":"Detecting global ocean subsurface density change with high-resolution via dual-task densely-former","authors":"Hua Su ,&nbsp;Weiqi Xie ,&nbsp;Luping You ,&nbsp;Sihui Li ,&nbsp;Dian Lin ,&nbsp;An Wang","doi":"10.1016/j.isprsjprs.2026.01.026","DOIUrl":"10.1016/j.isprsjprs.2026.01.026","url":null,"abstract":"<div><div>High-resolution ocean subsurface density is crucial for studying dynamic processes and stratification within the ocean under recent global ocean warming. This study proposes a novel deep learning-based model, named DDFNet (Dual-task Densely-Former Network), for reconstructing ocean subsurface density, to address the challenges in reconstructing high-resolution and high-reliability global ocean subsurface density. DDFNet employs multi-scale feature extraction, attention mechanisms, and a dual-label design, combining an encoder-decoder backbone network with a global spatial attention module to capture the complex spatiotemporal relationships in ocean data effectively. The model utilizes multisource surface remote sensing data as input and incorporates Argo profile data and ORAS5 reanalysis data as labels. An adaptive weighted loss function dynamically balances the contributions of the two label types, improving reconstruction accuracy and achieving a spatial resolution of 0.25°×0.25°. By constructing dual tasks with <em>in situ</em> observations and reanalysis data for joint learning, the true state of the ocean and the consistency of physical processes are enhanced, improving the model’s reconstruction accuracy and physical consistency. Experimental results demonstrate that DDFNet outperforms well-used LightGBM and CNN models, with the reconstructed DDFNet-SD dataset achieving an <em>R<sup>2</sup></em> of 0.9863 and an RMSE of 0.2804 kg/m<sup>3</sup>. The dataset further reveals a declining trend in global ocean subsurface density at a rate of −4.47 × 10<sup>-4</sup> kg/m<sup>3</sup>/decade, particularly pronounced in the upper 0–700 m, which is likely associated with global ocean warming and salinity changes. The high-resolution dataset facilitates studies on mesoscale ocean dynamics, stratification variability, and climate change impacts.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 158-179"},"PeriodicalIF":12.2,"publicationDate":"2026-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146033934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SuperMapNet for long-range and high-accuracy vectorized HD map construction
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-20 | DOI: 10.1016/j.isprsjprs.2026.01.023
Ruqin Zhou, Chenguang Dai, Wanshou Jiang, Yongsheng Zhang, Zhenchao Zhang, San Jiang
Vectorized high-definition (HD) map construction is formulated as the task of classifying and localizing typical map elements based on features in a bird’s-eye view (BEV). This is essential for autonomous driving systems, providing interpretable structured representations of the environment for decision-making and planning. Remarkable progress has been made in recent years, but several major issues remain: (1) in the generation of BEV features, single-modality methods suffer from limited perception capability and range, while existing multi-modal fusion approaches underutilize cross-modal synergies and fail to resolve spatial disparities between modalities, resulting in misaligned BEV features with holes; (2) in the classification and localization of map elements, existing methods rely heavily on point-level modeling information while neglecting the information between elements and between point and element, leading to low accuracy with erroneous shapes and element entanglement. To address these limitations, we propose SuperMapNet, a multi-modal framework designed for long-range and high-accuracy vectorized HD map construction. This framework uses both camera images and LiDAR point clouds as input. It first tightly couples semantic information from camera images and geometric information from LiDAR point clouds via a cross-attention-based synergy enhancement module and a flow-based disparity alignment module for long-range BEV feature generation. Subsequently, local information acquired by point queries and global information acquired by element queries are tightly coupled through three levels of interaction for high-accuracy classification and localization, where the Point2Point interaction captures local geometric consistency between points of the same element, the Element2Element interaction learns global semantic relationships between elements, and the Point2Element interaction complements element information for its constituent points. Experiments on the nuScenes and Argoverse2 datasets demonstrate high accuracy, surpassing previous state-of-the-art methods (SOTAs) by 14.9%/8.8% and 18.5%/3.1% mAP under the hard/easy settings, respectively, even over double the perception range (up to 120 m along the X-axis and 60 m along the Y-axis). The code is publicly available at https://github.com/zhouruqin/SuperMapNet.
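The synergy-enhancement step can be pictured as cross-attention in BEV space: each camera-BEV cell queries the lidar-BEV map for geometric evidence before the two are fused. The PyTorch sketch below is a minimal illustration of that coupling only (no flow-based disparity alignment); module and tensor names are assumptions, not SuperMapNet's implementation.

```python
import torch
import torch.nn as nn

class CrossModalBEVFusion(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, cam_bev, lidar_bev):
        # (B, C, H, W) feature maps -> (B, H*W, C) token sequences.
        B, C, H, W = cam_bev.shape
        q = cam_bev.flatten(2).transpose(1, 2)      # camera tokens as queries
        kv = lidar_bev.flatten(2).transpose(1, 2)   # lidar tokens as keys/values
        fused, _ = self.attn(q, kv, kv)
        fused = self.norm(q + fused)                # residual + LayerNorm
        return fused.transpose(1, 2).reshape(B, C, H, W)

fusion = CrossModalBEVFusion()
cam, lidar = torch.randn(1, 128, 50, 100), torch.randn(1, 128, 50, 100)
print(fusion(cam, lidar).shape)       # torch.Size([1, 128, 50, 100])
```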
{"title":"SuperMapNet for long-range and high-accuracy vectorized HD map construction","authors":"Ruqin Zhou ,&nbsp;Chenguang Dai ,&nbsp;Wanshou Jiang ,&nbsp;Yongsheng Zhang ,&nbsp;Zhenchao Zhang ,&nbsp;San Jiang","doi":"10.1016/j.isprsjprs.2026.01.023","DOIUrl":"10.1016/j.isprsjprs.2026.01.023","url":null,"abstract":"<div><div>Vectorized high-definition (HD) map construction is formulated as the task of classifying and localizing typical map elements based on features in a bird’s-eye view (BEV). This is essential for autonomous driving systems, providing interpretable environmental structured representations for decision and planning. Remarkable work has been achieved in recent years, but several major issues remain: (1) in the generation of the BEV features, single modality methods suffer from limited perception capability and range, while existing multi-modal fusion approaches underutilize cross-modal synergies and fail to resolve spatial disparities between modalities, resulting in misaligned BEV features with holes; (2) in the classification and localization of map elements, existing methods heavily rely on point-level modeling information while neglecting the information between elements and between point and element, leading to low accuracy with erroneous shapes and element entanglement. To address these limitations, we propose SuperMapNet, a multi-modal framework designed for long-range and high-accuracy vectorized HD map construction. This framework uses both camera images and LiDAR point clouds as input. It first tightly couples semantic information from camera images and geometric information from LiDAR point clouds by a cross-attention based synergy enhancement module and a flow-based disparity alignment module for long-range BEV feature generation. Subsequently, local information acquired by point queries and global information acquired by element queries are tightly coupled by three-level interactions for high-accuracy classification and localization, where Point2Point interaction captures local geometric consistency between points of the same element, Element2Element interaction learns global semantic relationships between elements, and Point2Element interaction complement element information for its constituent points. Experiments on the nuScenes and Argoverse2 datasets demonstrate high accuracy, surpassing previous state-of-the-art methods (SOTAs) by 14.9%/8.8% and 18.5%/3.1% mAP under the hard/easy settings, respectively, even over the double perception ranges (up to 120 <span><math><mi>m</mi></math></span> in the X-axis and 60 <span><math><mi>m</mi></math></span> in the Y-axis). The code is made publicly available at <span><span>https://github.com/zhouruqin/SuperMapNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 89-103"},"PeriodicalIF":12.2,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Roadside lidar-based scene understanding toward intelligent traffic perception: A comprehensive review
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-20 | DOI: 10.1016/j.isprsjprs.2026.01.012
Jiaxing Zhang, Chengjun Ge, Wen Xiao, Miao Tang, Jon Mills, Benjamin Coifman, Nengcheng Chen
Urban transportation systems are undergoing a paradigm shift with the integration of high-precision sensing technologies and intelligent perception frameworks. Roadside lidar, as a key enabler of infrastructure-based sensing technology, offers robust and precise 3D spatial understanding of dynamic urban scenes. This paper presents a comprehensive review of roadside lidar-based traffic perception, structured around five key modules: sensor placement strategies; multi-lidar point cloud fusion; dynamic traffic information extraction; subsequent applications including trajectory prediction, collision risk assessment, and behavioral analysis; and representative roadside perception benchmark datasets. Despite notable progress, challenges remain in deployment optimization, robust registration under occlusion and dynamic conditions, generalizable object detection and tracking, and effective utilization of heterogeneous multi-modal data. Emerging trends point toward perception-driven infrastructure design, edge-cloud-terminal collaboration, and generalizable models enabled by domain adaptation, self-supervised learning, and foundation-scale datasets. This review aims to serve as a technical reference for researchers and practitioners, providing insights into current advances, open problems, and future directions in roadside lidar-based traffic perception and digital twin applications.
{"title":"Roadside lidar-based scene understanding toward intelligent traffic perception: A comprehensive review","authors":"Jiaxing Zhang ,&nbsp;Chengjun Ge ,&nbsp;Wen Xiao ,&nbsp;Miao Tang ,&nbsp;Jon Mills ,&nbsp;Benjamin Coifman ,&nbsp;Nengcheng Chen","doi":"10.1016/j.isprsjprs.2026.01.012","DOIUrl":"10.1016/j.isprsjprs.2026.01.012","url":null,"abstract":"<div><div>Urban transportation systems are undergoing a paradigm shift with the integration of high-precision sensing technologies and intelligent perception frameworks. Roadside lidar, as a key enabler of infrastructure-based sensing technology, offers robust and precise 3D spatial understanding of dynamic urban scenes. This paper presents a comprehensive review of roadside lidar-based traffic perception, structured around five key modules: sensor placement strategies; multi-lidar point cloud fusion; dynamic traffic information extraction;subsequent applications including trajectory prediction, collision risk assessment, and behavioral analysis; representative roadside perception benchmark datasets. Despite notable progress, challenges remain in deployment optimization, robust registration under occlusion and dynamic conditions, generalizable object detection and tracking, and effective utilization of heterogeneous multi-modal data. Emerging trends point toward perception-driven infrastructure design, edge-cloud-terminal collaboration, and generalizable models enabled by domain adaptation, self-supervised learning, and foundation-scale datasets. This review aims to serve as a technical reference for researchers and practitioners, providing insights into current advances, open problems, and future directions in roadside lidar-based traffic perception and digital twin applications.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"233 ","pages":"Pages 69-88"},"PeriodicalIF":12.2,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Empowering tree-scale monitoring over large areas: Individual tree delineation from high-resolution imagery
IF 12.2 | CAS Tier 1 (Earth Science) | Q1 GEOGRAPHY, PHYSICAL | Pub Date: 2026-01-20 | DOI: 10.1016/j.isprsjprs.2025.12.022
Xinlian Liang, Yinrui Wang, Jun Pan, Janne Heiskanen, Ningning Wang, Siyu Wu, Ilja Vuorinne, Jiaojiao Tian, Jonas Troles, Myriam Cloutier, Stefano Puliti, Aishwarya Chandrasekaran, James Ball, Xiangcheng Mi, Guochun Shen, Kun Song, Guofan Shao, Rasmus Astrup, Yunsheng Wang, Petri Pellikka, Jianya Gong
Accurate individual tree delineation (ITD) is essential for forest monitoring, biodiversity assessment, and ecological modeling. While remote sensing (RS) has significantly advanced forest ITD, challenges persist, especially in complex forest environments. The use of imagery data is compelling given the rapid increase in available high-resolution aerial and satellite imagery data, the increasing need for image-based analysis where reliable 3D data are unavailable, the widening gap between data supply and processing capabilities, and the limited validation of state-of-the-art (SOTA) methods across diverse real-world conditions. This study aims to advance ITD research by evaluating SOTA instance segmentation approaches, including both recently developed and established methods. The analysis evaluates ITD algorithm performance using the largest forest instance-segmentation imagery dataset to date and standardized evaluation protocols. This study identifies key factors affecting accuracy, reveals remaining challenges, and outlines future research directions. Findings in this study reveal that ITD accuracy is heavily influenced by image resolution, forest structure, and method design. Findings also reveal that, while algorithm innovations remain important, robustness and transferability that ensure generalization across diverse environments are what differentiate method performances. In addition, this study highlights that commonly used evaluation metrics may fail to adequately capture precise performance in specific applications, e.g., individual-tree-crown segmentation in this study. Assessment reliability can be strengthened through the adoption of stricter criteria. Future research should focus on expanding datasets, refining evaluation protocols, and developing adaptive models capable of handling varying canopy structures. These advancements will enhance ITD scalability and reliability, contributing to more effective forest research and management at a global scale.
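On the evaluation point, one concrete "stricter criterion" is to demand a higher IoU before a predicted crown counts as a match. Below is a minimal numpy sketch of greedy one-to-one crown matching at a configurable IoU threshold; the example IoU matrix is fabricated for illustration.

```python
import numpy as np

def crown_detection_scores(iou_matrix, iou_thr=0.5):
    """Greedy one-to-one matching of predicted to reference crowns.
    iou_matrix: (n_pred, n_ref) pairwise IoU between crown polygons.
    Raising iou_thr (e.g. 0.5 -> 0.75) penalizes sloppy crown boundaries
    that a looser threshold would let pass."""
    m = iou_matrix.astype(float).copy()
    tp = 0
    while m.size and m.max() >= iou_thr:
        i, j = np.unravel_index(np.argmax(m), m.shape)
        tp += 1
        m[i, :] = -1.0      # each prediction and reference matches once
        m[:, j] = -1.0
    n_pred, n_ref = iou_matrix.shape
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_ref if n_ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

ious = np.array([[0.80, 0.10], [0.55, 0.48], [0.05, 0.30]])
print(crown_detection_scores(ious, 0.5))   # (0.333..., 0.5, 0.4)
```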
{"title":"Empowering tree-scale monitoring over large areas: Individual tree delineation from high-resolution imagery","authors":"Xinlian Liang ,&nbsp;Yinrui Wang ,&nbsp;Jun Pan ,&nbsp;Janne Heiskanen ,&nbsp;Ningning Wang ,&nbsp;Siyu Wu ,&nbsp;Ilja Vuorinne ,&nbsp;Jiaojiao Tian ,&nbsp;Jonas Troles ,&nbsp;Myriam Cloutier ,&nbsp;Stefano Puliti ,&nbsp;Aishwarya Chandrasekaran ,&nbsp;James Ball ,&nbsp;Xiangcheng Mi ,&nbsp;Guochun Shen ,&nbsp;Kun Song ,&nbsp;Guofan Shao ,&nbsp;Rasmus Astrup ,&nbsp;Yunsheng Wang ,&nbsp;Petri Pellikka ,&nbsp;Jianya Gong","doi":"10.1016/j.isprsjprs.2025.12.022","DOIUrl":"10.1016/j.isprsjprs.2025.12.022","url":null,"abstract":"<div><div>Accurate individual tree delineation (ITD) is essential for forest monitoring, biodiversity assessment, and ecological modeling. While remote sensing (RS) has significantly advanced forest ITD, challenges persist, especially in complex forest environments. The use of imagery data is compelling given the rapid increase in available high-resolution aerial and satellite imagery data, the increasing need for image-based analysis where reliable 3D data are unavailable, the widening gap between data supply and processing capabilities, and the limited validation of state-of-the-art (SOTA) methods across diverse real-world conditions. This study aims to advance ITD research by evaluating SOTA instance segmentation approaches, including both recently developed and established methods. The analysis evaluates ITD algorithm performance using the largest forest instance-segmentation imagery dataset to date and standardized evaluation protocols. This study identifies key factors affecting accuracy, reveals remaining challenges, and outlines future research directions. Findings in this study reveal that ITD accuracy is heavily influenced by image resolution, forest structure, and method design. Findings also reveal that, while algorithm innovations remain important, robustness and transferability that ensure generalization across diverse environments are what differentiate method performances. In addition, this study highlights that commonly used evaluation metrics may fail to adequately capture precise performance in specific applications, e.g., individual-tree-crown segmentation in this study. Assessment reliability can be strengthened through the adoption of stricter criteria. Future research should focus on expanding datasets, refining evaluation protocols, and developing adaptive models capable of handling varying canopy structures. These advancements will enhance ITD scalability and reliability, contributing to more effective forest research and management at a global scale.</div></div>","PeriodicalId":50269,"journal":{"name":"ISPRS Journal of Photogrammetry and Remote Sensing","volume":"232 ","pages":"Pages 974-999"},"PeriodicalIF":12.2,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0