
Latest Publications in ISPRS Journal of Photogrammetry and Remote Sensing

Attributing GHG emissions to individual facilities using multi-temporal hyperspectral images: Methodology and applications
IF 12.2 · CAS Tier 1 (Earth Science) · Q1 GEOGRAPHY, PHYSICAL · Pub Date: 2026-01-17 · DOI: 10.1016/j.isprsjprs.2026.01.014
Yichi Zhang, Ge Han, Yiyang Huang, Huayi Wang, Hongyuan Zhang, Zhipeng Pei, Yuanxue Pu, Haotian Luo, Jinchun Yi, Tianqi Shi, Siwei Li, Wei Gong
Industrial parks are major sources of greenhouse gas (GHG) emissions and the ultimate entities responsible for implementing mitigation policies. Current satellite remote sensing technologies perform well in reporting localized strong point-source emissions, but face significant challenges in monitoring emissions from multiple densely clustered sources. To address this limitation, we propose an emission allocation framework, EA-MILES, which integrates multi-source hyperspectral data with plume modeling to quantify process-level emissions. Simulation experiments show that with existing hyperspectral satellites, EA-MILES can estimate emissions for sources with intensities above 80 t CO2/h and 100 kg CH4/h, with biases not exceeding 13.60 % and 17.08 %, respectively. A steel and power production park is selected as a case study, where EA-MILES estimates process-level emissions with uncertainties ranging from 26.33 % to 37.78 %. Estimation results are consistent with inventory values derived from emission factor methods. The top-down Integrated Mass Enhancement method was used for comparison with EA-MILES results; the estimation bias did not exceed 16.84 %. According to Climate TRACE, about 32 % of CO2 and 44 % of CH4 point sources worldwide fall within EA-MILES detection coverage, accounting for over 80 % and 55 % of anthropogenic CO2 and CH4 emissions, respectively. Therefore, this study provides a novel satellite-based approach for reporting facility-scale GHG emissions in industrial parks, offering transparent and accurate monitoring data to support mitigation and energy-transition decision-making.
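The allocation idea admits a compact illustration: if each candidate source contributes a plume-shaped column enhancement, the mixed signal over a park can be decomposed by least squares against per-source plume templates. The sketch below shows only that generic decomposition; the Gaussian template, grid, wind direction, source positions, and emission rates are illustrative assumptions, not the EA-MILES model.

```python
# Minimal sketch: allocate a mixed column enhancement among clustered sources
# via least squares against per-source plume templates. All parameters are
# illustrative assumptions, not values from the paper.
import numpy as np

def gaussian_plume_template(xx, yy, src_xy, wind_dir_deg, sigma=60.0):
    """Unit-emission plume shape: Gaussian crosswind profile decaying downwind."""
    theta = np.deg2rad(wind_dir_deg)
    dx, dy = xx - src_xy[0], yy - src_xy[1]
    along = dx * np.cos(theta) + dy * np.sin(theta)    # downwind distance (m)
    cross = -dx * np.sin(theta) + dy * np.cos(theta)   # crosswind distance (m)
    tpl = np.exp(-0.5 * (cross / sigma) ** 2) / (1.0 + np.maximum(along, 0.0) / 100.0)
    tpl[along < 0] = 0.0                               # no enhancement upwind
    return tpl

rng = np.random.default_rng(0)
x = np.arange(0.0, 3000.0, 30.0)                       # 30 m grid over a 3 km park
xx, yy = np.meshgrid(x, x)
sources = [(800.0, 900.0), (1500.0, 1600.0), (2100.0, 1200.0)]
A = np.stack([gaussian_plume_template(xx, yy, s, wind_dir_deg=45.0).ravel()
              for s in sources], axis=1)               # (pixels, sources)

true_rates = np.array([80.0, 120.0, 60.0])             # hypothetical t CO2/h
obs = A @ true_rates + rng.normal(0.0, 0.5, A.shape[0])

est, *_ = np.linalg.lstsq(A, obs, rcond=None)          # decompose the mixed signal
print(np.round(est, 1))                                # recovers ~[80. 120. 60.]
```

In practice the templates would come from a calibrated plume model driven by observed or reanalysis winds, and the inversion would typically be constrained or regularized.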
Citations: 0
AnchorReF: A novel anchor-based visual re-localization framework aided by multi-sensor data fusion
IF 12.2 · CAS Tier 1 (Earth Science) · Q1 GEOGRAPHY, PHYSICAL · Pub Date: 2026-01-16 · DOI: 10.1016/j.isprsjprs.2026.01.019
Hao Wu, Yu Ran, Xiaoxiang Zhang, Xinying Luo, Li Wang, Teng Zhao, Yongcheng Song, Zhijun Zhang, Huisong Zhang, Jin Liu, Jian Li
Visual relocalization estimates the precise pose of a query image within a pre-built visual map, serving as a fundamental component for robot navigation, autonomous driving, surveying and mapping, etc. In the past few decades, significant research efforts have been devoted to achieving high relocalization accuracy. However, challenges remain when the query images exhibit significant changes compared to the reference scene. This paper primarily addresses pose verification and the correction of inaccurate pose estimates produced by relocalization. We propose a novel anchor-based visual relocalization framework that achieves robust pose estimation through multi-view co-visibility verification. Our approach further utilizes tightly-coupled multi-sensor data fusion for pose refinement. Comprehensive evaluations on large-scale, real-world urban driving datasets (containing frequent dynamic objects, severe occlusions, and long-term environmental changes) demonstrate that our framework achieves state-of-the-art performance. Specifically, compared to traditional SfM-based and Transformer-based methods under these challenging conditions, our approach reduces the translation error by 46.2% and the rotation error by 8.55%.
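The verification step can be pictured as a reprojection-consistency test: a candidate pose is kept only if enough co-visible map points reproject close to their detections. The sketch below, with an assumed pinhole model, synthetic map points, and arbitrary thresholds, illustrates that test in isolation; it is not the AnchorReF pipeline, which additionally verifies across multiple anchor views and fuses sensor data.

```python
# Minimal sketch: reprojection-based pose verification. Camera intrinsics,
# thresholds, and map points are illustrative assumptions.
import numpy as np

def project(K, R, t, pts3d):
    """Pinhole projection of Nx3 world points under pose (R, t): world -> pixels."""
    cam = pts3d @ R.T + t
    pix = (cam[:, :2] / cam[:, 2:3]) @ K[:2, :2].T + K[:2, 2]
    return pix, cam[:, 2]

def verify_pose(K, R, t, pts3d, observed_px, px_thresh=3.0, min_inlier_ratio=0.7):
    """Accept a candidate pose only if enough co-visible points reproject well."""
    pix, depth = project(K, R, t, pts3d)
    err = np.linalg.norm(pix - observed_px, axis=1)
    inliers = (depth > 0) & (err < px_thresh)
    return inliers.mean() >= min_inlier_ratio, float(inliers.mean())

rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
pts = rng.uniform([-2, -2, 4], [2, 2, 8], size=(200, 3))  # co-visible map points
obs, _ = project(K, np.eye(3), np.zeros(3), pts)          # detections in the query view
noisy_obs = obs + rng.normal(0.0, 0.5, obs.shape)         # 0.5 px detection noise
accepted, ratio = verify_pose(K, np.eye(3), np.zeros(3), pts, noisy_obs)
print(accepted, round(ratio, 2))
```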
Citations: 0
BEDI: a comprehensive benchmark for evaluating embodied agents on UAVs
IF 12.2 · CAS Tier 1 (Earth Science) · Q1 GEOGRAPHY, PHYSICAL · Pub Date: 2026-01-16 · DOI: 10.1016/j.isprsjprs.2026.01.013
Mingning Guo, Mengwei Wu, Jiarun He, Shaoxian Li, Haifeng Li, Chao Tao
With the rapid advancement of low-altitude remote sensing and Vision-Language Models (VLMs), Embodied Agents based on Unmanned Aerial Vehicles (UAVs) have shown significant potential in autonomous tasks. However, current evaluation methods for UAV-Embodied Agents (UAV-EAs) remain constrained by the lack of standardized benchmarks, diverse testing scenarios and open system interfaces. To address these challenges, we propose BEDI (Benchmark for Embodied Drone Intelligence), a systematic and standardized benchmark designed for evaluating UAV-EAs. Specifically, we introduce a novel Dynamic Chain-of-Embodied-Task paradigm based on the perception-decision-action loop, which decomposes complex UAV tasks into standardized, measurable subtasks. Building on this paradigm, we design a unified evaluation framework encompassing six core sub-skills: semantic perception, spatial perception, motion control, tool utilization, task planning and action generation. Furthermore, we develop a hybrid testing platform that incorporates a wide range of both virtual and real-world scenarios, enabling a comprehensive evaluation of UAV-EAs across diverse contexts. The platform also offers open and standardized interfaces, allowing researchers to customize tasks and extend scenarios, thereby enhancing flexibility and scalability in the evaluation process. Finally, through empirical evaluations of several state-of-the-art (SOTA) VLMs, we reveal their limitations in embodied UAV tasks, underscoring the critical role of the BEDI benchmark in advancing embodied intelligence research and model optimization. By filling the gap in systematic and standardized evaluation within this field, BEDI facilitates objective model comparison and lays a robust foundation for future development in this field. Our benchmark is now publicly available at https://github.com/lostwolves/BEDI.
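Benchmarks of this shape decompose each episode into subtasks, score each sub-skill separately, and then aggregate. A minimal sketch of such an aggregation follows; the six skill names mirror the abstract, but the subtask records, the pass/fail scoring, and the unweighted mean are illustrative assumptions, not BEDI's actual metrics.

```python
# Minimal sketch: per-skill scoring over decomposed subtasks, then an overall
# score. Skill names follow the abstract; everything else is assumed.
from collections import defaultdict

SKILLS = ["semantic_perception", "spatial_perception", "motion_control",
          "tool_utilization", "task_planning", "action_generation"]

def aggregate(results):
    """results: list of (skill, passed) pairs from decomposed subtasks."""
    per_skill = defaultdict(list)
    for skill, passed in results:
        per_skill[skill].append(float(passed))
    scores = {s: sum(per_skill[s]) / len(per_skill[s]) for s in SKILLS if per_skill[s]}
    overall = sum(scores.values()) / len(scores)  # unweighted mean over evaluated skills
    return scores, overall

demo = [("semantic_perception", True), ("semantic_perception", False),
        ("spatial_perception", True), ("motion_control", True),
        ("task_planning", False), ("action_generation", True)]
print(aggregate(demo))
```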
Citations: 0
RECREATE: Supervised contrastive learning and inpainting based hyperspectral image denoising
IF 12.2 · CAS Tier 1 (Earth Science) · Q1 GEOGRAPHY, PHYSICAL · Pub Date: 2026-01-16 · DOI: 10.1016/j.isprsjprs.2026.01.022
Aditya Dixit, Anup Kumar Gupta, Puneet Gupta, Ankur Garg
A hyperspectral image (HSI) contains information across many spectral bands, making it valuable in several real-world applications such as environmental monitoring, agriculture, and remote sensing. However, the acquisition process often introduces noise, necessitating effective HSI denoising methods to maintain its applicability. Deep learning (DL) is considered the de facto approach for HSI denoising, but it requires a significant number of training samples to optimize network parameters for effective denoising outcomes. However, obtaining extensive datasets is challenging for HSI, leading to epistemic uncertainties and thereby deteriorating denoising performance. This paper introduces a novel supervised contrastive learning (SCL) method, RECREATE, to enhance feature learning and mitigate the issue of epistemic uncertainty in HSI denoising. Furthermore, we explore image inpainting as an auxiliary task to enhance HSI denoising performance. By adding HSI inpainting to contrastive learning, our method essentially enhances HSI denoising by enlarging the training data and enforcing improved feature learning. Experimental outcomes on various HSI datasets validate the efficacy of RECREATE, showcasing its potential for integration with existing HSI denoising techniques to enhance their performance, both qualitatively and quantitatively. This innovative method holds promise for addressing the limitations posed by limited training data and thereby advancing the field toward better HSI denoising methods.
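For readers unfamiliar with SCL, the core objective is the supervised contrastive loss of Khosla et al. (2020): embeddings that share a label are pulled together, and all others are pushed apart. A minimal PyTorch sketch follows; the shapes, temperature, and labels are illustrative, and nothing here reproduces RECREATE's pairing strategy for denoising.

```python
# Minimal sketch of the supervised contrastive (SupCon) loss. Temperature,
# embedding size, and labels are illustrative assumptions.
import torch
import torch.nn.functional as F

def supcon_loss(features, labels, temperature=0.07):
    """features: (N, D) embeddings; labels: (N,) ints. Same-label pairs are
    treated as positives, all other pairs as negatives."""
    z = F.normalize(features, dim=1)
    sim = z @ z.T / temperature
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask

    sim = sim.masked_fill(self_mask, float("-inf"))        # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    per_anchor = -log_prob.masked_fill(~pos_mask, 0.0).sum(1) \
                 / pos_mask.sum(1).clamp(min=1)
    return per_anchor[pos_mask.any(1)].mean()              # anchors with >=1 positive

feats = torch.randn(8, 32)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 0, 1])
print(supcon_loss(feats, labels).item())
```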
Citations: 0
RTPSeg: A multi-modality dataset for LiDAR point cloud semantic segmentation assisted with RGB-thermal images in autonomous driving
IF 12.2 · CAS Tier 1 (Earth Science) · Q1 GEOGRAPHY, PHYSICAL · Pub Date: 2026-01-16 · DOI: 10.1016/j.isprsjprs.2026.01.008
Yifan Sun, Chenguang Dai, Wenke Li, Xinpu Liu, Yongqi Sun, Ye Zhang, Weijun Guan, Yongsheng Zhang, Yulan Guo, Hanyun Wang
LiDAR point cloud semantic segmentation is crucial for scene understanding in autonomous driving, yet the sparse and textureless characteristics of point clouds pose major challenges for this task. To address this, numerous studies have explored leveraging the dense color and fine-grained texture of RGB images for multi-modality 3D semantic segmentation. Nevertheless, these methods still encounter certain limitations in complex scenarios, as RGB images degrade under poor lighting conditions. In contrast, thermal infrared (TIR) images provide thermal radiation information of road objects and are robust to illumination change, offering complementary advantages to RGB images. Therefore, in this work we introduce RTPSeg, the first and only multi-modality dataset to simultaneously provide RGB and TIR images for point cloud semantic segmentation. RTPSeg includes over 3000 synchronized frames collected by RGB camera, infrared camera, and LiDAR, providing over 248M pointwise annotations for 18 semantic categories in autonomous driving, covering urban and village scenes in both daytime and nighttime. Based on RTPSeg, we also propose RTPSegNet, a baseline model for point cloud semantic segmentation jointly assisted by RGB and TIR images. Extensive experiments demonstrate that the RTPSeg dataset presents considerable challenges and opportunities to existing point cloud semantic segmentation approaches, and our RTPSegNet exhibits promising effectiveness in jointly leveraging the complementary information among point clouds, RGB images, and TIR images. More importantly, the experimental results also confirm that 3D semantic segmentation can be effectively enhanced by introducing an additional TIR image modality, revealing the promising potential of this line of research and application. We anticipate that RTPSeg will catalyze in-depth research in this field. Both RTPSeg and RTPSegNet will be released at https://github.com/sssssyf/RTPSeg.
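A common first step in camera-assisted point cloud segmentation is to project each LiDAR point into the image and attach the pixel's features to the point. The sketch below shows that projection-and-decoration step under assumed calibration; it is generic PointPainting-style fusion, not necessarily how RTPSegNet fuses the three modalities.

```python
# Minimal sketch: decorate LiDAR points with image features (RGB or TIR) by
# projecting them into a calibrated camera. Calibration and data are assumed.
import numpy as np

def decorate_points(points, image, K, T_cam_from_lidar):
    """points: (N, 3) LiDAR coords; image: (H, W, C). Returns per-point image
    features (zeros for points falling outside the view or behind the camera)."""
    n = points.shape[0]
    homo = np.hstack([points, np.ones((n, 1))])
    cam = (T_cam_from_lidar @ homo.T).T[:, :3]            # LiDAR -> camera frame
    valid = cam[:, 2] > 0.1                               # in front of the camera
    pix = (cam[:, :2] / cam[:, 2:3]) @ K[:2, :2].T + K[:2, 2]
    u, v = pix[:, 0].astype(int), pix[:, 1].astype(int)
    h, w = image.shape[:2]
    valid &= (u >= 0) & (u < w) & (v >= 0) & (v < h)
    feats = np.zeros((n, image.shape[2]), dtype=image.dtype)
    feats[valid] = image[v[valid], u[valid]]              # nearest-pixel sampling
    return feats, valid

rng = np.random.default_rng(0)
pts = rng.uniform([-5, -5, 1], [5, 5, 20], size=(1000, 3))
rgb = rng.random((480, 640, 3))                           # stand-in RGB (or TIR) frame
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
feats, valid = decorate_points(pts, rgb, K, np.eye(4))
print(valid.sum(), feats.shape)
```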
Citations: 0
An SW-TES hybrid algorithm for retrieving mountainous land surface temperature from high-resolution thermal infrared remote sensing data
IF 12.2 · CAS Tier 1 (Earth Science) · Q1 GEOGRAPHY, PHYSICAL · Pub Date: 2026-01-15 · DOI: 10.1016/j.isprsjprs.2026.01.016
Zhi-Wei He, Bo-Hui Tang, Zhao-Liang Li
Mountainous land surface temperature (MLST) is a key parameter for studying the energy exchange between land surface and atmosphere in mountainous areas. However, traditional land surface temperature (LST) retrieval methods often neglect the influence of three-dimensional (3D) structures and adjacent pixels in rugged terrain. To address this, a mountainous split-window and temperature-emissivity separation (MSW-TES) hybrid algorithm is proposed to retrieve MLST. The hybrid algorithm combines an improved split-window (SW) algorithm with a temperature-emissivity separation (TES) algorithm, accounting for topographic and adjacency effects (T-A effect), to retrieve MLST from five thermal infrared (TIR) bands of the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER). Within this hybrid algorithm, an improved mountainous-canopy multiple-scattering TIR radiative transfer model is proposed to construct the simulation dataset. An improved SW algorithm is then developed that builds a 3D lookup table (LUT) of regression coefficients, indexed by the small-scale self-heating parameter (SSP) and the sky-view factor (SVF), to estimate brightness temperature (BT) at ground level. Furthermore, the TES algorithm is refined to account for the influence of rugged terrain within a pixel on mountainous land surface effective emissivity (MLSE) by reconstructing the relationship between minimum emissivity and the maximum-minimum difference (MMD) for different SSPs. Results from simulated data show that the improved SW algorithm increases the accuracy of ground-level BT estimation by up to 0.5 K. The MSW-TES algorithm, when considering the T-A effect, generally retrieves lower LST values than when this effect is ignored. The hybrid algorithm yielded root mean square errors (RMSEs) of 0.99 K and 1.83 K for LST retrieval with and without the T-A effect, respectively, with most differences falling between 0.0 K and 3.0 K. A sensitivity analysis indicated that perturbations of the input parameters have little influence on MLST and MLSE, demonstrating the strong robustness of the MSW-TES algorithm. Additionally, the accuracy of MLST retrieval by the MSW-TES algorithm was validated using both discrete anisotropic radiative transfer (DART) model simulations and in-situ measurements. Validation against DART simulations showed biases ranging from −0.13 K to 1.03 K and RMSEs from 0.76 K to 1.29 K across the five ASTER TIR bands, while validation against the in-situ measurements yielded a bias of 0.97 K and an RMSE of 1.25 K, demonstrating consistent and reliable results. This study underscores the necessity of accounting for the T-A effect to improve MLST retrieval and provides a promising pathway for global clear-sky high-resolution MLST mapping in upcoming thermal missions. The source code and simulated data are available at https://github.com/hezwppp/MSW-TES.
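The SW component regresses LST against two TIR brightness temperatures and band emissivities; the abstract's twist is that the regression coefficients are looked up from a terrain-dependent (SSP, SVF) table. The sketch below combines a commonly used generalized split-window form (after Wan and Dozier) with a toy lookup table; every coefficient value and bin is an illustrative assumption, not the paper's fitted 3D LUT.

```python
# Minimal sketch: generalized split-window LST with terrain-binned coefficients.
# Coefficients and (SVF, SSP) bins are illustrative assumptions.
import numpy as np

def split_window_lst(t11, t12, emis_mean, emis_diff, coeffs):
    """Generalized split-window: LST from two TIR brightness temperatures (K)
    and band-mean / band-difference emissivity."""
    b0, b1, b2, b3, b4, b5, b6 = coeffs
    e, de = emis_mean, emis_diff
    avg, diff = 0.5 * (t11 + t12), 0.5 * (t11 - t12)
    return (b0
            + (b1 + b2 * (1 - e) / e + b3 * de / e**2) * avg
            + (b4 + b5 * (1 - e) / e + b6 * de / e**2) * diff)

# Hypothetical LUT keyed by (sky-view factor bin, self-heating parameter bin).
lut = {(0, 0): (-0.3, 1.01, 0.15, -0.35, 4.2, 6.1, -12.0),
       (1, 0): (-0.2, 1.00, 0.14, -0.33, 4.0, 5.8, -11.5)}
svf_bin, ssp_bin = 1, 0                      # would come from terrain preprocessing
lst = split_window_lst(t11=295.4, t12=293.8, emis_mean=0.975,
                       emis_diff=0.004, coeffs=lut[(svf_bin, ssp_bin)])
print(round(float(lst), 2))                  # a plausible LST near 298 K
```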
Citations: 0
Beyond the surface: machine learning uncovers ENSO’s hidden and contrasting impacts on phytoplankton vertical structure
IF 12.2 · CAS Tier 1 (Earth Science) · Q1 GEOGRAPHY, PHYSICAL · Pub Date: 2026-01-15 · DOI: 10.1016/j.isprsjprs.2026.01.002
Jing Yang, Yanfeng Wen, Peng Chen, Zhenhua Zhang, Delu Pan
Satellite-based ocean remote sensing is fundamentally limited to observing the ocean surface (top-of-the-ocean), a constraint that severely hinders a comprehensive understanding of how the entire water-column ecosystem responds to climate variability such as the El Niño-Southern Oscillation (ENSO). Surface-only views cannot resolve critical shifts in the subsurface chlorophyll maximum (SCM), a key layer for marine biodiversity and biogeochemical cycles. To overcome this critical limitation, we develop and validate a novel stacked-generalization ensemble machine learning framework. This framework robustly reconstructs a 25-year (1998–2022) high-resolution 3D chlorophyll-a (Chl-a) field by integrating 133,792 globally distributed Biogeochemical-Argo (BGC-Argo) profiles with multi-source satellite data. The reconstructed 3D Chl-a fields were rigorously validated against both satellite and in-situ observations, achieving strong agreement (R ≥ 0.97, mean absolute percentage error ≤ 27 %), demonstrating the robustness and reliability of the framework. Applying this framework to two contrasting South China Sea upwelling systems reveals that ENSO phases fundamentally restructure the entire water column. Crucially, we discover that El Niño and La Niña exert opposing effects on the SCM: El Niño events deepen and thin the SCM, decreasing Chl-a by 15–30 %, whereas La Niña events cause it to shoal and thicken, increasing Chl-a by 20–40 %. This vertical restructuring is mechanistically linked to ENSO-driven changes in wind stress curl, Rossby wave propagation, and nitrate availability. Furthermore, we identify a significant subsurface-first response, in which the SCM reacts to ENSO forcing months before significant changes are detectable at the surface. Our findings demonstrate that a three-dimensional perspective, enabled by our novel remote sensing reconstruction framework, is essential for accurately quantifying the biogeochemical consequences of climate variability, revealing that surface-only observations can significantly underestimate the vulnerability and response of marine ecosystems to ENSO events.
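Stacked generalization itself is standard: several base regressors are trained, and a meta-learner is fit on their cross-validated out-of-fold predictions. A minimal scikit-learn sketch on synthetic data follows; the choice of base models, the meta-learner, and the features stand in for, and do not reproduce, the paper's configuration of satellite predictors and BGC-Argo targets.

```python
# Minimal sketch: stacked-generalization regression with scikit-learn.
# Base models, meta-learner, and synthetic data are illustrative assumptions.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.neighbors import KNeighborsRegressor

# Stand-in for (satellite predictors + depth) -> Chl-a training pairs.
X, y = make_regression(n_samples=500, n_features=8, noise=5.0, random_state=0)

stack = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
                ("knn", KNeighborsRegressor(n_neighbors=7))],
    final_estimator=Ridge(alpha=1.0),  # meta-learner on out-of-fold predictions
    cv=5)
stack.fit(X, y)
print(round(stack.score(X, y), 3))     # R^2 on training data (illustration only)
```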
Citations: 0
DVGBench: Implicit-to-explicit visual grounding benchmark in UAV imagery with large vision–language models
IF 12.2 · CAS Tier 1 (Earth Science) · Q1 GEOGRAPHY, PHYSICAL · Pub Date: 2026-01-14 · DOI: 10.1016/j.isprsjprs.2026.01.005
Yue Zhou, Jue Chen, Zilun Zhang, Penghui Huang, Ran Ding, Zhentao Zou, PengFei Gao, Yuchen Wei, Ke Li, Xue Yang, Xue Jiang, Hongxin Yang, Jonathan Li
Remote sensing (RS) large vision–language models (LVLMs) have shown strong promise across visual grounding (VG) tasks. However, existing RS VG datasets predominantly rely on explicit referring expressions – such as relative position, relative size, and color cues – thereby constraining performance on implicit VG tasks that require scenario-specific domain knowledge. This article introduces DVGBench, a high-quality implicit VG benchmark for drones, covering six major application scenarios: traffic, disaster, security, sport, social activity, and productive activity. Each object provides both explicit and implicit queries. Based on the dataset, we design DroneVG-R1, an LVLM that integrates the novel Implicit-to-Explicit Chain-of-Thought (I2E-CoT) within a reinforcement learning paradigm. This enables the model to take advantage of scene-specific expertise, converting implicit references into explicit ones and thus reducing grounding difficulty. Finally, an evaluation of mainstream models on both explicit and implicit VG tasks reveals substantial limitations in their reasoning capabilities. These findings provide actionable insights for advancing the reasoning capacity of LVLMs for drone-based agents. The code and datasets will be released at https://github.com/zytx121/DVGBench.
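The I2E idea is easiest to see as a two-stage pipeline: first rewrite the implicit, knowledge-dependent request into an explicit referring expression, then ground that expression as ordinary VG. The sketch below is only a schematic of that control flow; call_vlm is a hypothetical stand-in for any VLM backend, and the prompts, canned outputs, and box format are all assumptions rather than DroneVG-R1's actual interface or training recipe.

```python
# Minimal sketch: implicit -> explicit -> box pipeline. `call_vlm` is a
# hypothetical placeholder returning canned demo outputs.
def call_vlm(image, prompt):
    """Hypothetical stand-in for a VLM backend."""
    if prompt.startswith("Rewrite"):
        return "the white van parked closest to the stadium's north gate"
    return [412, 188, 506, 263]  # assumed [x1, y1, x2, y2] pixel box

def ground_implicit(image, implicit_query):
    # Stage 1: implicit, knowledge-dependent request -> explicit referring expression.
    explicit = call_vlm(image, "Rewrite this request as an explicit referring "
                               "expression using position, size, and color cues: "
                               + implicit_query)
    # Stage 2: explicit expression -> bounding box via ordinary visual grounding.
    box = call_vlm(image, "Return the bounding box of: " + explicit)
    return explicit, box

print(ground_implicit(None, "find the vehicle most likely serving the event"))
```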
Citations: 0
Unveiling spatiotemporal forest cover patterns breaking the cloud barrier: Annual 30 m mapping in cloud-prone southern China from 2000 to 2020
IF 12.2 · CAS Tier 1 (Earth Science) · Q1 GEOGRAPHY, PHYSICAL · Pub Date: 2026-01-14 · DOI: 10.1016/j.isprsjprs.2026.01.015
Peng Qin, Huabing Huang, Jie Wang, Yunxia Cui, Peimin Chen, Shuang Chen, Yu Xia, Shuai Yuan, Yumei Li, Xiangyu Liu
Large-scale, long-term, and high-frequency monitoring of forest cover is essential for sustainable forest management and carbon stock assessment. However, in persistently cloudy regions such as southern China, the scarcity of high-quality remote sensing data and reliable training samples has resulted in forest cover products with limited spatial and temporal resolution. In addition, many existing datasets fail to accurately characterize forest distribution and dynamics, particularly underestimating forest expansion and overlooking fine-scale and high-frequency changes. To address these limitations, we propose a novel forest/non-forest mapping framework based on reconstructed remote sensing data. First, we achieved large-scale data reconstruction using two deep learning-based multi-sensor fusion methods across an extensive (2.04 million km²), long-term (2000–2020), persistently cloudy region, effectively generating seamless imagery and NDVI time series to fill extensive spatial and temporal data gaps for forest classification. Next, by combining a spectrally similar sample transfer method with existing land cover products, we constructed robust training samples spanning broad spatial and temporal scales. Subsequently, using a random forest classifier, we generated annual 30 m forest cover maps for cloudy southern China, achieving an unprecedented balance between spatial and temporal resolution while improving mapping accuracy. The results demonstrate an overall accuracy of 0.904, surpassing that of the China Land Cover Dataset (CLCD, 0.889) and the China Annual Tree Cover Dataset (CATCD, 0.850). In particular, our results revealed an overall upward trend in forest area, from 119.84 to 132.09 million hectares (Mha), that was rarely captured in previous studies and closely aligns with National Forest Inventory (NFI) data (R² = 0.86). Finally, by integrating time-series analysis with classification results, this study transformed forest mapping from a traditional static framework to a dynamic temporal perspective, reducing uncertainties associated with direct interannual comparisons and estimating forest gains of 23.87 Mha and losses of 12.56 Mha. Notably, the reconstructed data improved forest mapping in completeness, resolution, and accuracy. In Guangxi, the annual product detected 11.24 Mha more forest gain than the 10-year composite, indicating better completeness. It also offered finer spatial resolution (30 m vs. 500 m) and higher overall accuracy (0.879 vs. 0.853) compared to the widely used cloud-affected annual product. Overall, this study presents a robust framework for precise forest monitoring in cloudy regions.
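The classification step is conceptually simple: pixels described by reconstructed NDVI time-series features are labeled forest or non-forest by a random forest. The sketch below runs that step on synthetic 12-step "annual NDVI" vectors (stable high NDVI for forest, seasonal lower NDVI for non-forest); these synthetic features are illustrative assumptions standing in for the reconstructed imagery and transferred training samples.

```python
# Minimal sketch: forest / non-forest classification from NDVI time-series
# features with a random forest. All data here are synthetic assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 2000
# Forest pixels: high, stable NDVI; non-forest: lower NDVI, stronger seasonality.
forest = 0.75 + 0.05 * rng.standard_normal((n, 12))
season = 0.35 + 0.25 * np.sin(np.linspace(0.0, 2.0 * np.pi, 12))
nonforest = season + 0.08 * rng.standard_normal((n, 12))
X = np.vstack([forest, nonforest])
y = np.r_[np.ones(n, dtype=int), np.zeros(n, dtype=int)]

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(round(clf.score(X, y), 3))   # training accuracy, illustration only
```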
Citations: 0
TUM2TWIN: Introducing the large-scale multimodal urban digital twin benchmark dataset
IF 12.2 · CAS Tier 1 (Earth Science) · Q1 GEOGRAPHY, PHYSICAL · Pub Date: 2026-01-13 · DOI: 10.1016/j.isprsjprs.2025.12.013
Olaf Wysocki, Benedikt Schwab, Manoj Kumar Biswanath, Michael Greza, Qilin Zhang, Jingwei Zhu, Thomas Froech, Medhini Heeramaglore, Ihab Hijazi, Khaoula Kanna, Mathias Pechinger, Zhaiyu Chen, Yao Sun, Alejandro Rueda Segura, Ziyang Xu, Omar AbdelGafar, Mansour Mehranfar, Chandan Yeshwanth, Yueh-Cheng Liu, Hadi Yazdi, Boris Jutzi
Urban Digital Twins (UDTs) have become essential for managing cities and integrating complex, heterogeneous data from diverse sources. Creating UDTs involves challenges at multiple process stages, including acquiring accurate 3D source data, reconstructing high-fidelity 3D models, keeping the models up to date, and ensuring seamless interoperability with downstream tasks. Current datasets are usually limited to one part of the processing chain, hampering comprehensive UDT validation. To address these challenges, we introduce the first comprehensive multimodal urban digital twin benchmark dataset: TUM2TWIN. This dataset includes georeferenced, semantically aligned 3D models and networks along with various terrestrial, mobile, aerial, and satellite observations, comprising 32 data subsets over roughly 100,000 m² and currently 767 GB of data. By ensuring georeferenced indoor–outdoor acquisition, high accuracy, and multimodal data integration, the benchmark supports robust analysis of sensors and the development of advanced reconstruction methods. Additionally, we explore downstream tasks demonstrating the potential of TUM2TWIN, including novel view synthesis with NeRF and Gaussian Splatting, solar potential analysis, point cloud semantic segmentation, and LoD3 building reconstruction. We are convinced this contribution lays a foundation for overcoming current limitations in UDT creation, fostering new research directions and practical solutions for smarter, data-driven urban environments. The project is available at: https://tum2t.win.
Citations: 0