
ISPRS Journal of Photogrammetry and Remote Sensing: Latest Publications

Word2Scene: Efficient remote sensing image scene generation with only one word via hybrid intelligence and low-rank representation
IF 10.6 CAS Tier 1 (Earth Science) Q1 GEOGRAPHY, PHYSICAL Pub Date: 2024-11-06 DOI: 10.1016/j.isprsjprs.2024.11.002
Jiaxin Ren, Wanzeng Liu, Jun Chen, Shunxi Yin, Yuan Tao
To address the numerous challenges in current remote sensing scene generation methods, such as the difficulty of capturing complex interrelations among geographical features and of integrating implicit expert knowledge into generative models, this paper proposes Word2Scene, an efficient method for generating remote sensing scenes using hybrid intelligence and low-rank representation that can generate complex scenes from just one word. The approach incorporates geographic expert knowledge to optimize the remote sensing scene description, enhancing the accuracy and interpretability of the input descriptions. By employing a diffusion model based on hybrid intelligence and low-rank representation techniques, the method endows the diffusion model with the capability to understand remote sensing scene concepts and significantly improves its training efficiency. This study also introduces the geographic scene holistic perceptual similarity (GSHPS), a novel evaluation metric that holistically assesses the performance of generative models from a global perspective. Experimental results demonstrate that the proposed method outperforms existing state-of-the-art models in remote sensing scene generation quality, efficiency, and realism. Compared with the original diffusion models, LPIPS decreased by 18.52% (from 0.81 to 0.66) and GSHPS increased by 28.57% (from 0.70 to 0.90), validating the effectiveness and superiority of the method. Moreover, Word2Scene can generate remote sensing scenes not present in the training set, showcasing strong zero-shot capability. This provides a new perspective and solution for remote sensing image scene generation, with the potential to advance the development of remote sensing, geographic information systems, and related fields. Our code will be released at https://github.com/jaycecd/Word2Scene.
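The reported relative improvements follow directly from the before/after metric values quoted above; a minimal sketch in Python (the numbers come from the abstract, the helper function is ours):

```python
# Sanity-check the relative changes reported for Word2Scene:
# LPIPS 0.81 -> 0.66 and GSHPS 0.70 -> 0.90.

def relative_change(old: float, new: float) -> float:
    """Signed percentage change from `old` to `new`."""
    return (new - old) / old * 100.0

print(f"LPIPS change: {relative_change(0.81, 0.66):+.2f}%")  # -18.52%
print(f"GSHPS change: {relative_change(0.70, 0.90):+.2f}%")  # +28.57%
```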
Citations: 0
A_OPTRAM-ET: An automatic optical trapezoid model for evapotranspiration estimation and its global-scale assessments
IF 10.6 CAS Tier 1 (Earth Science) Q1 GEOGRAPHY, PHYSICAL Pub Date: 2024-11-02 DOI: 10.1016/j.isprsjprs.2024.10.019
Zhaoyuan Yao, Wangyipu Li, Yaokui Cui
Remotely sensed evapotranspiration (ET) at a high spatial resolution (30 m) has wide-ranging applications in agriculture, hydrology, and meteorology. The original optical trapezoid model for ET (O_OPTRAM-ET), which does not require thermal remote sensing, shows potential for high-resolution ET estimation. However, the non-automated O_OPTRAM-ET heavily depends on visual interpretation or optimization with in situ measurements, limiting its practical utility. In this study, a SpatioTemporal Aggregated Regression algorithm (STAR) is proposed to develop an automatic trapezoid model for ET (A_OPTRAM-ET), implemented within the Google Earth Engine environment and evaluated globally at both moderate and high resolutions (500 m and 30 m, respectively). By integrating an aggregation algorithm across multiple dimensions to automatically determine its parameters, A_OPTRAM-ET can operate efficiently without ground-based measurements as input. Evaluation against in situ ET demonstrates that the proposed A_OPTRAM-ET model effectively estimates ET across various land cover types and satellite platforms. The overall root mean square error (RMSE), mean absolute error (MAE), and correlation coefficient (CC) against in situ latent heat flux (LE) measurements are 35.5 W·m−2 (41.3 W·m−2, 40.0 W·m−2, 36.1 W·m−2), 26.3 W·m−2 (28.9 W·m−2, 28.7 W·m−2, 25.8 W·m−2), and 0.78 (0.73, 0.70, 0.72) for Sentinel-2 (Landsat-8, Landsat-5, MOD09GA), respectively. The A_OPTRAM-ET model exhibits stable accuracy over long time periods (approximately 10 years). Compared with other published ET datasets, ET estimated by the A_OPTRAM-ET model performs better over cropland and shrubland. Additionally, global ET derived from the A_OPTRAM-ET model shows trends consistent with other published ET datasets over the period 2001–2020, while offering enhanced spatial detail. Therefore, the proposed A_OPTRAM-ET model provides an efficient, high-resolution, and globally applicable method for ET estimation, with significant practical value for agriculture, hydrology, and related fields.
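The agreement statistics quoted above are standard; a minimal sketch of how RMSE, MAE, and CC would be computed against in-situ LE, assuming the usual definitions (the toy data below are hypothetical):

```python
import numpy as np

def evaluate_le(est: np.ndarray, obs: np.ndarray) -> dict:
    """RMSE, MAE, and Pearson correlation between estimated and
    in-situ latent heat flux (both in W·m-2)."""
    err = est - obs
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        "CC": float(np.corrcoef(est, obs)[0, 1]),
    }

# Hypothetical toy data, only to show the call signature.
rng = np.random.default_rng(0)
obs = rng.uniform(0.0, 400.0, size=200)       # in-situ LE
est = obs + rng.normal(0.0, 35.0, size=200)   # model estimate
print(evaluate_le(est, obs))
```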
Citations: 0
Atmospheric correction of geostationary ocean color imager data over turbid coastal waters under high solar zenith angles
IF 10.6 CAS Tier 1 (Earth Science) Q1 GEOGRAPHY, PHYSICAL Pub Date: 2024-10-31 DOI: 10.1016/j.isprsjprs.2024.10.018
Hao Li, Xianqiang He, Palanisamy Shanmugam, Yan Bai, Xuchen Jin, Zhihong Wang, Yifan Zhang, Difeng Wang, Fang Gong, Min Zhao
The traditional atmospheric correction models employed with near-infrared iterative schemes inaccurately estimate aerosol radiance at high solar zenith angles (SZAs), leading to a substantial loss of valid products for dawn or dusk observations by geostationary satellite ocean color sensors. To overcome this issue, we previously developed an atmospheric correction model suitable for open ocean waters observed by the first geostationary satellite ocean color imager (GOCI) under high SZAs. That model was constructed on a dataset from stable open ocean waters, which makes it less suitable for coastal waters. In this study, we developed a specialized atmospheric correction model (GOCI-II-NN) capable of accurately retrieving the water-leaving radiance from GOCI-II observations over coastal oceans under high SZAs. We utilized multiple GOCI-II observations throughout the day to develop selection criteria for extracting stable coastal water pixels and created a new training dataset for the proposed model. The performance of the GOCI-II-NN model was validated with in-situ data collected from coastal/shelf waters. The results showed an Average Percentage Difference (APD) of less than 23% across the entire visible spectrum. In terms of both valid data coverage and retrieval accuracy, the GOCI-II-NN model was superior to the traditional near-infrared and ultraviolet atmospheric correction models, accurately retrieving ocean color products for applications such as tracking/monitoring of algal blooms, sediment dynamics, and water quality.
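Assuming APD follows the common mean-absolute-relative-difference definition (the paper's exact formula may differ), it can be computed as below; the radiance values are hypothetical:

```python
import numpy as np

def apd(est: np.ndarray, insitu: np.ndarray) -> float:
    """Average Percentage Difference (in %). One common definition;
    the paper's exact formula may differ."""
    return float(np.mean(np.abs(est - insitu) / np.abs(insitu)) * 100.0)

# Hypothetical water-leaving radiance values for a single band.
insitu = np.array([0.012, 0.018, 0.025, 0.031])
est = np.array([0.010, 0.020, 0.024, 0.034])
print(f"APD = {apd(est, insitu):.1f}%")
```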
Citations: 0
Cascaded recurrent networks with masked representation learning for stereo matching of high-resolution satellite images
IF 10.6 CAS Tier 1 (Earth Science) Q1 GEOGRAPHY, PHYSICAL Pub Date: 2024-10-30 DOI: 10.1016/j.isprsjprs.2024.10.017
Zhibo Rao, Xing Li, Bangshu Xiong, Yuchao Dai, Zhelun Shen, Hangbiao Li, Yue Lou
Stereo matching of satellite images presents challenges due to missing data, domain differences, and imperfect rectification. To address these issues, we propose cascaded recurrent networks with masked representation learning for high-resolution satellite stereo images, consisting of feature extraction and cascaded recurrent modules. First, we develop the correlation computation in the cascaded recurrent module to search for results on the epipolar line and in adjacent areas, mitigating the impact of erroneous rectification. Second, we use a training strategy based on masked representation learning to handle missing data and differing domain attributes, enhancing data utilization and feature representation. Our training strategy includes two stages: (1) the image reconstruction stage, in which we feed masked left or right images to the feature extraction module and adopt a reconstruction decoder to reconstruct the original images as a pre-training process, obtaining a pre-trained feature extraction module; and (2) the stereo matching stage, in which we lock the parameters of the feature extraction module and employ stereo image pairs to train the cascaded recurrent module, yielding the final model. We implement the cascaded recurrent networks with two well-known feature extraction modules (CNN-based Restormer or Transformer-based ViT) to prove the effectiveness of our approach. Experimental results on the US3D and WHU-Stereo datasets show that: (1) our training strategy can be used with CNN-based and Transformer-based methods on remote sensing datasets with limited data to improve performance, outperforming the second-best network HMSM-Net by approximately 0.54% and 1.95% in the percentage of the 3-px error on the WHU-Stereo and US3D datasets, respectively; (2) our correlation manner can handle imperfect rectification, reducing the error rate by 8.9% on the random shift test; and (3) our method can predict high-quality disparity maps and achieve state-of-the-art performance, reducing the percentage of the 3-px error to 12.87% and 7.01% on the WHU-Stereo and US3D datasets, respectively. The source codes are released at https://github.com/Archaic-Atom/MaskCRNet.
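The 3-px error used above is a standard stereo metric: the share of valid pixels whose disparity error exceeds 3 pixels. A minimal sketch with hypothetical disparity maps (some benchmarks add a relative-error term, omitted here):

```python
import numpy as np

def three_px_error(disp_est: np.ndarray, disp_gt: np.ndarray,
                   valid: np.ndarray, thresh: float = 3.0) -> float:
    """Percentage of valid pixels whose disparity error exceeds
    `thresh` pixels."""
    err = np.abs(disp_est - disp_gt)[valid]
    return float(np.mean(err > thresh) * 100.0)

# Hypothetical disparity maps.
rng = np.random.default_rng(0)
gt = rng.uniform(0.0, 64.0, size=(128, 128))
est = gt + rng.normal(0.0, 2.0, size=(128, 128))
mask = np.ones_like(gt, dtype=bool)
print(f"3-px error: {three_px_error(est, gt, mask):.2f}%")
```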
Citations: 0
Bridging real and simulated data for cross-spatial-resolution vegetation segmentation with application to rice crops
IF 10.6 CAS Tier 1 (Earth Science) Q1 GEOGRAPHY, PHYSICAL Pub Date: 2024-10-28 DOI: 10.1016/j.isprsjprs.2024.10.007
Yangmingrui Gao, Linyuan Li, Marie Weiss, Wei Guo, Ming Shi, Hao Lu, Ruibo Jiang, Yanfeng Ding, Tejasri Nampally, P. Rajalakshmi, Frédéric Baret, Shouyang Liu
Accurate image segmentation is essential for image-based estimation of vegetation canopy traits, as it minimizes background interference. However, existing segmentation models often lack the generalization ability to effectively tackle both ground-based and aerial images across a wide range of spatial resolutions. To address this limitation, a cross-spatial-resolution image segmentation model for rice crops was trained by integrating in-situ and in silico multi-resolution images. We collected more than 3,000 RGB images (real set) covering 17 different resolutions and reflecting diverse canopy structures, illumination conditions, and backgrounds in rice fields, with vegetation pixels annotated manually. Using the previously developed Digital Plant Phenotyping Platform, we created a simulated dataset (sim set) of 10,000 RGB images with resolutions ranging from 0.5 to 3.5 mm/pixel, accompanied by corresponding mask labels. By employing a domain adaptation technique, the simulated images were further transformed into visually realistic images while preserving the original labels, creating a simulated-to-realistic dataset (sim2real set). Building upon a SegFormer deep learning model, we demonstrated that training with multi-resolution samples led to more generalized segmentation results than single-resolution training on the real dataset. Our exploration of various integration strategies revealed that a training set of 9,600 sim2real images combined with only 60 real images achieved the same segmentation accuracy as 2,400 real images (IoU = 0.819, F1 = 0.901). Moreover, combining 2,400 real images and 1,200 sim2real images produced the best-performing model, effective against six challenging situations such as specular reflections and shadows. Compared with models trained on single-resolution samples and an established model (i.e., VegANN), our model effectively improved the estimation of both green fraction and green area index across spatial resolutions. The strategy of bridging real and simulated data for cross-resolution deep learning models is expected to be applicable to other crops. The best trained model is available at https://github.com/PheniX-Lab/crossGSD-seg.
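Assuming the reported IoU = 0.819 and F1 = 0.901 follow the standard binary-segmentation definitions over vegetation masks, a minimal sketch (the masks below are hypothetical):

```python
import numpy as np

def iou_f1(pred: np.ndarray, gt: np.ndarray) -> tuple[float, float]:
    """IoU and F1 for binary vegetation masks (True = vegetation)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return float(iou), float(f1)

# Hypothetical masks: flip roughly 10% of ground-truth pixels.
rng = np.random.default_rng(0)
gt = rng.random((64, 64)) > 0.5
pred = gt ^ (rng.random((64, 64)) > 0.9)
print(iou_f1(pred, gt))
```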
Citations: 0
Cross-modal change detection using historical land use maps and current remote sensing images
IF 10.6 CAS Tier 1 (Earth Science) Q1 GEOGRAPHY, PHYSICAL Pub Date: 2024-10-24 DOI: 10.1016/j.isprsjprs.2024.10.010
Kai Deng, Xiangyun Hu, Zhili Zhang, Bo Su, Cunjun Feng, Yuanzeng Zhan, Xingkun Wang, Yansong Duan
Using bi-temporal remote sensing imagery to detect land changes in urban expansion has become common practice. However, in the process of updating land resource surveys, directly detecting changes between historical land use maps (referred to as "maps" in this paper) and current remote sensing images (referred to as "images" in this paper) is more direct and efficient than relying on bi-temporal image comparisons. The difficulty stems from the substantial modality differences between maps and images, presenting a complex challenge for effective change detection. To address this issue, we propose a novel deep learning model named the cross-modal patch alignment network (CMPANet), which bridges the gap between different modalities for cross-modal change detection (CMCD) between maps and images. Our proposed model uses a vision transformer (ViT-B/16) fine-tuned on 1.8 million remote sensing images as the encoder for images and trainable ViTs as the encoder for maps. To bridge the distribution differences between these encoders, we introduce a feature domain adaptation image-map alignment module (IMAM) to transfer and share pretrained model knowledge rapidly. Additionally, we incorporate the cross-modal and cross-channel attention (CCMAT) module and the transformer block attention module to facilitate the interaction and fusion of features across modalities. These fused features are then processed through a UperNet-based feature pyramid to generate pixel-level change maps. On the newly created EVLab-CMCD dataset and the publicly available HRSCD dataset, CMPANet achieves state-of-the-art results and offers a novel technical approach for CMCD between maps and images.
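As a generic illustration of the cross-modal interaction idea (not the paper's exact CCMAT module), scaled dot-product attention with map tokens as queries and image tokens as keys/values looks as follows; the token counts and dimensions are hypothetical:

```python
import numpy as np

def cross_modal_attention(map_feats: np.ndarray,
                          img_feats: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention: map tokens attend to image tokens.
    map_feats: (Nm, d), img_feats: (Ni, d); returns fused (Nm, d)."""
    d = map_feats.shape[-1]
    scores = map_feats @ img_feats.T / np.sqrt(d)     # (Nm, Ni)
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over image tokens
    return weights @ img_feats

# Hypothetical token sets: 16 map tokens, 32 image tokens, 64-dim each.
rng = np.random.default_rng(0)
fused = cross_modal_attention(rng.normal(size=(16, 64)),
                              rng.normal(size=(32, 64)))
print(fused.shape)  # (16, 64)
```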
Citations: 0
Nighttime fog and low stratus detection under multi-scene and all lunar phase conditions using S-NPP/VIIRS visible and infrared channels
IF 10.6 CAS Tier 1 (Earth Science) Q1 GEOGRAPHY, PHYSICAL Pub Date: 2024-10-21 DOI: 10.1016/j.isprsjprs.2024.10.014
Jun Jiang, Zhigang Yao, Yang Liu
A satellite remote sensing scheme is proposed to detect nighttime fog and low stratus (FLS) by combining visible, mid-infrared, and far-infrared channels. The S-NPP/VIIRS dataset and ERA5 reanalysis data are primarily used, and a comprehensive threshold system is established through statistical analysis, simulation calculations, and sensitivity experiments. A total of 98 nighttime FLS cases occurring from 2012 to 2020 in China, the United States, and surrounding areas are selected for algorithm validation, with global surface meteorological observations used as comparison data. Preliminary analysis of four typical cases indicates that the algorithm is temporally suitable for all lunar phase conditions from new moon to full moon at night, and spatially applicable to various types of underlying surfaces. Accuracy evaluation on 14,378 satellite-ground matched samples further shows that the algorithm performs well overall, with a POD of 0.86, CSI of 0.81, and FAR of 0.06. Accuracy is highest in winter, lowest in summer, and intermediate in spring and autumn. Missed detections and false alarms predominantly occur at cloud edges, which may be caused by parallax and the time difference between satellite and ground observations.
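POD, CSI, and FAR are standard categorical verification scores built from hits (H), misses (M), and false alarms (F); a minimal sketch with hypothetical satellite-ground matchups:

```python
import numpy as np

def pod_csi_far(pred: np.ndarray, obs: np.ndarray) -> dict:
    """Categorical scores for binary FLS flags, using the standard
    definitions: POD = H/(H+M), CSI = H/(H+M+F), FAR = F/(H+F)."""
    pred, obs = pred.astype(bool), obs.astype(bool)
    h = np.logical_and(pred, obs).sum()    # hits
    m = np.logical_and(~pred, obs).sum()   # misses
    f = np.logical_and(pred, ~obs).sum()   # false alarms
    return {
        "POD": float(h / (h + m)),
        "CSI": float(h / (h + m + f)),
        "FAR": float(f / (h + f)),
    }

# Hypothetical matchups (True = FLS present); ~8% disagreement.
rng = np.random.default_rng(0)
obs = rng.random(14378) > 0.5
pred = obs ^ (rng.random(14378) > 0.92)
print(pod_csi_far(pred, obs))
```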
Citations: 0
PRISMethaNet: A novel deep learning model for landfill methane detection using PRISMA satellite data
IF 10.6 CAS Tier 1 (Earth Science) Q1 GEOGRAPHY, PHYSICAL Pub Date: 2024-10-20 DOI: 10.1016/j.isprsjprs.2024.10.003
Mohammad Marjani, Fariba Mohammadimanesh, Daniel J. Varon, Ali Radman, Masoud Mahdianpari
Methane (CH4) is one of the most significant greenhouse gases, responsible for about one-third of climate warming since preindustrial times and originating from various sources. Landfills account for a large percentage of CH4 emissions, and population growth can boost these emissions. Therefore, it is vital to automate the process of CH4 monitoring over landfills. This study proposes a convolutional neural network (CNN) with an Atrous Spatial Pyramid Pooling (ASPP) mechanism, called PRISMethaNet, to automate the CH4 detection process using PRISMA satellite data in the 400–2500 nm spectral range. A total of 41 PRISMA images from 17 landfill sites located in India, Nigeria, Mexico, Pakistan, Iran, and other regions were used as our study areas. The PRISMethaNet model was trained using augmented data as input, with plume masks obtained from the matched filter (MF) algorithm. The proposed model successfully detected plumes with overall accuracy (OA), F1-score (F1), precision, and recall of 0.99, 0.96, 0.93, and 0.99, respectively, and quantification uncertainties ranging from 11% to 58%. An examination of the ASPP module using the Gradient-weighted Class Activation Mapping (Grad-CAM) algorithm demonstrated a strong relationship between larger dilation rates (DRs) and CH4 plume detectability. Importantly, the results highlighted that plume masks obtained by PRISMethaNet provided more accurate CH4 quantification rates than the statistical methods used in previous studies. In particular, the mean square error (MSE) for PRISMethaNet was approximately 1,102 kg/h, whereas the MSE for the commonly used statistical method was around 1,974 kg/h.
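The matched filter used to derive plume masks is, in its classical form, a per-pixel score alpha = (x - mu)^T C^{-1} t / (t^T C^{-1} t) against a background mean mu, covariance C, and target signature t. A generic numpy sketch of that formulation (the paper's exact MF variant and the toy spectra below are assumptions):

```python
import numpy as np

def matched_filter(radiance: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Classical matched-filter score per pixel.
    radiance: (N_pixels, N_bands); target: (N_bands,) CH4 signature."""
    mu = radiance.mean(axis=0)
    x = radiance - mu
    # Background covariance with a small ridge for invertibility.
    cov = np.cov(radiance, rowvar=False) + 1e-6 * np.eye(radiance.shape[1])
    c_inv_t = np.linalg.solve(cov, target)
    return x @ c_inv_t / (target @ c_inv_t)

# Hypothetical scene: 500 pixels, 20 SWIR bands, weak signal in 50 pixels.
rng = np.random.default_rng(0)
t = rng.normal(size=20)
scene = rng.normal(size=(500, 20))
scene[:50] += 0.5 * t                 # pixels containing the target
scores = matched_filter(scene, t)
print(scores[:50].mean() > scores[50:].mean())  # True: plume pixels score higher
```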
Citations: 0
Improving crop type mapping by integrating LSTM with temporal random masking and pixel-set spatial information
IF 10.6 CAS Tier 1 (Earth Science) Q1 GEOGRAPHY, PHYSICAL Pub Date: 2024-10-19 DOI: 10.1016/j.isprsjprs.2024.10.013
Xinyu Zhang, Zhiwen Cai, Qiong Hu, Jingya Yang, Haodong Wei, Liangzhi You, Baodong Xu
Accurate and timely crop type classification is essential for effective agricultural monitoring, cropland management, and yield estimation. Unfortunately, the complicated temporal patterns of different crops, combined with gaps and noise in satellite observations caused by clouds and rain, restrict crop classification accuracy, particularly during early seasons with limited temporal information. Although deep learning-based methods have exhibited great potential for improving crop type mapping, insufficient and noisy training data may lead them to overlook more generalizable features and deliver inferior classification performance. To address these challenges, we developed a Mask Pixel-set SpatioTemporal Integration Network (Mask-PSTIN), which integrates a temporal random masking technique and a novel PSTIN model. Temporal random masking augments the training data by selectively removing certain temporal information to improve data variability, forcing the model to learn more generalized features. The PSTIN, comprising a pixel-set aggregation encoder (PSAE) and a long short-term memory (LSTM) module, effectively captures comprehensive spatiotemporal features from time-series satellite images. The effectiveness of Mask-PSTIN was evaluated across three regions with different landscapes and cropping systems. Results demonstrated that the addition of the PSAE in PSTIN significantly improved crop classification accuracy compared to a basic LSTM, with average overall accuracy (OA) increasing from 80.9% to 83.9% and the mean F1-score (mF1) rising from 0.781 to 0.818. Incorporating temporal random masking in training led to further improvements, increasing average OA and mF1 to 87.4% and 0.865, respectively. Mask-PSTIN significantly outperformed traditional machine learning and deep learning methods (i.e., RF, SVM, Transformer, and CNN-LSTM) in crop type mapping across all three regions. Furthermore, compared with machine learning models, Mask-PSTIN enabled earlier and more accurate crop type identification before or during the crops' developing stages. Feature importance analysis based on the gradient backpropagation algorithm revealed that Mask-PSTIN effectively leveraged multi-temporal features, exhibiting broader attention across time steps and capturing critical crop phenological characteristics. These results suggest that Mask-PSTIN is a promising approach for improving both post-harvest and early-season crop type classification, with potential applications in agricultural management and monitoring.
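A minimal sketch of the temporal-random-masking idea: randomly zeroing whole time steps of a pixel-set time series so the model cannot rely on any single date (the masking ratio and zero-fill choice below are assumptions, not the paper's exact settings):

```python
import numpy as np

def temporal_random_mask(series: np.ndarray, mask_ratio: float = 0.3,
                         rng=None) -> np.ndarray:
    """Zero out randomly chosen time steps of a (T, n_pixels, n_bands)
    pixel-set time series."""
    rng = rng or np.random.default_rng()
    t = series.shape[0]
    n_mask = int(round(t * mask_ratio))
    masked = series.copy()
    drop = rng.choice(t, size=n_mask, replace=False)
    masked[drop] = 0.0
    return masked

# Hypothetical Sentinel-2-like series: 24 dates, 16 pixels, 10 bands.
rng = np.random.default_rng(0)
x = rng.random((24, 16, 10))
aug = temporal_random_mask(x, 0.3, rng)
print(np.count_nonzero(aug.sum(axis=(1, 2)) == 0))  # 7 masked dates
```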
Citations: 0
Generalized spatio-temporal-spectral integrated fusion for soil moisture downscaling
IF 10.6 CAS Tier 1 (Earth Science) Q1 GEOGRAPHY, PHYSICAL Pub Date: 2024-10-19 DOI: 10.1016/j.isprsjprs.2024.10.012
Menghui Jiang, Huanfeng Shen, Jie Li, Liangpei Zhang
Soil moisture (SM) is one of the key land surface parameters, but the coarse spatial resolution of passive microwave SM products constrains the precise monitoring of surface changes. Existing SM downscaling methods typically either utilize spatio-temporal information or leverage auxiliary parameters, without fully mining the complementary information between them. In this paper, a generalized spatio-temporal-spectral integrated fusion-based downscaling method is proposed to fully utilize the complementary features between multi-source auxiliary parameters and multi-temporal SM data. Specifically, we define the spectral characteristic of geographic objects as an assemblage of diverse attribute characteristics at specific spatio-temporal locations and scales. On this basis, SM-related auxiliary parameter data can be treated as generalized spectral characteristics of SM, and a generalized spatio-temporal-spectral integrated fusion framework is proposed to integrate the spatio-temporal features of the SM products with the generalized spectral features from the auxiliary parameters to generate fine-spatial-resolution SM data of high quality. In addition, considering the high heterogeneity of multi-source data, the proposed framework is built on a spatio-temporal constrained cycle generative adversarial network (STC-CycleGAN). The STC-CycleGAN network comprises a forward integrated fusion stage and a backward spatio-temporal constraint stage, between which spatio-temporal cycle-consistent constraints are formed. Extensive experiments were conducted on Soil Moisture Active Passive (SMAP) SM products. The qualitative, quantitative, and in-situ verification results demonstrate the capability of the proposed method to mine the complementary information of multi-source data and achieve high-accuracy downscaling of global daily SM data from 36 km to 9 km.
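The spatio-temporal cycle-consistent constraint is, at its core, the CycleGAN-style requirement that mapping forward (coarse to fine) and back should reproduce the input; a minimal sketch with hypothetical stand-in operators for the learned generators:

```python
import numpy as np

def cycle_consistency_loss(x_coarse: np.ndarray, forward, backward) -> float:
    """L1 cycle loss ||backward(forward(x)) - x||_1. `forward` plays the
    36 km -> 9 km generator and `backward` the re-aggregation; both are
    stand-ins for the learned networks."""
    return float(np.mean(np.abs(backward(forward(x_coarse)) - x_coarse)))

# Hypothetical stand-ins: 4x block replication and 4x block averaging,
# mimicking the 36 km -> 9 km scale factor.
up = lambda x: np.kron(x, np.ones((4, 4)))
down = lambda y: y.reshape(y.shape[0] // 4, 4, -1, 4).mean(axis=(1, 3))

sm = np.random.default_rng(0).random((10, 10))   # coarse SM field
print(cycle_consistency_loss(sm, up, down))      # 0.0 for this ideal pair
```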
Citations: 0