Semantic guided large scale factor remote sensing image super-resolution with generative diffusion prior

IF 10.6 | Zone 1, Earth Sciences | Q1 GEOGRAPHY, PHYSICAL | ISPRS Journal of Photogrammetry and Remote Sensing | Pub Date: 2025-02-01 | DOI: 10.1016/j.isprsjprs.2024.12.001

Ce Wang, Wanjie Sun

ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220, Pages 125-138. Publication type: Journal Article. Article page: https://www.sciencedirect.com/science/article/pii/S0924271624004714

Citations: 0

Abstract

In the realm of remote sensing, images captured by different platforms exhibit significant disparities in spatial resolution. Consequently, effective large scale factor super-resolution (SR) algorithms are vital for maximizing the utilization of low-resolution (LR) satellite data captured from orbit. However, existing methods confront challenges such as semantic inaccuracies and blurry textures in the reconstructed images. To tackle these issues, we introduce a novel framework, the Semantic Guided Diffusion Model (SGDM), designed for large scale factor remote sensing image super-resolution. The framework exploits a pre-trained generative model as a prior to generate perceptually plausible high-resolution (HR) images, thereby constraining the solution space and mitigating texture blurriness. We further incorporate vector maps, which carry structural and semantic cues, to improve the reconstruction fidelity of ground objects. Moreover, pixel-level inconsistencies in paired remote sensing images, stemming from sensor-specific imaging characteristics, may hinder model convergence and limit the diversity of the generated results. To address this problem, we develop a method to extract sensor-specific imaging characteristics and model their distribution. The proposed model can decouple imaging characteristics from image content, allowing it to generate diverse super-resolution images based on imaging characteristics provided by reference satellite images or sampled from the imaging characteristic probability distributions. To validate and evaluate our approach, we create the Cross-Modal Super-Resolution Dataset (CMSRD). Qualitative and quantitative experiments on CMSRD showcase the superiority and broad applicability of our method. Experimental results on downstream vision tasks also demonstrate the utility of the generated SR images. The dataset and code will be publicly available at https://github.com/wwangcece/SGDM.
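The abstract mentions two ways of obtaining imaging characteristics at generation time: taking them from a reference satellite image, or sampling them from a modeled probability distribution. The sketch below only illustrates that distinction and is not the SGDM implementation. It assumes, purely for illustration, that "imaging characteristics" can be stood in for by per-channel mean and standard deviation and that their distribution is an independent Gaussian per dimension; the function names `extract_characteristics`, `fit_gaussian`, and `sample_characteristics` are hypothetical, and the reference images are synthetic.

```python
import numpy as np


def extract_characteristics(image: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: per-channel mean and std as a style vector.

    image has shape (H, W, C); the result has shape (2 * C,).
    """
    mean = image.mean(axis=(0, 1))
    std = image.std(axis=(0, 1))
    return np.concatenate([mean, std])


def fit_gaussian(vectors: np.ndarray):
    """Fit an independent Gaussian per dimension (an assumed, simple model)."""
    return vectors.mean(axis=0), vectors.std(axis=0)


def sample_characteristics(mu, sigma, rng):
    """Draw one style vector from the fitted per-dimension Gaussian."""
    return rng.normal(mu, sigma)


rng = np.random.default_rng(0)
references = rng.random((8, 16, 16, 3))  # synthetic "reference images"

# Collect one style vector per reference image and model their distribution.
styles = np.stack([extract_characteristics(img) for img in references])
mu, sigma = fit_gaussian(styles)

# Option 1: characteristics taken from a specific reference image.
style_from_reference = extract_characteristics(references[0])
# Option 2: characteristics sampled from the modeled distribution.
style_sampled = sample_characteristics(mu, sigma, rng)
```

Either vector could then condition a generator, so that image content and sensor-specific appearance are supplied separately; how SGDM actually encodes and injects these characteristics is described in the paper, not here.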

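Source journal

ISPRS Journal of Photogrammetry and Remote Sensing (Engineering and Technology: Imaging Science and Photographic Technology)

CiteScore: 21.00

Self-citation rate: 6.30%

Articles published: 273

Review time: 40 days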

Journal description: The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive. P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields. In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.