MetaEarth: A Generative Foundation Model for Global-Scale Remote Sensing Image Generation

IF 18.6 IEEE transactions on pattern analysis and machine intelligence Pub Date : 2024-11-26 DOI:10.1109/TPAMI.2024.3507010

Zhiping Yu;Chenyang Liu;Liqin Liu;Zhenwei Shi;Zhengxia Zou

Journal Article. IEEE transactions on pattern analysis and machine intelligence, vol. 47, no. 3, pp. 1764-1781. https://ieeexplore.ieee.org/document/10768939/

Citations: 0

Abstract

The recent advancement of generative foundation models has ushered in a new era of image generation in the realm of natural images, revolutionizing art design, entertainment, environment simulation, and beyond. Despite producing high-quality samples, existing methods are constrained to generating images of scenes at a limited scale. In this paper, we present MetaEarth, a generative foundation model that breaks this barrier by scaling image generation to a global level, exploring the creation of worldwide, multi-resolution, unbounded, and virtually limitless remote sensing images. In MetaEarth, we propose a resolution-guided self-cascading generative framework, which enables the generation of images of any region at a wide range of geographical resolutions. To achieve unbounded and arbitrary-sized image generation, we design a novel noise sampling strategy for denoising diffusion models by analyzing the generation conditions and initial noise. To train MetaEarth, we construct a large dataset comprising multi-resolution optical remote sensing images with geographical information. Experiments demonstrate the powerful capabilities of our method in generating global-scale images. Additionally, MetaEarth serves as a data engine that can provide high-quality and rich training data for downstream tasks. Our model opens up new possibilities for constructing generative world models by simulating Earth's visuals from an innovative overhead perspective.



