On the Opportunities and Challenges of Foundation Models for GeoAI (Vision Paper)

IF 17.7 1区化学 Q1 CHEMISTRY, MULTIDISCIPLINARY Accounts of Chemical Research Pub Date : 2024-03-20 DOI:10.1145/3653070

Gengchen Mai, Weiming Huang, Jin Sun, Suhang Song, Deepak Mishra, Ninghao Liu, Song Gao, Tianming Liu, Gao Cong, Yingjie Hu, Chris Cundy, Ziyuan Li, Rui Zhu, Ni Lao

{"title":"On the Opportunities and Challenges of Foundation Models for GeoAI (Vision Paper)","authors":"Gengchen Mai, Weiming Huang, Jin Sun, Suhang Song, Deepak Mishra, Ninghao Liu, Song Gao, Tianming Liu, Gao Cong, Yingjie Hu, Chris Cundy, Ziyuan Li, Rui Zhu, Ni Lao","doi":"10.1145/3653070","DOIUrl":null,"url":null,"abstract":"\n Large pre-trained models, also known as\n foundation models\n (FMs), are trained in a task-agnostic manner on large-scale data and can be adapted to a wide range of downstream tasks by fine-tuning, few-shot, or even zero-shot learning. Despite their successes in language and vision tasks, we have yet seen an attempt to develop foundation models for geospatial artificial intelligence (GeoAI). In this work, we explore the promises and challenges of developing multimodal foundation models for GeoAI. We first investigate the potential of many existing FMs by testing their performances on seven tasks across multiple geospatial domains including Geospatial Semantics, Health Geography, Urban Geography, and Remote Sensing. Our results indicate that on several geospatial tasks that only involve text modality such as toponym recognition, location description recognition, and US state-level/county-level dementia time series forecasting, the task-agnostic LLMs can outperform task-specific fully-supervised models in a zero-shot or few-shot learning setting. However, on other geospatial tasks, especially tasks that involve multiple data modalities (e.g., POI-based urban function classification, street view image-based urban noise intensity classification, and remote sensing image scene classification), existing foundation models still underperform task-specific models. Based on these observations, we propose that one of the major challenges of developing a foundation model for GeoAI is to address the multimodality nature of geospatial tasks. After discussing the distinct challenges of each geospatial data modality, we suggest the possibility of a multimodal foundation model which can reason over various types of geospatial data through geospatial alignments. We conclude this paper by discussing the unique risks and challenges to develop such a model for GeoAI.\n","PeriodicalId":1,"journal":{"name":"Accounts of Chemical Research","volume":"32 31","pages":""},"PeriodicalIF":17.7000,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Accounts of Chemical Research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3653070","RegionNum":1,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CHEMISTRY, MULTIDISCIPLINARY","Score":null,"Total":0}

引用次数: 0

Abstract

Large pre-trained models, also known as foundation models (FMs), are trained in a task-agnostic manner on large-scale data and can be adapted to a wide range of downstream tasks by fine-tuning, few-shot, or even zero-shot learning. Despite their successes in language and vision tasks, we have yet seen an attempt to develop foundation models for geospatial artificial intelligence (GeoAI). In this work, we explore the promises and challenges of developing multimodal foundation models for GeoAI. We first investigate the potential of many existing FMs by testing their performances on seven tasks across multiple geospatial domains including Geospatial Semantics, Health Geography, Urban Geography, and Remote Sensing. Our results indicate that on several geospatial tasks that only involve text modality such as toponym recognition, location description recognition, and US state-level/county-level dementia time series forecasting, the task-agnostic LLMs can outperform task-specific fully-supervised models in a zero-shot or few-shot learning setting. However, on other geospatial tasks, especially tasks that involve multiple data modalities (e.g., POI-based urban function classification, street view image-based urban noise intensity classification, and remote sensing image scene classification), existing foundation models still underperform task-specific models. Based on these observations, we propose that one of the major challenges of developing a foundation model for GeoAI is to address the multimodality nature of geospatial tasks. After discussing the distinct challenges of each geospatial data modality, we suggest the possibility of a multimodal foundation model which can reason over various types of geospatial data through geospatial alignments. We conclude this paper by discussing the unique risks and challenges to develop such a model for GeoAI.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

论 GeoAI 基础模型的机遇与挑战（远景规划论文）

大型预训练模型（也称为基础模型（FM））是在大规模数据上以与任务无关的方式进行训练的，可以通过微调、少量学习甚至零点学习来适应各种下游任务。尽管在语言和视觉任务中取得了成功，但我们尚未看到为地理空间人工智能（GeoAI）开发基础模型的尝试。在这项工作中，我们将探索为 GeoAI 开发多模态基础模型的前景和挑战。我们首先调查了许多现有基础模型的潜力，测试了它们在多个地理空间领域（包括地理空间语义学、健康地理学、城市地理学和遥感）的七项任务中的表现。我们的结果表明，在地名识别、位置描述识别和美国州级/县级痴呆症时间序列预测等只涉及文本模式的地理空间任务中，任务无关的 LLM 在零点学习或少点学习环境下的表现优于特定任务的完全监督模型。然而，在其他地理空间任务中，特别是涉及多种数据模式的任务（如基于 POI 的城市功能分类、基于街景图像的城市噪声强度分类和遥感图像场景分类），现有的基础模型仍然不如特定任务模型。基于这些观察结果，我们提出为 GeoAI 开发基础模型的主要挑战之一是解决地理空间任务的多模态特性。在讨论了每种地理空间数据模式的独特挑战之后，我们提出了多模式基础模型的可能性，该模型可以通过地理空间排列对各种类型的地理空间数据进行推理。最后，我们讨论了为 GeoAI 开发这种模型的独特风险和挑战。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Accounts of Chemical Research 化学-化学综合

CiteScore

31.40

自引率

1.10%

发文量

312

审稿时长

2 months

期刊介绍： Accounts of Chemical Research presents short, concise and critical articles offering easy-to-read overviews of basic research and applications in all areas of chemistry and biochemistry. These short reviews focus on research from the author’s own laboratory and are designed to teach the reader about a research project. In addition, Accounts of Chemical Research publishes commentaries that give an informed opinion on a current research problem. Special Issues online are devoted to a single topic of unusual activity and significance. Accounts of Chemical Research replaces the traditional article abstract with an article "Conspectus." These entries synopsize the research affording the reader a closer look at the content and significance of an article. Through this provision of a more detailed description of the article contents, the Conspectus enhances the article's discoverability by search engines and the exposure for the research.