Geospatial large language model trained with a simulated environment for generating tool-use chains autonomously

International journal of applied earth observation and geoinformation : ITC journal (IF 7.6, JCR Q1, Remote Sensing) Pub Date: 2025-02-01 DOI: 10.1016/j.jag.2024.104312

Yifan Zhang, Jingxuan Li, Zhiyun Wang, Zhengting He, Qingfeng Guan, Jianfeng Lin, Wenhao Yu

Volume 136, Article 104312. Full text: https://www.sciencedirect.com/science/article/pii/S1569843224006708

Citations: 0

Abstract

Solving geospatial tasks generally requires multiple geospatial tools and steps, i.e., tool-use chains. Automating the geospatial task-solving process can improve the efficiency of GIS users. Traditionally, researchers have designed rule-based systems to solve similar geospatial tasks autonomously, but such systems are inflexible and difficult to adapt to different tasks. With the development of Large Language Models (LLMs), some research suggests that LLMs have the potential for intelligent task solving through their tool-use ability, that is, the ability to invoke externally provided tools for specific tasks. However, most studies rely on closed-source commercial LLMs like ChatGPT and GPT-4, whose limited API accessibility restricts their deployment on local private devices. Some researchers in the general domain have proposed using instruction tuning to improve the tool-use ability of open-source LLMs. However, solving geospatial tasks requires tool-use chains that involve multiple data input and output processes, which makes it challenging to collect effective instruction tuning data. To address these challenges, we propose a framework for training a Geospatial large language model to generate Tool-use Chains autonomously (GTChain). Specifically, we design a seed task-guided self-instruct strategy to generate a geospatial tool-use instruction tuning dataset within a simulated environment, encompassing diverse geospatial task production and corresponding tool-use chain generation. Subsequently, an open-source general-domain LLM, LLaMA-2-7B, is fine-tuned on the collected instruction data to understand geospatial tasks and learn how to generate geospatial tool-use chains. Finally, we also collect an evaluation dataset to serve as a benchmark for assessing the geospatial tool-use ability of LLMs. Experimental results on the evaluation dataset demonstrate that the fine-tuned GTChain can effectively solve geospatial tasks using the provided tools, achieving a 32.5% and 27.5% higher percentage of correctly solved tasks than GPT-4 and Gemini 1.5 Pro, respectively.
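To make the idea of a tool-use chain checked inside a simulated environment more concrete, the following is a minimal Python sketch. It is not the authors' implementation: the abstract does not describe the environment's interface, so the tool names (`load_layer`, `buffer`, `intersect`, `area_sum`), the `ToolCall` structure, and the validity rule (every tool must exist and every input must have been produced by an earlier step) are all assumptions made for illustration. The last function computes the percentage of correctly solved tasks, the metric named in the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ToolCall:
    """One step of a tool-use chain (hypothetical structure)."""
    tool: str                 # name of the tool to invoke
    inputs: Tuple[str, ...]   # names of data items the step consumes
    output: str               # name of the data item the step produces


# Hypothetical tool registry: tool name -> number of data inputs it expects.
TOOL_ARITY: Dict[str, int] = {
    "load_layer": 0,
    "buffer": 1,
    "intersect": 2,
    "area_sum": 1,
}


def run_chain(chain: List[ToolCall]) -> bool:
    """Simulate a chain without touching real data.

    The chain is accepted only if every tool is known, receives the
    expected number of inputs, and reads data produced by an earlier step.
    """
    available = set()
    for step in chain:
        if step.tool not in TOOL_ARITY:
            return False
        if len(step.inputs) != TOOL_ARITY[step.tool]:
            return False
        if any(name not in available for name in step.inputs):
            return False
        available.add(step.output)
    return True


def solved_percentage(results: List[bool]) -> float:
    """Percentage of tasks whose chain was judged correct."""
    if not results:
        return 0.0
    return 100.0 * sum(results) / len(results)


# A chain for a made-up task: "total park area within a buffer of the roads".
good_chain = [
    ToolCall("load_layer", (), "roads"),
    ToolCall("load_layer", (), "parks"),
    ToolCall("buffer", ("roads",), "roads_buffer"),
    ToolCall("intersect", ("roads_buffer", "parks"), "overlap"),
    ToolCall("area_sum", ("overlap",), "answer"),
]

# An invalid chain: it buffers a layer that was never loaded.
bad_chain = [ToolCall("buffer", ("roads",), "roads_buffer")]

results = [run_chain(good_chain), run_chain(bad_chain)]
print(results, solved_percentage(results))
```

A check of this kind only verifies that data flows consistently between steps; judging whether a chain actually answers the task would need reference outputs or reference chains, which the abstract does not detail.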




Source journal

International journal of applied earth observation and geoinformation : ITC journal
Subject areas: Global and Planetary Change; Management, Monitoring, Policy and Law; Earth-Surface Processes; Computers in Earth Sciences

CiteScore

12.00

Self-citation rate

0.00%

Number of articles published

Review time

77 days

Journal description: The International Journal of Applied Earth Observation and Geoinformation publishes original papers that utilize earth observation data for natural resource and environmental inventory and management. These data primarily originate from remote sensing platforms, including satellites and aircraft, supplemented by surface and subsurface measurements. Addressing natural resources such as forests, agricultural land, soils, and water, as well as environmental concerns like biodiversity, land degradation, and hazards, the journal explores conceptual and data-driven approaches. It covers geoinformation themes like capturing, databasing, visualization, interpretation, data quality, and spatial uncertainty.