E. Nyamsuren, Haiqi Xu, Eric Top, S. Scheider, N. Steenbergen
{"title":"地理问题的语义复杂性——答案概念转换方面的比较","authors":"E. Nyamsuren, Haiqi Xu, Eric Top, S. Scheider, N. Steenbergen","doi":"10.5194/agile-giss-4-10-2023","DOIUrl":null,"url":null,"abstract":"Abstract. There is an increasing trend of applying AIbased automated methods to geoscience problems. An important example is a geographic question answering (geoQA) focused on answer generation via GIS workflows rather than retrieval of a factual answer. However, a representative question corpus is necessary for developing, testing, and validating such generative geoQA systems. We compare five manually constructed geographical question corpora, GeoAnQu, Giki, GeoCLEF, GeoQuestions201, and Geoquery, by applying a conceptual transformation parser. The parser infers geo-analytical concepts and their transformations from a geographical question, akin to an abstract GIS workflow. Transformations thus represent the complexity of geo-analytical operations necessary to answer a question. By estimating the variety of concepts and the number of transformations for each corpus, the five corpora can be compared on the level of geo-analytical complexity, which cannot be done with purely NLP-based methods. Results indicate that the questions in GeoAnQu, which were compiled from GIS literature, require a higher number as well as more diverse geo-analytical operations than questions from the four other corpora. Furthermore, constructing a corpus with a sufficient representation (including GIS) may require an approach targeting a uniquely qualified group of users as a source. In contrast, sampling questions from large-scale online repositories like Google, Microsoft, and Yahoo may not provide the quality necessary for testing generative geoQA systems. \n","PeriodicalId":116168,"journal":{"name":"AGILE: GIScience Series","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Semantic complexity of geographic questions - A comparison in terms of conceptual transformations of answers\",\"authors\":\"E. Nyamsuren, Haiqi Xu, Eric Top, S. Scheider, N. Steenbergen\",\"doi\":\"10.5194/agile-giss-4-10-2023\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract. There is an increasing trend of applying AIbased automated methods to geoscience problems. An important example is a geographic question answering (geoQA) focused on answer generation via GIS workflows rather than retrieval of a factual answer. However, a representative question corpus is necessary for developing, testing, and validating such generative geoQA systems. We compare five manually constructed geographical question corpora, GeoAnQu, Giki, GeoCLEF, GeoQuestions201, and Geoquery, by applying a conceptual transformation parser. The parser infers geo-analytical concepts and their transformations from a geographical question, akin to an abstract GIS workflow. Transformations thus represent the complexity of geo-analytical operations necessary to answer a question. By estimating the variety of concepts and the number of transformations for each corpus, the five corpora can be compared on the level of geo-analytical complexity, which cannot be done with purely NLP-based methods. Results indicate that the questions in GeoAnQu, which were compiled from GIS literature, require a higher number as well as more diverse geo-analytical operations than questions from the four other corpora. 
Furthermore, constructing a corpus with a sufficient representation (including GIS) may require an approach targeting a uniquely qualified group of users as a source. In contrast, sampling questions from large-scale online repositories like Google, Microsoft, and Yahoo may not provide the quality necessary for testing generative geoQA systems. \\n\",\"PeriodicalId\":116168,\"journal\":{\"name\":\"AGILE: GIScience Series\",\"volume\":\"2016 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-06-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"AGILE: GIScience Series\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5194/agile-giss-4-10-2023\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"AGILE: GIScience Series","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5194/agile-giss-4-10-2023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Semantic complexity of geographic questions - A comparison in terms of conceptual transformations of answers
Abstract. There is an increasing trend of applying AI-based automated methods to geoscience problems. An important example is geographic question answering (geoQA) focused on generating answers via GIS workflows rather than retrieving factual answers. However, a representative question corpus is necessary for developing, testing, and validating such generative geoQA systems. We compare five manually constructed geographical question corpora, GeoAnQu, Giki, GeoCLEF, GeoQuestions201, and Geoquery, by applying a conceptual transformation parser. The parser infers geo-analytical concepts and their transformations from a geographical question, akin to an abstract GIS workflow. Transformations thus represent the complexity of the geo-analytical operations necessary to answer a question. By estimating the variety of concepts and the number of transformations for each corpus, the five corpora can be compared in terms of geo-analytical complexity, which cannot be done with purely NLP-based methods. Results indicate that the questions in GeoAnQu, which were compiled from GIS literature, require more numerous and more diverse geo-analytical operations than questions from the four other corpora. Furthermore, constructing a corpus with sufficient representation (including GIS) may require an approach that targets a uniquely qualified group of users as a source. In contrast, sampling questions from large-scale online repositories such as those of Google, Microsoft, and Yahoo may not provide the quality necessary for testing generative geoQA systems.
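To make the comparison metric concrete, the following Python sketch illustrates how concept variety and transformation counts could be aggregated per corpus once questions have been parsed. It is not the authors' parser or measure; the data structures, function names, and the toy parse are illustrative assumptions based only on the abstract's description.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ParsedQuestion:
    # Hypothetical output of a conceptual transformation parser:
    # the geo-analytical concepts inferred from a question and the
    # transformations (concept -> concept steps) needed to answer it.
    concepts: list[str]
    transformations: list[tuple[str, str]]


def corpus_complexity(parses: list[ParsedQuestion]) -> dict:
    """Summarise a corpus by concept variety and transformation counts."""
    concept_counts = Counter(c for p in parses for c in p.concepts)
    n_transformations = [len(p.transformations) for p in parses]
    return {
        "distinct_concepts": len(concept_counts),
        "mean_transformations": sum(n_transformations) / len(n_transformations),
        "max_transformations": max(n_transformations),
    }


# Toy example: an assumed parse of
# "What is the average noise level within 500 m of schools in Utrecht?"
example = ParsedQuestion(
    concepts=["object:school", "field:noise", "region:Utrecht", "amount"],
    transformations=[
        ("object:school", "region:buffer"),
        ("field:noise", "amount:average"),
    ],
)
print(corpus_complexity([example]))
```

Under these assumptions, a corpus whose parses yield more distinct concepts and more transformations per question would rank as more geo-analytically complex, which is the kind of comparison the abstract describes between GeoAnQu and the other four corpora.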