E. Nyamsuren, Haiqi Xu, Eric Top, S. Scheider, N. Steenbergen
{"title":"地理问题的语义复杂性——答案概念转换方面的比较","authors":"E. Nyamsuren, Haiqi Xu, Eric Top, S. Scheider, N. Steenbergen","doi":"10.5194/agile-giss-4-10-2023","DOIUrl":null,"url":null,"abstract":"Abstract. There is an increasing trend of applying AIbased automated methods to geoscience problems. An important example is a geographic question answering (geoQA) focused on answer generation via GIS workflows rather than retrieval of a factual answer. However, a representative question corpus is necessary for developing, testing, and validating such generative geoQA systems. We compare five manually constructed geographical question corpora, GeoAnQu, Giki, GeoCLEF, GeoQuestions201, and Geoquery, by applying a conceptual transformation parser. The parser infers geo-analytical concepts and their transformations from a geographical question, akin to an abstract GIS workflow. Transformations thus represent the complexity of geo-analytical operations necessary to answer a question. By estimating the variety of concepts and the number of transformations for each corpus, the five corpora can be compared on the level of geo-analytical complexity, which cannot be done with purely NLP-based methods. Results indicate that the questions in GeoAnQu, which were compiled from GIS literature, require a higher number as well as more diverse geo-analytical operations than questions from the four other corpora. Furthermore, constructing a corpus with a sufficient representation (including GIS) may require an approach targeting a uniquely qualified group of users as a source. In contrast, sampling questions from large-scale online repositories like Google, Microsoft, and Yahoo may not provide the quality necessary for testing generative geoQA systems. \n","PeriodicalId":116168,"journal":{"name":"AGILE: GIScience Series","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Semantic complexity of geographic questions - A comparison in terms of conceptual transformations of answers\",\"authors\":\"E. Nyamsuren, Haiqi Xu, Eric Top, S. Scheider, N. Steenbergen\",\"doi\":\"10.5194/agile-giss-4-10-2023\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract. There is an increasing trend of applying AIbased automated methods to geoscience problems. An important example is a geographic question answering (geoQA) focused on answer generation via GIS workflows rather than retrieval of a factual answer. However, a representative question corpus is necessary for developing, testing, and validating such generative geoQA systems. We compare five manually constructed geographical question corpora, GeoAnQu, Giki, GeoCLEF, GeoQuestions201, and Geoquery, by applying a conceptual transformation parser. The parser infers geo-analytical concepts and their transformations from a geographical question, akin to an abstract GIS workflow. Transformations thus represent the complexity of geo-analytical operations necessary to answer a question. By estimating the variety of concepts and the number of transformations for each corpus, the five corpora can be compared on the level of geo-analytical complexity, which cannot be done with purely NLP-based methods. Results indicate that the questions in GeoAnQu, which were compiled from GIS literature, require a higher number as well as more diverse geo-analytical operations than questions from the four other corpora. 
Furthermore, constructing a corpus with a sufficient representation (including GIS) may require an approach targeting a uniquely qualified group of users as a source. In contrast, sampling questions from large-scale online repositories like Google, Microsoft, and Yahoo may not provide the quality necessary for testing generative geoQA systems. \\n\",\"PeriodicalId\":116168,\"journal\":{\"name\":\"AGILE: GIScience Series\",\"volume\":\"2016 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-06-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"AGILE: GIScience Series\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5194/agile-giss-4-10-2023\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"AGILE: GIScience Series","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5194/agile-giss-4-10-2023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Semantic complexity of geographic questions - A comparison in terms of conceptual transformations of answers
Abstract. There is an increasing trend of applying AI-based automated methods to geoscience problems. An important example is geographic question answering (geoQA) focused on generating answers via GIS workflows rather than retrieving factual answers. However, a representative question corpus is necessary for developing, testing, and validating such generative geoQA systems. We compare five manually constructed geographical question corpora, GeoAnQu, Giki, GeoCLEF, GeoQuestions201, and Geoquery, by applying a conceptual transformation parser. The parser infers geo-analytical concepts and their transformations from a geographical question, akin to an abstract GIS workflow. Transformations thus represent the complexity of the geo-analytical operations necessary to answer a question. By estimating the variety of concepts and the number of transformations for each corpus, the five corpora can be compared in terms of geo-analytical complexity, which cannot be done with purely NLP-based methods. Results indicate that the questions in GeoAnQu, which were compiled from GIS literature, require more numerous and more diverse geo-analytical operations than questions from the four other corpora. Furthermore, constructing a corpus with sufficient representation (including GIS) may require an approach that targets a uniquely qualified group of users as a source. In contrast, sampling questions from large-scale online repositories such as those of Google, Microsoft, and Yahoo may not provide the quality necessary for testing generative geoQA systems.
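To make the comparison metric concrete, the following Python sketch illustrates how concept variety and transformation counts could be aggregated per corpus once questions have been parsed. It is not the authors' parser or measure; the data structures, function names, and the toy parse are illustrative assumptions based only on the abstract's description.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ParsedQuestion:
    # Hypothetical output of a conceptual transformation parser:
    # the geo-analytical concepts inferred from a question and the
    # transformations (concept -> concept steps) needed to answer it.
    concepts: list[str]
    transformations: list[tuple[str, str]]


def corpus_complexity(parses: list[ParsedQuestion]) -> dict:
    """Summarise a corpus by concept variety and transformation counts."""
    concept_counts = Counter(c for p in parses for c in p.concepts)
    n_transformations = [len(p.transformations) for p in parses]
    return {
        "distinct_concepts": len(concept_counts),
        "mean_transformations": sum(n_transformations) / len(n_transformations),
        "max_transformations": max(n_transformations),
    }


# Toy example: an assumed parse of
# "What is the average noise level within 500 m of schools in Utrecht?"
example = ParsedQuestion(
    concepts=["object:school", "field:noise", "region:Utrecht", "amount"],
    transformations=[
        ("object:school", "region:buffer"),
        ("field:noise", "amount:average"),
    ],
)
print(corpus_complexity([example]))
```

Under these assumptions, a corpus whose parses yield more distinct concepts and more transformations per question would rank as more geo-analytically complex, which is the kind of comparison the abstract describes between GeoAnQu and the other four corpora.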