{"title":"PeCoQ:基于知识图谱的波斯语复杂问题回答数据集","authors":"Romina Etezadi, M. Shamsfard","doi":"10.1109/IKT51791.2020.9345610","DOIUrl":null,"url":null,"abstract":"Question answering systems may find the answers to users' questions from either unstructured texts or structured data such as knowledge graphs. Answering questions using supervised learning approaches including deep learning models need large training datasets. In recent years, some datasets have been presented for the task of Question answering over knowledge graphs, which is the focus of this paper. Although many datasets in English were proposed, there have been a few question answering datasets in Persian. This paper introduces PeCoQ, a dataset for Persian question answering. This dataset contains 10,000 complex questions and answers extracted from the Persian knowledge graph, FarsBase. For each question, the SPARQL query and two paraphrases that were written by linguists are provided as well. There are different types of complexities in the dataset, such as multi-relation, multi-entity, ordinal, and temporal constraints. In this paper, we discuss the dataset's characteristics and describe our methodolozv for buildinz it.","PeriodicalId":382725,"journal":{"name":"2020 11th International Conference on Information and Knowledge Technology (IKT)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"PeCoQ: A Dataset for Persian Complex Question Answering over Knowledge Graph\",\"authors\":\"Romina Etezadi, M. Shamsfard\",\"doi\":\"10.1109/IKT51791.2020.9345610\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Question answering systems may find the answers to users' questions from either unstructured texts or structured data such as knowledge graphs. Answering questions using supervised learning approaches including deep learning models need large training datasets. In recent years, some datasets have been presented for the task of Question answering over knowledge graphs, which is the focus of this paper. Although many datasets in English were proposed, there have been a few question answering datasets in Persian. This paper introduces PeCoQ, a dataset for Persian question answering. This dataset contains 10,000 complex questions and answers extracted from the Persian knowledge graph, FarsBase. For each question, the SPARQL query and two paraphrases that were written by linguists are provided as well. There are different types of complexities in the dataset, such as multi-relation, multi-entity, ordinal, and temporal constraints. In this paper, we discuss the dataset's characteristics and describe our methodolozv for buildinz it.\",\"PeriodicalId\":382725,\"journal\":{\"name\":\"2020 11th International Conference on Information and Knowledge Technology (IKT)\",\"volume\":\"3 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 11th International Conference on Information and Knowledge Technology (IKT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IKT51791.2020.9345610\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 11th International Conference on Information and Knowledge Technology (IKT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IKT51791.2020.9345610","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
PeCoQ: A Dataset for Persian Complex Question Answering over Knowledge Graph
Question answering systems may find the answers to users' questions from either unstructured texts or structured data such as knowledge graphs. Answering questions using supervised learning approaches including deep learning models need large training datasets. In recent years, some datasets have been presented for the task of Question answering over knowledge graphs, which is the focus of this paper. Although many datasets in English were proposed, there have been a few question answering datasets in Persian. This paper introduces PeCoQ, a dataset for Persian question answering. This dataset contains 10,000 complex questions and answers extracted from the Persian knowledge graph, FarsBase. For each question, the SPARQL query and two paraphrases that were written by linguists are provided as well. There are different types of complexities in the dataset, such as multi-relation, multi-entity, ordinal, and temporal constraints. In this paper, we discuss the dataset's characteristics and describe our methodolozv for buildinz it.