PeCoQ: A Dataset for Persian Complex Question Answering over Knowledge Graph

Romina Etezadi, M. Shamsfard
2020 11th International Conference on Information and Knowledge Technology (IKT), published 2020-12-22
DOI: 10.1109/IKT51791.2020.9345610 (https://doi.org/10.1109/IKT51791.2020.9345610)
Citations: 10

Abstract

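Question answering systems can find answers to users' questions in either unstructured text or structured data such as knowledge graphs. Supervised learning approaches to question answering, including deep learning models, need large training datasets. In recent years, several datasets have been presented for question answering over knowledge graphs, which is the focus of this paper. Although many such datasets have been proposed for English, only a few question answering datasets exist for Persian. This paper introduces PeCoQ, a dataset for Persian question answering. It contains 10,000 complex questions and answers extracted from the Persian knowledge graph FarsBase. For each question, the SPARQL query and two paraphrases written by linguists are also provided. The dataset covers different types of complexity, such as multi-relation, multi-entity, ordinal, and temporal constraints. In this paper, we discuss the dataset's characteristics and describe our methodology for building it.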
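As a rough illustration of what one entry of such a dataset might look like, the sketch below models a record holding a question, its two paraphrases, a SPARQL query, the answers, and the complexity types named in the abstract. The class name `Record`, all field names, and the helper `count_by_complexity` are hypothetical, and the values are placeholders rather than data from PeCoQ; the actual file layout may differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List, Set, Tuple


class Complexity(Enum):
    # The four complexity types named in the abstract.
    MULTI_RELATION = "multi-relation"
    MULTI_ENTITY = "multi-entity"
    ORDINAL = "ordinal"
    TEMPORAL = "temporal"


@dataclass
class Record:
    # Hypothetical field names; the real PeCoQ layout may differ.
    question: str
    paraphrases: Tuple[str, str]  # two linguist-written paraphrases
    sparql: str                   # query that retrieves the answer
    answers: List[str]
    complexity: Set[Complexity]


# Placeholder values for illustration only, not taken from the dataset.
# The SPARQL string uses placeholder names, not real FarsBase identifiers.
example = Record(
    question="<Persian question text>",
    paraphrases=("<paraphrase 1>", "<paraphrase 2>"),
    sparql="SELECT ?x WHERE { <entity> <relation1> ?y . ?y <relation2> ?x }",
    answers=["<answer entity>"],
    complexity={Complexity.MULTI_RELATION},
)


def count_by_complexity(records: Iterable[Record]) -> Dict[Complexity, int]:
    """Count how many records carry each complexity type."""
    counts = {c: 0 for c in Complexity}
    for record in records:
        for c in record.complexity:
            counts[c] += 1
    return counts
```

A tally like `count_by_complexity([example])` is one simple way to inspect how the complexity types are distributed across a collection of such records.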