MLPQ: A Dataset for Path Question Answering over Multilingual Knowledge Graphs

Big Data Research · Pub Date: 2023-05-28 · DOI: 10.1016/j.bdr.2023.100381
Yiming Tan, Yongrui Chen, Guilin Qi, Weizhuo Li, Meng Wang
Citations: 2

Abstract

Knowledge Graph-based Multilingual Question Answering (KG-MLQA), one of the essential subtasks of Knowledge Graph-based Question Answering (KGQA), allows questions in the KGQA task to be expressed in different languages, with the aim of bridging the lexical gap between questions and knowledge graph(s). However, existing KG-MLQA work focuses mainly on the semantic parsing of multilingual questions and ignores questions that require integrating information from cross-lingual knowledge graphs (CLKG). This paper extends KG-MLQA to Cross-lingual KG-based Multilingual Question Answering (CLKGQA) and constructs MLPQ, the first CLKGQA dataset over multilingual DBpedia, which contains 300K questions in English, Chinese, and French. We further propose a novel KG sampling algorithm for KG construction, which allows MLPQ to support research on different types of methods. To evaluate the dataset, we put forward a general question answering workflow whose core idea is to transform CLKGQA into KG-MLQA: we first use an Entity Alignment (EA) model to merge the CLKG into a single KG, and then obtain the answer with a multi-hop QA model combined with a multilingual pre-trained model. By instantiating this workflow, we establish two baseline models for MLPQ, one of which uses Google Translate to obtain aligned entities, while the other adopts a recent EA model. Experiments show that the baseline models fall short of ideal performance on CLKGQA. Moreover, the availability of our benchmark contributes to the question answering and entity alignment communities.
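
The workflow described in the abstract (merge cross-lingual KGs through entity alignment, then answer a path question over the merged graph) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy triples, the entity and relation identifiers, the hand-written alignment dictionary, and the helper names `merge_kgs` and `follow_path` are hypothetical and are not taken from MLPQ or from the paper's baselines. In particular, the relation path is supplied by hand here, whereas the paper uses a multi-hop QA model with a multilingual pre-trained model to go from the question to the answer, and the alignment pairs would come from an EA model or from Google Translate rather than a fixed dictionary.

```python
from collections import defaultdict

# Toy triples invented for illustration; they are NOT drawn from MLPQ or DBpedia dumps.
EN_KG = [("en:Victor_Hugo", "birthPlace", "en:Besancon")]
FR_KG = [("fr:Besancon", "region", "fr:Bourgogne-Franche-Comte")]

# Hypothetical alignment pairs; in the described workflow these would be
# produced by an entity alignment model or by machine translation.
ALIGNMENT = {"fr:Besancon": "en:Besancon"}


def merge_kgs(kgs, alignment):
    """Map aligned entities to one canonical ID and index tails by (head, relation)."""
    index = defaultdict(set)
    for kg in kgs:
        for head, rel, tail in kg:
            canonical_head = alignment.get(head, head)
            canonical_tail = alignment.get(tail, tail)
            index[(canonical_head, rel)].add(canonical_tail)
    return index


def follow_path(index, start, relations):
    """Follow a fixed relation path hop by hop and return the reachable entities."""
    frontier = {start}
    for rel in relations:
        frontier = {t for e in frontier for t in index.get((e, rel), set())}
    return frontier


# A two-hop path whose first hop lives in the English KG and second in the French KG.
merged = merge_kgs([EN_KG, FR_KG], ALIGNMENT)
answers = follow_path(merged, "en:Victor_Hugo", ["birthPlace", "region"])

# Without alignment the two graphs stay disconnected and the second hop finds nothing.
unmerged = merge_kgs([EN_KG, FR_KG], {})
no_answers = follow_path(unmerged, "en:Victor_Hugo", ["birthPlace", "region"])

print(answers, no_answers)
```

The contrast between `answers` and `no_answers` shows why the alignment step matters in this toy setting: the second hop is only reachable once the English and French identifiers for the same entity are collapsed into one node.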
