No one is perfect: Analysing the performance of question answering components over the DBpedia knowledge graph

Pub Date: 2020-12-01 · Epub Date: 2020-08-05 · DOI: 10.1016/j.websem.2020.100594
Kuldeep Singh, Ioanna Lytra, Arun Sethupat Radhakrishna, Saeedeh Shekarpour, Maria-Esther Vidal, Jens Lehmann
Journal of Web Semantics, Volume 65, Article 100594. Available at: https://www.sciencedirect.com/science/article/pii/S1570826820300342
Citations: 29

Abstract

Question answering (QA) over knowledge graphs has gained significant momentum over the past five years due to the increasing availability of large knowledge graphs and the rising importance of question answering for user interaction. Existing QA systems have been extensively evaluated as black boxes, and their performance has been characterised in terms of average results over all the questions of benchmarking datasets (i.e. macro evaluation). Albeit informative, macro evaluation studies do not provide evidence about QA components' strengths and concrete weaknesses. Therefore, the objective of this article is to analyse and micro evaluate available QA components in order to understand which question characteristics affect their performance. To this end, we measure, at the question level and with respect to different question features, the accuracy of 29 components reused in QA frameworks over the DBpedia knowledge graph, using state-of-the-art benchmarks. As a result, we provide a perspective on collective failure cases, study the similarities and synergies among QA components of different component types, and identify the characteristics that prevent them from effectively solving the corresponding QA tasks. Finally, based on these extensive results, we present conclusive insights into future challenges and research directions in the field of question answering over knowledge graphs.
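The distinction the abstract draws between macro evaluation (one average over the whole benchmark) and micro evaluation (accuracy broken down per question characteristic) can be sketched as follows. This is an illustrative example only; the question features and correctness values are hypothetical and not taken from the paper's results.

```python
from collections import defaultdict

# Hypothetical per-question results for one QA component:
# (question_feature, answered_correctly).
results = [
    ("simple", True), ("simple", True), ("simple", True), ("simple", False),
    ("compound", False), ("compound", False), ("compound", True),
    ("superlative", False), ("superlative", False),
]

# Macro evaluation: a single accuracy averaged over all benchmark questions.
macro_accuracy = sum(ok for _, ok in results) / len(results)

# Micro evaluation: accuracy per question characteristic, exposing
# exactly which kinds of questions the component fails on.
by_feature = defaultdict(list)
for feature, ok in results:
    by_feature[feature].append(ok)
micro_accuracy = {f: sum(v) / len(v) for f, v in by_feature.items()}

print(f"macro: {macro_accuracy:.2f}")  # 4/9, i.e. 0.44
for feature, acc in sorted(micro_accuracy.items()):
    print(f"micro[{feature}]: {acc:.2f}")
```

Here the macro score (0.44) hides that the component handles simple questions well (0.75) but fails entirely on superlatives (0.00), which is precisely the kind of insight the per-feature micro evaluation surfaces.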
