No one is perfect: Analysing the performance of question answering components over the DBpedia knowledge graph

Pub Date: 2020-12-01 · Epub Date: 2020-08-05 · DOI: 10.1016/j.websem.2020.100594
Kuldeep Singh, Ioanna Lytra, Arun Sethupat Radhakrishna, Saeedeh Shekarpour, Maria-Esther Vidal, Jens Lehmann
Journal of Web Semantics, Volume 65, Article 100594. Available at: https://www.sciencedirect.com/science/article/pii/S1570826820300342
Citations: 29

Abstract

Question answering (QA) over knowledge graphs has gained significant momentum over the past five years due to the increasing availability of large knowledge graphs and the rising importance of question answering for user interaction. Existing QA systems have been extensively evaluated as black boxes, and their performance has been characterised in terms of average results over all the questions of benchmarking datasets (i.e. macro evaluation). Albeit informative, macro evaluation studies do not provide evidence about QA components' strengths and concrete weaknesses. Therefore, the objective of this article is to analyse and micro evaluate available QA components in order to understand which question characteristics affect their performance. To this end, we measure, at the question level and with respect to different question features, the accuracy of 29 components reused in QA frameworks over the DBpedia knowledge graph, using state-of-the-art benchmarks. As a result, we provide a perspective on collective failure cases, study the similarities and synergies among QA components of different component types, and identify the characteristics that prevent them from effectively solving the corresponding QA tasks. Finally, based on these extensive results, we present conclusive insights into future challenges and research directions in the field of question answering over knowledge graphs.
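The distinction the abstract draws between macro evaluation (one average over the whole benchmark) and micro evaluation (accuracy broken down per question characteristic) can be sketched as follows. This is an illustrative example only; the question features and correctness values are hypothetical and not taken from the paper's results.

```python
from collections import defaultdict

# Hypothetical per-question results for one QA component:
# (question_feature, answered_correctly).
results = [
    ("simple", True), ("simple", True), ("simple", True), ("simple", False),
    ("compound", False), ("compound", False), ("compound", True),
    ("superlative", False), ("superlative", False),
]

# Macro evaluation: a single accuracy averaged over all benchmark questions.
macro_accuracy = sum(ok for _, ok in results) / len(results)

# Micro evaluation: accuracy per question characteristic, exposing
# exactly which kinds of questions the component fails on.
by_feature = defaultdict(list)
for feature, ok in results:
    by_feature[feature].append(ok)
micro_accuracy = {f: sum(v) / len(v) for f, v in by_feature.items()}

print(f"macro: {macro_accuracy:.2f}")  # 4/9, i.e. 0.44
for feature, acc in sorted(micro_accuracy.items()):
    print(f"micro[{feature}]: {acc:.2f}")
```

Here the macro score (0.44) hides that the component handles simple questions well (0.75) but fails entirely on superlatives (0.00), which is precisely the kind of insight the per-feature micro evaluation surfaces.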
