Scalable Knowledge Graph Construction from Text Collections

Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER) | Pub Date: 2019 | DOI: 10.18653/v1/D19-6607

R. Clancy, I. Ilyas, Jimmy J. Lin

Citations: 12

Abstract

We present a scalable, open-source platform that “distills” a potentially large text collection into a knowledge graph. Our platform takes documents stored in Apache Solr and scales out the Stanford CoreNLP toolkit via Apache Spark integration to extract mentions and relations that are then ingested into the Neo4j graph database. The raw knowledge graph is then enriched with facts extracted from an external knowledge graph. The complete product can be manipulated by various applications using Neo4j’s native Cypher query language. We present a subgraph-matching approach to align extracted relations with external facts and show that fact verification, locating textual support for asserted facts, detecting inconsistent and missing facts, and extracting distantly-supervised training data can all be performed within the same framework.
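The alignment idea in the abstract can be illustrated with a minimal sketch. In the platform itself this matching is expressed as Cypher queries over Neo4j; the code below is only a plain-Python, in-memory stand-in. The `Extracted` and `Fact` schemas, the `align` function, and the relation names are all hypothetical and are not taken from the paper. It also assumes each relation is single-valued, so an extracted object that differs from the external one is treated as an inconsistency.

```python
from collections import defaultdict
from typing import NamedTuple


class Extracted(NamedTuple):
    """A relation pulled from text (hypothetical schema, not the paper's)."""
    subject: str
    relation: str
    obj: str
    doc_id: str


class Fact(NamedTuple):
    """A fact from an external knowledge graph (hypothetical schema)."""
    subject: str
    relation: str
    obj: str


def align(extracted, facts):
    """Match extracted relations against external facts.

    Returns (supported, inconsistent, missing):
      supported    maps each confirmed fact to the doc_ids that assert it
      inconsistent lists extracted relations whose object disagrees with
                   the external graph (assumes single-valued relations)
      missing      lists extracted relations for which the external graph
                   has no fact with that subject and relation
    """
    fact_objs = defaultdict(set)
    for f in facts:
        fact_objs[(f.subject, f.relation)].add(f.obj)

    supported = defaultdict(list)
    inconsistent, missing = [], []
    for e in extracted:
        key = (e.subject, e.relation)
        if key not in fact_objs:
            missing.append(e)
        elif e.obj in fact_objs[key]:
            supported[Fact(e.subject, e.relation, e.obj)].append(e.doc_id)
        else:
            inconsistent.append(e)
    return dict(supported), inconsistent, missing


# Illustrative data only; not drawn from the paper's experiments.
extracted = [
    Extracted("Ada Lovelace", "born_in", "London", "doc1"),
    Extracted("Ada Lovelace", "born_in", "Paris", "doc2"),
    Extracted("Ada Lovelace", "occupation", "mathematician", "doc3"),
]
facts = [Fact("Ada Lovelace", "born_in", "London")]
supported, inconsistent, missing = align(extracted, facts)
```

Here `supported` gives the textual support for a fact, `inconsistent` flags disagreement with the external graph, and `missing` suggests candidate facts the external graph lacks. The supported pairs are also the kind of sentence-level evidence that distant supervision would use as training data.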
